Discussion 1

 Assigned Readings:

Chapter 1. The Roles of Data and Predictive Analytics in Business

Chapter 2. Reasoning with Data

Initial Postings: Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required but feel free to use them if they aid in your discussion.

Also, provide a graduate-level response to each of the following questions:

  1. Based on what you have read in Chapters 1-2, please explain how data analytics applies to your current or future role. What value can data analytics bring to your position? Please share your thoughts. Please cite examples according to APA standards.

[Your post must be substantive and demonstrate insight gained from the course material. Postings must be in the student’s own words – do not provide quotes!] 

[Your initial post should be at least 450 words and in APA format (including Times New Roman with font size 12 and double spaced). Post the actual body of your paper in the discussion thread, then attach a Word version of the paper for APA review.]

The Roles of Data and Predictive Analytics in Business

Chapter 1

Learning Objectives

Explain how predictive analytics can help in business strategy formulation.

Distinguish structured from unstructured data.

Differentiate units of observation.

Outline a data-generating process.

Describe the primary ways that data analysis is used to aid business performance.

Discriminate between lead and lag information.

Discriminate between active and passive prediction.

Recognize questions pertaining to business strategy that may utilize (active) predictive analytics.


Defining Data & Data Uses in Business


Data

A collection of information

Database

Organized collection of data that firms use for analysis

Business analytics

The use of data analysis to aid in business decision making

Predictive analytics

The use of data analysis designed to form predictions about future or unknown events or outcomes


Business Strategy

Plan of action designed by a business practitioner to achieve a business objective

Business objectives include profit maximization, enhanced employee satisfaction, etc.

Examples of action include pricing decisions, advertising campaigns, and methods of employee compensation


Predictive Analytics for Business Strategy

Without data, a strong theoretical model is often not enough to predict effective business strategies

Sound theoretical arguments coupled with data become a strong tool to predict effective business strategies

Predictive analytics is an ideal complement to create a successful business strategy


Data Features

Structured Data

Data with well-defined units of observation which can be classified and structured in the form of a spreadsheet.

An example:


Data Features
Unstructured Data
Any data that cannot be classified and structured.
An example:


The Unit of Observation
The entity for which information has been collected
Crucial component of structured data
Tells us the way in which the information in a dataset varies
Answers the questions: What, Where, Who, When?
Four main groupings: cross-sectional data, pooled cross-sectional data, time-series data, and panel data


Data Types
Cross-Sectional Data
Data that provide a snapshot of information at one fixed point in time.
An example:


Data Types
Pooled Cross-Sectional Data
A combination of two or more unrelated cross-sectional datasets merged into one.
An example


More Data Types
Time-series data:
Data that exhibit only variation in time
An example


More Data Types
Panel data
Same cross-sectional units over multiple points in time
An example
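
The four groupings can be illustrated with a minimal sketch. The store names, years, and sales figures below are hypothetical and serve only to show how the unit of observation changes from one data type to the next.

```python
# Hypothetical records: (store, year, sales). All values are made up.
records = [
    ("Store A", 2021, 100), ("Store B", 2021, 150), ("Store C", 2021, 120),
    ("Store A", 2022, 110), ("Store B", 2022, 140), ("Store C", 2022, 130),
]

# Cross-sectional: many units at one fixed point in time (unit = store).
cross_section = [r for r in records if r[1] == 2021]

# Time series: one unit, variation only in time (unit = year).
time_series = [r for r in records if r[0] == "Store A"]

# Panel: the same cross-sectional units at multiple points in time
# (unit = store-year).
panel = sorted(records)

# Pooled cross-section: snapshots of unrelated units merged into one dataset.
# Here a different (hypothetical) set of stores is sampled in 2022.
snapshot_2022 = [("Store X", 2022, 90), ("Store Y", 2022, 160)]
pooled_cross_section = cross_section + snapshot_2022

print(len(cross_section), len(time_series), len(panel), len(pooled_cross_section))
```

The same list of records yields a different dataset depending on which dimension (who, when) is held fixed.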


Data Generating Process (DGP)
Data Generating Process
The underlying mechanism that produces the pieces of information contained in a dataset
Steps for DGP
Establish both formal and informal DGP
Understand what variables are important
Create a representative statistical model
Collect and analyze relevant variables and perform simple tests


Basic Uses of Data Analysis for Business
Categories include:
Queries
Pattern discovery
Causal inference


Query
Any request for information from a database
Descriptive Statistics
Quantitative measures meant to summarize and interpret properties of a dataset
Pivot Table
A tool for data summarization that enables different views of the underlying dataset.
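
As a rough illustration of these three tools, the sketch below runs a query, computes descriptive statistics, and builds a small pivot table using only the Python standard library. The regions, products, and unit counts are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sales rows: (region, product, units sold). Values are made up.
rows = [
    ("East", "Coffee", 10), ("East", "Tea", 4),
    ("West", "Coffee", 7), ("West", "Tea", 9),
    ("East", "Coffee", 6),
]

# Query: a request for information from the data, e.g. all East-region rows.
east_rows = [r for r in rows if r[0] == "East"]

# Descriptive statistics: quantitative summaries of a property of the dataset.
units = [r[2] for r in rows]
summary = {"mean": mean(units), "median": median(units), "max": max(units)}

# Pivot table: total units with regions as rows and products as columns.
pivot = defaultdict(lambda: defaultdict(int))
for region, product, sold in rows:
    pivot[region][product] += sold

print(summary)
for region in sorted(pivot):
    print(region, dict(pivot[region]))
```

Swapping the roles of region and product in the loop gives a different view of the same underlying data, which is the point of a pivot table.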


Pattern Discovery
Pattern
Any distinct relationship between observations within a dataset
Pattern discovery
The process of identifying distinctive relationships between observations in a dataset
Data mining
Pattern discovery, typically in large datasets


Pattern Discovery
Types of Pattern Discovery

Association analysis
Looking for conditional probabilities to determine relationships between two or more variables
Cluster analysis
Grouping observations according to some measure of similarity
Outlier detection
Identifying small subsets of observations, if they exist, that contain information far different from the vast majority of the observations in the dataset
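
Two of these techniques can be sketched in a few lines. The baskets and sales figures are hypothetical, and the outlier rule used here (distance from the median in units of the median absolute deviation, with a cutoff of 5) is just one simple choice among many, not a method taken from the textbook.

```python
from statistics import median

# Hypothetical shopping baskets; item names are made up.
baskets = [
    {"bread", "butter"}, {"bread", "butter", "jam"}, {"bread"},
    {"butter"}, {"bread", "butter"}, {"jam"},
]

def conditional_probability(baskets, given, target):
    """Estimate P(target in basket | given in basket) from the sample."""
    with_given = [b for b in baskets if given in b]
    if not with_given:
        return None
    return sum(target in b for b in with_given) / len(with_given)

# Association analysis: how often is butter bought when bread is bought?
p_butter_given_bread = conditional_probability(baskets, "bread", "butter")

def find_outliers(values, threshold=5.0):
    """Flag values far from the median, measured in units of the
    median absolute deviation (an assumed, simple rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [v for v in values if abs(v - med) / mad > threshold]

daily_sales = [98, 102, 101, 99, 100, 103, 97, 250]  # hypothetical values
outliers = find_outliers(daily_sales)
print(p_butter_given_bread, outliers)
```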


Examples of Pattern Discovery

Example of Outlier Detection and Cluster Analysis
Example of Association Analysis: Scatter Plot on Profit & Price


Causal Inference
The process of establishing a causal relationship between a variable(s) representing a cause and a variable(s) representing an effect, where a change in the cause variable results in changes in the effect variable
Causal Inference
Direct: A change in the causal variable, X, directly causes a change in the effect variable, Y
Indirect: A change in X causes a change in Y, but only through its impact on a third variable, Z


Use of Causal Inference
Causal inference occurs in two ways:
Using Experimentation
Econometric Models

Causal Inference has two important applications
Campaign Evaluation


Data Analysis for the Past, Present, and Future
Lag information
Information about past outcomes
Typically contains information on key performance indicators (KPIs), or variables that are used to help measure firm performance
Designed to answer the question, “What happened / What is happening?”
Lag information can be generated by queries, pattern discovery, and causal inference


Examples of Lag Information
Report
Any structured presentation of the information in a dataset
Scorecard
Any structured assessment of variables of interest, typically KPIs, against a given benchmark
Dashboard
A graphical presentation of the current standing and historical trends for variables of interest, typically KPIs
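
The idea of assessing KPIs against a benchmark can be sketched as follows. The KPI names, values, and benchmarks are all hypothetical, and the two status labels are an assumption made for illustration.

```python
# Hypothetical KPIs and benchmarks; all names and numbers are made up.
kpis = {"monthly_revenue": 120_000, "customer_churn_rate": 0.08, "avg_order_value": 42.0}
benchmarks = {"monthly_revenue": 100_000, "customer_churn_rate": 0.05, "avg_order_value": 40.0}

# For some KPIs a lower value is better.
lower_is_better = {"customer_churn_rate"}

def scorecard(kpis, benchmarks, lower_is_better):
    """Assess each KPI against its benchmark."""
    result = {}
    for name, value in kpis.items():
        target = benchmarks[name]
        met = value <= target if name in lower_is_better else value >= target
        result[name] = "meets benchmark" if met else "misses benchmark"
    return result

card = scorecard(kpis, benchmarks, lower_is_better)
for name, status in card.items():
    print(f"{name}: {status}")
```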


Report Example


Dashboard Example


Scorecard Example


Lead Information
Lead Information
Information that provides insights about the future
Designed to answer the question, “What is going to happen?”
It helps firms in their future planning by informing expectations and strategic moves.
Lead information is not generally presented in a standardized format


Predictive Analytics and Lead Information
Predictive analytics is data analysis designed to provide lead information
Two ways predictive analytics can predict the future

Active prediction
Passive Prediction


Passive Prediction
Passive prediction uses predictive analytics to make predictions based on actual or hypothetical data, where no variables are exogenously altered.
Exogenously altered – a variable in a dataset that changes due to factors outside the data-generating process that are independent of all other variables within the data-generating process
Examples: weather forecasting, predicting which customers are likely to drop a service, etc.
Pattern discovery (data mining), when used to make predictions, is generally used for passive predictions
Model fit – the basis on which analysts choose among competing models for passive prediction


Active Prediction
Active prediction uses predictive analytics to make predictions based on actual or hypothetical data, for which one or more variables are exogenously altered.
Making active predictions requires a causal relationship between variable ‘X’ and variable ‘Y’.
If a change in X affects Y, this occurs due to a causal relationship between the two.


Active Prediction for Business Strategy Formation
Predicting an outcome for alternative strategies requires the application of active prediction
To accurately predict an outcome for a range of competing strategies, you must establish the causal effects of those strategies on that outcome
The leap from correlation to causality is a large one, and can lead to grossly incorrect predictions
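
A small simulation can make the gap between correlation and causality concrete. Everything in the sketch is assumed for illustration: the data-generating process, the coefficient values, and the variable names. In the observed data, price rises with demand, so it is not exogenously altered; in the experiment, price is set at random.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

rng = random.Random(0)
n = 5000
TRUE_PRICE_EFFECT = -1.0  # assumed causal effect of price on sales

def sales(demand, price):
    # Assumed data-generating process: demand raises sales, price lowers them.
    return 5.0 * demand + TRUE_PRICE_EFFECT * price + rng.gauss(0, 1)

# Observed data: managers charge more where demand is higher,
# so price is NOT exogenously altered.
demand = [rng.uniform(0, 10) for _ in range(n)]
observed_price = [d + rng.gauss(0, 1) for d in demand]
observed_sales = [sales(d, p) for d, p in zip(demand, observed_price)]
passive_slope = ols_slope(observed_price, observed_sales)

# Experiment: price is set at random, independent of demand (exogenous).
random_price = [rng.uniform(0, 10) for _ in range(n)]
experiment_sales = [sales(d, p) for d, p in zip(demand, random_price)]
active_slope = ols_slope(random_price, experiment_sales)

print(f"slope in observed data:     {passive_slope:.2f}")
print(f"slope in experimental data: {active_slope:.2f}")
```

Under these assumptions the slope in the observed data is positive even though raising price lowers sales, while the slope in the experimental data lands near the assumed effect of -1. The observed pattern may still be useful for passive prediction, but using it to predict the result of a price change would point in the wrong direction.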


Reasoning with Data

Chapter 2

Learning Objectives

Define reasoning.

Execute deductive reasoning.

Explain an empirically testable conclusion.

Execute inductive reasoning.

Differentiate between deductive and inductive reasoning.

Explain how inductive reasoning can be used to evaluate an assumption.

Describe selection bias in inductive reasoning.



What is Reasoning?

Reasoning is the process of forming conclusions, judgments, or inferences from facts or data

Reasoning and logic are often used interchangeably

Logic is a description of the rules and/or steps behind the reasoning process



Two Arguments

Argument 1:

The company’s profits are up more than 10% over the past year. An increase in profits of 10% is the result of excellent management. You were the manager over the past year. Therefore, I conclude that you engaged in excellent management last year.

Argument 2:

Ten of your 300 employees came to me with complaints about your management. They indicated that you treated them unfairly by not giving them a raise they deserved. Therefore, I conclude that all of your employees are disgruntled with your management.



Understanding Reasoning

In presenting the two arguments, the goal is not to make a definitive decision about which you believe (if either)

The goal is to think about and distinguish different “lines” of reasoning

In distinguishing between the different types of reasoning, you will be able to establish why you believe or question the claims made in the two arguments



Two Major Types of Reasoning


Deductive Reasoning

Inductive Reasoning

Both play an important role in interpreting and drawing conclusions from data analysis



Deductive Reasoning
Deductive Reasoning

Goes from the general to the specific

Also known as top-down logic

Seeks to prove statements of the form “If A, then B”



Deductive Reasoning

Such reasoning always implies three underlying components: assumptions (“If A”), methods of proof (“then”), and conclusions (“B”)



Deductive Reasoning
The purest applications of deductive reasoning are in the field of mathematics
Two of the most used approaches are direct proofs and transposition
Direct proofs
Proof that begins with assumptions, explains methods of proof, and states the conclusion(s)
Transposition
Any time a group of assumptions implies a conclusion, it is also true that any time the conclusion does not hold, at least one of the assumptions must not hold



Direct Proof
Let’s prove the following statement by direct proof:
If X and Y are odd numbers, then their sum (X + Y) is an even number
An Example:
If X = 5 (odd) and Y = 9 (odd), then their sum X + Y = 14 is an even number
Failing to find a contradiction is not the same as proving a statement is generally true



Direct Proof: A Mathematical Approach
If X and Y are odd numbers, then their sum (X + Y) is an even number
X and Y are odd numbers
If X is an odd number, then X can be written as X = 2K + 1, where K is an integer. (Example: X = 13 means X = (2 × 6) + 1)
If Y is an odd number, then Y can be written as Y = 2M + 1, where M is an integer. (Example: Y = 23 means Y = (2 × 11) + 1)
Then X + Y = (2K + 1) + (2M + 1) = 2(K + M + 1)
K + M + 1 is an integer, so X + Y is 2 times an integer
Any number that is 2 times an integer is divisible by 2
This means X+Y is even
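
The algebra behind this proof can be sanity-checked numerically. The sketch below confirms that (2K + 1) + (2M + 1) equals 2(K + M + 1) for a finite range of integers; the range itself is arbitrary. Checking cases like this is not a proof, since failing to find a contradiction does not establish the general statement. The proof is the argument above.

```python
def is_even(n):
    """Return True when n is divisible by 2."""
    return n % 2 == 0

# Check X + Y = 2(K + M + 1) for odd X = 2K + 1 and odd Y = 2M + 1
# over a finite range of integers K and M (a sanity check, not a proof).
for k in range(-50, 51):
    for m in range(-50, 51):
        x, y = 2 * k + 1, 2 * m + 1
        assert x + y == 2 * (k + m + 1)
        assert is_even(x + y)

checked = 101 * 101
print(f"identity held for all {checked} pairs checked")
```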



Direct Proof: Common Sense Approach
“If McDonald’s offers breakfast all day, their revenues will increase.”
McDonald’s stores offer breakfast all day.
The addition of breakfast during lunch/dinner hours implies more choices.
Customers already choosing McDonald’s during lunch/dinner hours can continue buying the same meals at McDonald’s.
Customers not choosing McDonald’s during lunch/dinner hours may start eating at McDonald’s.
By retaining current customers and adding new ones, McDonald’s will increase its revenues overall.



While direct proofs are sufficient to prove a point logically, an alternative approach, transposition, may be more effective
Transposition is the equivalence between the statements “If A, then B” and “If not B, then not A”
Any time a group of assumptions implies a conclusion, then it is also true that any time the conclusion does not hold, then at least one of the assumptions must not hold



“If A, then B” AND “If not B, then not A”
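
Because A and B can each only be true or false, this equivalence can be checked exhaustively. The sketch below treats “If A, then B” as false only when A is true and B is false, which is the standard reading of a conditional statement; the function name is chosen here for illustration.

```python
from itertools import product

def implies(a, b):
    """Truth value of the statement "If a, then b"."""
    return (not a) or b

# All four combinations of truth values for A and B.
truth_values = list(product([True, False], repeat=2))

# "If A, then B" always agrees with "If not B, then not A".
transposition_holds = all(
    implies(a, b) == implies(not b, not a) for a, b in truth_values
)

# By contrast, the converse "If B, then A" is not equivalent.
converse_equivalent = all(
    implies(a, b) == implies(b, a) for a, b in truth_values
)

print(transposition_holds, converse_equivalent)
```

The contrast with the converse is worth keeping in mind: knowing that B holds says nothing about A, while knowing that B fails rules A out.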




Transposition: A Mathematical Approach
Prove the statement: If X^2 is even, then X is even
Suppose X is not an even number; it is instead an odd number
If X is an odd number, then X = 2K + 1, where K is an integer
X^2 = (2K + 1)^2 = 4K^2 + 4K + 1
4K^2 + 4K = 4(K^2 + K) and so is divisible by 2
4K^2 + 4K is an even number
X^2 = 4(K^2 + K) + 1 is an even number plus 1, meaning it is an odd number



The statement was: If X^2 is even, then X is even
Using transposition, the opposite of the conclusion is used to prove the opposite of the assumption: if X is odd, then X^2 is odd, so X^2 cannot be even
Transposition can also be used without using mathematics to prove statements like “If A, then B”.
Transposition can be particularly effective if an assumption seems indisputably obvious.



Transposition: An Example
Prove the statement: “If McDonald’s stores offer breakfast all day, revenue will increase”
McDonald’s stores’ revenues will not increase
This means total revenues from current and new customers will not increase
This means either there will be no new customers or revenues from current customers will decrease
This means there could not have been an expansion in the menu
McDonald’s stores do not offer breakfast all day



Direct Proof and Transposition
Direct Proof
State assumptions
Explain methods of proof (mathematics, common sense, etc.)
State conclusions
Transposition
Assume the opposite of the conclusion (not B)
Explain methods of proof (mathematics, common sense, etc.)
State assumption(s) that is (are) violated (not A)



Deductive Reasoning
Used commonly in the application of law
If there is disagreement with a conclusion, there are two possible sources:
The method of proof, OR
The assumption
There are two ways of resolving disputes about assumptions
Show robustness: the persistent accuracy of a conclusion despite variation in the associated assumption(s) within the context of a deductive argument
Assess consistency with a collected dataset



Empirically Testable Conclusions
An empirically testable conclusion is a conclusion whose validity can be meaningfully tested using observable data.

A banana company’s managers are divided into two groups over the product’s placement in a major grocery store chain.
Group 1 believes that changing the current location will increase sales.
Group 2 believes that the current location is good enough.



Empirically Testable Conclusions
The company has sales data for the current location.
The company chooses to move its product to a new location and collects sales data.
Now the company can meaningfully test the validity of the management’s competing conclusions.
Making the actual decision about the validity of an empirically testable conclusion based on observable data is an application of inductive reasoning



Inductive Reasoning
Inductive reasoning
Reasoning that goes from the specific to the general; bottom-up logic
Population
The entire set of potential observations about which we want to learn
Data sample
A subset of the population that is collected and observed



Inductive Reasoning
Businesses regularly collect data samples and apply inductive reasoning to draw conclusions about the population.
A conclusion from inductive reasoning comes with a degree of support (also called inductive probability).
Degree of support is also called the strength of the inductive argument.
Example: if we are 50% confident about the conclusion, then the degree of support is 50%.



Degrees of Support
Two Types of Degrees of Support

Both play an important role in interpreting and drawing conclusions from data analysis




Evaluating Assumptions
Through deductive reasoning, an empirically testable conclusion is made
Collect a data sample
Test the conclusion by comparing the observed outcomes in the data sample to their corresponding probabilities
Use inductive reasoning to decide whether the conclusion passes or fails
If it fails, transposition implies we must reject at least one of the assumptions
If it passes, then we must not reject the assumptions
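
The steps above can be sketched with a simple probability calculation. The setup is hypothetical: suppose a set of assumptions implies that each sampled customer buys with probability 0.5, and a sample of 20 customers contains only 4 buyers. The 5% cutoff for deciding that an outcome is too unlikely is also an assumption chosen for illustration.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(at most k successes in n independent trials with success prob p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Hypothetical setup: the assumptions imply a purchase probability of 0.5;
# the data sample shows 4 buyers out of 20 customers.
ASSUMED_RATE = 0.5
SAMPLE_SIZE, OBSERVED_BUYERS = 20, 4
THRESHOLD = 0.05  # assumed cutoff for "too unlikely"

p_value = prob_at_most(OBSERVED_BUYERS, SAMPLE_SIZE, ASSUMED_RATE)
reject_assumption = p_value < THRESHOLD

print(f"probability of so few buyers under the assumptions: {p_value:.4f}")
print("reject at least one assumption" if reject_assumption else "do not reject")
```

Here the observed outcome has well under a 5% chance of occurring if the assumptions hold, so the conclusion fails the test and, by transposition, at least one assumption is rejected.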



Inductive Reasoning for Evaluating Assumptions



Selection Bias in Inductive Reasoning
Improper use of inductive reasoning may lead to inaccurate or biased conclusions
The data-generating process is typically the source of the bias
Survey questions constructed in a leading way
Confirmation bias is the tendency to confirm a claim
Predictable patterns are discovered
Predictable-world bias is the tendency to find order when none exists, and occurs when people “read too much” into perceived patterns from random data



Selection Bias
Selection bias
The act of drawing conclusions about a population using a selected data sample, without accounting for the means of selection
There are two common types:
Collector selection bias occurs when the collector selects the members of the data sample in a systematic way
Availability bias occurs when the collector of the data sample selects the members of the data sample according to what is most readily available
Member selection bias occurs when potential members of the data sample self-select into, or out of, the sample
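
Argument 2 from earlier in the chapter gives a natural illustration of member selection bias. The sketch below uses its numbers (10 complaints out of 300 employees) and assumes, purely for illustration, that exactly those 10 employees are disgruntled and that only disgruntled employees come forward.

```python
# Population: 300 employees; assume the 10 who complained are the only
# disgruntled ones (an assumption made for illustration).
population = [True] * 10 + [False] * 290  # True = disgruntled

# Member selection: employees self-select into the sample by coming forward,
# and here only disgruntled employees do so (assumed selection rule).
self_selected_sample = [employee for employee in population if employee]

def share_disgruntled(group):
    """Fraction of the group that is disgruntled."""
    return sum(group) / len(group)

sample_share = share_disgruntled(self_selected_sample)
population_share = share_disgruntled(population)

print(f"share disgruntled in sample:     {sample_share:.1%}")
print(f"share disgruntled in population: {population_share:.1%}")
```

Every member of the self-selected sample is disgruntled, yet only a small share of the population is. Generalizing from the sample without accounting for how its members were selected produces the conclusion in Argument 2.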



