5  A Decisions First Framework

In the world of data-driven decision-making, it’s all too easy to fall into the trap of the “null ritual.” This ritual, as Gigerenzer, Krauss, and Vitouch (2004) point out, involves a slavish adherence to Null Hypothesis Significance Testing (NHST) without truly grasping its foundations or limitations. It’s akin to following a recipe without tasting the ingredients: you might end up with a statistically significant result, but one that lacks real-world meaning or utility.

Instead of mindlessly performing this ritual, we need a more thoughtful, purposeful approach. Enter the Decisions First Framework. This approach flips the script, putting the focus squarely on the decisions we need to make and using data as a tool to illuminate the path forward. It helps us sidestep several pitfalls, including what Manski (2020) calls the Lure of Incredible Certitude: the tendency to present research findings with unwarranted certainty, often driven by pressure to provide clear-cut answers even when the data is ambiguous or uncertain.

One crucial reason for this shift is that NHST often misses the mark when it comes to answering the practical questions businesses need to address. Moreover, there’s a widespread misunderstanding of p-values. The American Statistical Association took the unusual step of issuing a statement in 2016 to sound the alarm on this issue (see Wasserstein and Lazar 2016). They noted that “practices that reduce data analysis or scientific inference to mechanical ‘bright-line’ rules (such as ‘p < 0.05’) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision making.” They further clarified that “a p-value does not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.”
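
To see why a p-value can’t be read as “the probability the hypothesis is true,” a small simulation helps. In the sketch below (all numbers are illustrative), the two groups are generated with no real difference, yet p-values below 0.05 still occur about 5% of the time:

```r
# Under a true null (no difference between the groups), p-values are
# uniformly distributed: a small p-value, on its own, says nothing about
# the probability that the null hypothesis is true.
set.seed(123)
pvals <- replicate(10000, t.test(rnorm(50), rnorm(50))$p.value)
mean(pvals < 0.05)  # about 0.05 by construction
hist(pvals)         # roughly flat across [0, 1]
```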

The “Decisions First” framework offers an antidote to the null ritual by embracing a Bayesian perspective. This approach treats learning from data as a continuous process, not a binary “accept/reject” decision based on a single p-value. Rather than fixating on point estimates, p-values, or confidence intervals, it emphasizes estimating probabilities that are relevant to the decisions at hand. It acknowledges the inherent uncertainty in data and seeks to quantify and integrate it into the decision-making process. In essence, Bayesian statistics allows us to use data to reallocate credibility across various possibilities.
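
As a minimal sketch of what “reallocating credibility” means in practice, consider a Beta-Binomial model of a chatbot’s first-contact resolution rate. The prior, the counts, and the 55% threshold below are all hypothetical:

```r
# Reallocating credibility with a Beta-Binomial model.
# Prior: Beta(1, 1), uniform credibility across all possible rates.
prior_a <- 1
prior_b <- 1
# Hypothetical data: 180 of 300 chats resolved on first contact.
resolved <- 180
chats    <- 300
# The posterior concentrates credibility near 60%.
post_a <- prior_a + resolved
post_b <- prior_b + (chats - resolved)
# A decision-relevant probability: P(resolution rate > 55% | data)
pbeta(0.55, post_a, post_b, lower.tail = FALSE)
```

The output is a probability that speaks directly to a threshold we care about, rather than a p-value measured against a zero-effect null.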

5.1 Define the Decision(s): The Cornerstone of Data-Driven Choices

The first, and arguably most crucial, step is to clearly articulate the decision(s) you need to make. This might involve launching a new product, adjusting pricing strategies, or optimizing marketing campaigns. Some decisions will be binary, but that’s not always the case. The key is that the optimal decision should hinge on the information available. If no amount of evidence will change your mind, there’s little point in designing a study to collect it.

5.2 Formulate Data-Driven Questions: Asking the Right Questions

With your decision clearly defined, it’s time to craft questions that data can answer and that directly inform your decision. These questions should be focused, actionable, and revolve around meaningful thresholds, rather than fixating solely on the null hypothesis (zero effect).

For instance, in a scenario involving the implementation of a chatbot, you might ask:

  • “Will a chatbot reduce average customer wait time by at least 15%?” Here, the threshold of interest isn’t whether there’s any reduction in wait time, but whether the reduction is substantial enough (15% or more) to justify implementing the chatbot.

  • “Will a chatbot increase customer satisfaction scores by at least 10 points?” Similarly, the focus is on a meaningful increase in satisfaction, not just any statistically significant difference from the current baseline.

By establishing these thresholds, we align our data analysis with the real-world impact of our decisions. A 5% reduction in wait times, even if statistically significant, might not justify the cost of implementing a chatbot.
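
Framing the question around a threshold changes what we compute. As a sketch, suppose we already have posterior draws of the mean wait time under each process; the normal draws below are simulated stand-ins for real model output:

```r
# From threshold question to posterior probability.
set.seed(42)
wait_current <- rnorm(10000, mean = 12, sd = 0.4)  # posterior draws (minutes)
wait_chatbot <- rnorm(10000, mean = 10, sd = 0.4)
reduction <- (wait_current - wait_chatbot) / wait_current
# How plausible is a reduction of at least 15%?
mean(reduction >= 0.15)
```

That single number answers the question as posed: how plausible is a reduction large enough to justify implementing the chatbot?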

In many cases, a well-structured question can be surprisingly simple, often following the format “Does A do B among C compared to D?”

  • A: The intervention or action under evaluation (e.g., chatbot, new pricing).
  • B: Your clear definition of success (e.g., reduce wait times, increase sales).
  • C: The target population (e.g., all customers, a specific segment).
  • D: The alternative or baseline for comparison (e.g., no chatbot, current pricing).

However, sometimes you’ll encounter more nuanced questions like “What works for whom?” These situations involve evaluating multiple alternatives with the understanding that different options might be optimal for different groups within your population.
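
One way to approach “what works for whom” is to estimate the same decision-relevant probability within each segment. Here is a sketch using Beta-Binomial posteriors per segment; the segment names and counts are invented for illustration:

```r
# "What works for whom": the same decision-relevant probability, per segment.
# Hypothetical counts of satisfied customers by channel and segment.
segments <- data.frame(
  segment   = c("new", "returning"),
  chat_yes  = c(120, 150), chat_n  = c(200, 300),
  phone_yes = c(100, 190), phone_n = c(200, 300)
)
set.seed(1)
for (i in seq_len(nrow(segments))) {
  s <- segments[i, ]
  # Beta(1, 1) priors updated with each arm's counts
  chat  <- rbeta(10000, 1 + s$chat_yes,  1 + s$chat_n  - s$chat_yes)
  phone <- rbeta(10000, 1 + s$phone_yes, 1 + s$phone_n - s$phone_yes)
  cat(s$segment, ": P(chatbot better) =", mean(chat > phone), "\n")
}
```

In this invented example, the chatbot looks better for new customers but worse for returning ones, exactly the kind of heterogeneity the question is designed to surface.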

5.3 Design the Study: Tailoring Research to Your Decision

This stage involves selecting the appropriate research methodology to answer your questions. Crucially, the study design must be tailored to the specific decision you’re facing. Factors to consider include data availability, experimental design, and potential biases. Additionally, ethical considerations should be at the forefront of your design, ensuring the study does no harm and respects the rights of participants.

  • Chatbot Example: If you’re exploring whether to offer a chatbot as an option, an A/B test where some customers are offered the chatbot while others follow the standard process might be suitable. However, if you’re considering making the chatbot the only option, your study design needs to reflect this forced-choice scenario.
  • Event Invitation Example: If you want to understand the value of inviting people to an event, randomizing invitations and analyzing attendance rates is a valid approach. But if you want to understand the value of actually attending the event, you’d focus on outcomes for attendees, even if the data comes from the same experiment.

The key is to ensure your study design mirrors the real-world conditions of the decision as closely as possible.
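
For the opt-in chatbot A/B test described above, the mechanics of the design can be as simple as randomly assigning customers to arms (the customer ids below are hypothetical):

```r
# Simple randomization for the opt-in chatbot A/B test (hypothetical ids).
set.seed(7)
customers <- data.frame(id = 1:1000)
customers$arm <- sample(c("chatbot", "standard"),
                        size = nrow(customers), replace = TRUE)
table(customers$arm)  # roughly balanced arms
```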

5.4 Present Findings and Implications: Communicating Clearly and Transparently

After conducting your study, present the results clearly, concisely, and accessibly. Avoid jargon that could confuse your audience. Even a meticulously designed study can lead to misinformed decisions if the findings are poorly communicated. Be transparent about your learnings, acknowledge any limitations of the study, and highlight new questions that have arisen.

In discussing limitations, it’s crucial to distinguish between Internal Validity (the confidence that the observed effects are due to the factor you’re studying) and External Validity (the extent to which the results can be generalized to other situations). Even with a flawless experimental design, questions of external validity might remain. For example, a chatbot study conducted with tech-savvy users might not apply to an older demographic.

5.5 Real-World Constraints and the Path Forward

Real-world constraints often prevent us from conducting the “perfect” study with both impeccable internal and external validity. Therefore, it’s essential to view evidence quality as a spectrum, not a binary. Learning is an ongoing process. Embrace uncertainty, acknowledging that no single study provides all the answers. Instead of thinking in terms of “success” or “failure,” consider the weight of evidence, the specific context, and the potential risks and rewards when making decisions.
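
Weighing evidence against risks and rewards can be made explicit with a back-of-the-envelope expected-value calculation. Every figure in this sketch is hypothetical:

```r
# Weighing evidence against stakes (every figure here is hypothetical).
p_meets_threshold <- 0.80    # P(wait-time reduction >= 15% | data)
annual_benefit    <- 250000  # value if the threshold is met
annual_cost       <- 150000  # cost of running the chatbot
# Expected net value, treating the benefit as all-or-nothing:
p_meets_threshold * annual_benefit - annual_cost
```

A positive expected net value supports acting, and the same arithmetic makes it easy to see how the decision would flip as the probability, benefit, or cost changes.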

By adopting the “Decisions First” framework and embracing a Bayesian approach, you can transform data analysis from a ritualistic exercise into a powerful tool for making informed, impactful decisions. This approach not only aligns your research with your business objectives but also acknowledges the complexities and uncertainties inherent in real-world decision-making.

Learn more

Duke (2019) Thinking in Bets: Making Smarter Decisions When You Don’t Have All the Facts.