Causal Inquiry Dialogue

While writing the "Theory of Rationality" post, I revisited the pragma-dialectical rules for critical discussion and Walton's extended dialogue types, and suddenly got an idea: Walton's extensions could be extended further into sub-types. Why not extend this framework into a dialogue exclusively concerned with the nuances and features of causal inquiry? This interests me in particular because I'm quite familiar with, and interested in, the empirical methods used to identify causal effects in data. From my time in economics graduate school, I distinctly recall most applied microeconometric research reducing to debates about whether X is a confounder or a proper instrument, or whether there is selection bias. That sure sounds like a sub-type of Walton's Inquiry Dialogue. Furthermore, Walton already has argumentation schemes for causal reasoning, so perhaps we can simply extend the existing work. Below I first review those schemes and then propose a dialogue sub-type.

1) Argument from Cause to Effect (C→E)

Canonical form (presumptive):
Generally, if A happens, B (likely) happens.
A happens (or will).
So, B will (likely) happen.

What to look for (evaluation criteria):

  • The strength and applicability of the causal regularity (“if A then B”).
  • Evidence that A really holds in this case.
  • Possible interveners/defeaters that could block B.
  • Plausible mechanism and proper temporal order (cause precedes effect).

Core Critical Questions (CQs):

  • CQ1 – Strength: How well-supported is the A→B generalization here?
  • CQ2 – Fact: Is the evidence that A occurs (here/now) good enough?
  • CQ3 – Interference: Are there other causal factors that would prevent B despite A?
  • CQ4 – Alternatives/base-rate: Could B occur anyway without A (or mainly from some other cause)? 
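
Since these schemes and their critical questions will later be operationalized as dialogue moves, it may help to see one encoded as data. Here is a minimal Python sketch (the class and field names are my own, not Walton's): the argument stands presumptively, and raising a critical question that goes unanswered suspends that presumption.

```python
from dataclasses import dataclass, field

@dataclass
class CausalScheme:
    """A defeasible argumentation scheme with attached critical questions."""
    name: str
    premises: list[str]
    conclusion: str
    critical_questions: dict[str, str]        # CQ id -> question text
    open_challenges: set[str] = field(default_factory=set)

    def challenge(self, cq_id: str) -> None:
        self.open_challenges.add(cq_id)       # raising a CQ suspends the presumption

    def answer(self, cq_id: str) -> None:
        self.open_challenges.discard(cq_id)   # answering it restores the presumption

    def stands(self) -> bool:
        return not self.open_challenges

c_to_e = CausalScheme(
    name="Argument from Cause to Effect",
    premises=["Generally, if A happens, B (likely) happens.", "A happens."],
    conclusion="B will (likely) happen.",
    critical_questions={
        "CQ1": "How well-supported is the A->B generalization here?",
        "CQ2": "Is the evidence that A occurs good enough?",
        "CQ3": "Are there other factors that would prevent B despite A?",
        "CQ4": "Could B occur anyway without A?",
    },
)
c_to_e.challenge("CQ3")
print(c_to_e.stands())   # False until CQ3 is answered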

2) Argument from Effect to Cause (E→C)

Canonical form (abductive/IBE-style):
Generally, if A happens, B (likely) happens.
B is observed.
So, (presumably) A happened (as the best explanation). 

Walton treats E→C as defeasible and best understood in an abductive, dialogue-driven setting: you hypothesize the cause that would explain the effect, then test it against rivals and further evidence. 

What to look for (evaluation criteria):

  • Explanatory adequacy of A for B (fit, coherence with background knowledge).
  • Comparative superiority over rival explanations.
  • Search thoroughness and stage of inquiry (is it too early to commit?).
  • Predictive leverage/testability (does A lead to further checkable consequences?).

Core Critical Questions (CQs) used here (IBE-style):

  • CQ1 – Adequacy: Does A genuinely explain B (independently of rivals)?
  • CQ2 – Bestness: Is A a better explanation than the alternatives considered so far?
  • CQ3 – Inquiry status: Have we looked hard enough (could more inquiry flip our judgment)?
  • CQ4 – Prudence: Should we withhold conclusion and investigate further before accepting A? 
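
CQ2 and CQ4 are inherently comparative, and a toy calculation shows why an explanation that fits well can still lose: posterior odds weigh explanatory fit (the likelihood) against base rates (the priors). All numbers below are hypothetical.

```python
# Toy abductive comparison: which cause better explains the observed effect B?
p_b_given_a1, prior_a1 = 0.8, 0.1   # A1 explains B well but is rare
p_b_given_a2, prior_a2 = 0.3, 0.4   # rival A2 fits worse but is far more common

posterior_odds = (p_b_given_a1 * prior_a1) / (p_b_given_a2 * prior_a2)
print(f"posterior odds A1:A2 = {posterior_odds:.2f}")   # ~0.67: the rival wins
```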

3) Argument from Correlation to Cause (Corr→C)

Canonical move (heuristic):
A and B are correlated.
Therefore, (tentatively) A causes B (or B causes A, or there’s a causal link).

This is a short, defeasible bridge from association to a causal hypothesis that then needs bolstering.

What to look for (evaluation criteria):

  • Reality/robustness of the correlation (replication, effect size, sampling).
  • Directionality and temporal precedence (does putative cause precede effect?).
  • Non-spuriousness (rule out third variables/common causes; avoid mere coincidence).
  • Mechanistic plausibility (is there a credible pathway from A to B?). 

Standard Critical Questions (CQs):

  • CQ1 – Reality: Is there really a correlation between A and B?
  • CQ2 – Coincidence: Is the correlation more than just coincidence?
  • CQ3 – Third factor: Could some third factor C be causing both A and B?
  • (Extended lists add temporal order and mechanism checks, but these three are the widely used core.) 

Walton explicitly brings the Bradford Hill "considerations" into his treatment of arguments from correlation to causation, treating them as a bank of critical questions for evaluating presumptive causal claims (he lists Hill's items in the correlation→causation chapter of Argument Evaluation and Evidence):

  1. Temporality – the putative cause precedes the effect.
    • Limits: necessary but not sufficient; onset lags and feedback loops complicate timing.
    • Extend: represent timing explicitly in a DAG or study protocol (avoid immortal-time bias via target-trial emulation).

  2. Strength (effect size) – larger associations are harder to dismiss as bias.
    • Limits: confounding can inflate/deflate effect sizes; small true effects exist.
    • Extend: use bias analysis/sensitivity analyses; triangulate across designs. 

  3. Consistency (reproducibility) – seen across studies, settings, methods.
    • Limits: heterogeneity can reflect real effect-modification, not error.
    • Extend: plan for heterogeneity (subgroup DAGs), and weigh differences across designs in triangulation. 

  4. Specificity – a cause leads to a single effect (or a very specific pattern).
    • Limits: rarely holds for multifactorial diseases; historically overemphasized.
    • Extend: replace with pattern specificity: look for distinctive constellations predicted by mechanisms. 

  5. Biological gradient (dose–response) – more exposure → more effect.
    • Limits: thresholds, U-shapes, saturation; exposure misclassification.
    • Extend: model nonlinearity; use negative controls and quantitative bias analysis.

  6. Plausibility – is there a credible mechanism?
    • Limits: theory-laden and time-bound (today’s “implausible” can be tomorrow’s accepted biology).
    • Extend: pair difference-making evidence with mechanistic evidence (Russo–Williamson thesis). 

  7. Coherence – fits with what else we know (lab, natural history, theory).
    • Limits: “coherence” can be vague; risk of confirmation bias.
    • Extend: make coherence testable by encoding background knowledge in DAGs and checking for implied conditional independencies (a code sketch follows this list).

  8. Experiment – manipulation changes outcomes (e.g., RCTs, natural experiments).
    • Limits: often infeasible/unethical; trials may lack external validity.
    • Extend: target-trial emulation with observational data; exploit quasi-experiments. 

  9. Analogy – by similarity to known causal relations.
    • Limits: weakest and most subjective; easy to cherry-pick analogues.
    • Extend: specify which similarities matter and test their implications (structured analogical mapping). 
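
To make item 7's extension concrete: a DAG implies specific conditional independencies, and each one is a checkable prediction. Here is a sketch using networkx (assumes networkx >= 3.3 for is_d_separator; the DAG is a hypothetical version of the air-pollution example used later in this post):

```python
import networkx as nx  # assumes networkx >= 3.3 for is_d_separator

# Hypothetical DAG: weather confounds PM2.5 and ER visits; holidays shift ER visits only
g = nx.DiGraph([
    ("weather", "pm25"), ("weather", "er_visits"),
    ("pm25", "er_visits"), ("holiday", "er_visits"),
])

# Implied by the DAG: holiday is marginally independent of pm25...
print(nx.is_d_separator(g, {"holiday"}, {"pm25"}, set()))          # True
# ...but NOT independent after conditioning on the collider er_visits
print(nx.is_d_separator(g, {"holiday"}, {"pm25"}, {"er_visits"}))  # False
# Every True above is a testable conditional independence to check against data.
```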

Walton’s causal argumentation schemes are defeasible and come with critical questions. Bradford Hill’s considerations plug in naturally as such questions when someone argues from correlation to cause:
  • “Does cause precede effect?” (Temporality)
  • “How strong/consistent is the association across settings?” (Strength, Consistency)
  • “Could a third factor explain both?” (Strength/Consistency via confounding)
  • “Is there a dose–response? A plausible mechanism? Coherence with other evidence?” (Gradient, Plausibility, Coherence)
  • “Any experimental or quasi-experimental confirmation? Are there relevant analogies?” (Experiment, Analogy)
  • Walton’s chapter explicitly points students to Hill (1965) when assessing correlation→causation moves. 

Practical limitations (why Hill alone isn’t enough)

  • Confounding, selection, and collider bias can satisfy several considerations spuriously; DAGs help diagnose these risks (a small simulation below illustrates the collider case).
  • Over-formalizing the list as a pass/fail test misses its heuristic intent. Modern reviews emphasize using Hill as guidance within a broader causal framework.
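
To illustrate the first point, a small numpy simulation: two causes that are truly independent become correlated once we select on a common effect, so an analyst who stratifies on a collider can "find" an association that then spuriously satisfies strength and consistency. Variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
a = rng.normal(size=n)                   # cause 1, independent of cause 2 by construction
b = rng.normal(size=n)                   # cause 2
collider = a + b + rng.normal(size=n)    # common effect of both

print(np.corrcoef(a, b)[0, 1])           # ~0.00: no marginal association
selected = collider > 1.0                # selecting/stratifying on the collider
print(np.corrcoef(a[selected], b[selected])[0, 1])   # clearly negative: spurious
```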

Bradford Hill's considerations can also be modernized with sensible extensions that incorporate subsequent advances in the causal-inference literature:
  1. Make the causal question explicit and design to answer it: Use the target-trial emulation playbook for observational studies (eligibility, treatment strategies, time zero, outcomes, estimand).
  2. Represent assumptions with DAGs: Encode background knowledge, identify confounding/selection structures, and derive testable implications that operationalize “coherence.” 
  3. Triangulate across methods and biases: Combine evidence differing in key sources of bias (e.g., MR, natural experiments, cohorts, case-crossovers) to strengthen inference.
  4. Blend mechanisms + difference-making: Use the Russo–Williamson insight: require both probabilistic/difference-making evidence and mechanistic support, instead of treating “plausibility” as a soft afterthought.
  5. Bias-aware quantification: Run routine sensitivity analyses for unmeasured confounding and measurement error alongside effect estimates; this refines “strength” and “consistency” (see the E-value sketch after this list). 
  6. Keep Hill as critical questions: Treat each item as a Walton-style CQ attached to the correlation→cause scheme; use them to structure inquiry rather than to “grade” causality. (This is exactly how Walton deploys them pedagogically.) 
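
To make item 5 concrete, the E-value of VanderWeele & Ding (2017) answers: how strong would an unmeasured confounder have to be, on the risk-ratio scale, with both exposure and outcome, to fully explain away the observed association? A minimal sketch (the example inputs are hypothetical):

```python
import math

def e_value(rr: float) -> float:
    """E-value for an observed risk ratio (VanderWeele & Ding 2017)."""
    if rr < 1:                          # protective effects: invert first
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))

print(round(e_value(1.8), 2))   # 3.0: a confounder needs RR >= 3 with both
                                # exposure and outcome to explain this away
print(round(e_value(1.1), 2))   # 1.43: a weak association is much more fragile
```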

Now, here is a proposed "Causal Inquiry Dialogue", modeled after Walton's Inquiry Dialogue and intended to capture the dynamics of causal inquiry across disciplines:

Causal Inquiry Dialogue (CID)

1) Purpose (telos) & product

  • Goal: arrive at a warranted causal claim (or non-claim) about a well-specified effect of a well-specified cause, under explicit assumptions and with stated uncertainty.
  • Product: a conclusion tagged with (a) the estimand (what causal quantity), (b) scope (population, time, context), (c) assumptions & design, (d) robustness (sensitivity, rival explanations), and (e) status (accept/reject/suspend).
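
One way to keep that product honest is to treat it as a structured record rather than prose. Here is a minimal sketch of such a "warrant ledger" in Python (field names are mine, keyed to (a)-(e) above; the example values echo the running example developed later in this post):

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    ACCEPT = "accept (provisional)"
    REJECT = "reject"
    SUSPEND = "suspend"

@dataclass
class WarrantLedger:
    estimand: str            # (a) what causal quantity
    scope: str               # (b) population, time, context
    assumptions: list[str]   # (c) assumptions & design
    robustness: list[str]    # (d) sensitivity results, rivals addressed
    status: Status           # (e) accept / reject / suspend
    next_tests: list[str] = field(default_factory=list)

ledger = WarrantLedger(
    estimand="ATT of traffic restrictions on ER visits within 7 days",
    scope="Adults 18-65, County Z, 2016-2020",
    assumptions=["parallel trends", "no cross-border migration"],
    robustness=["pre-trends pass", "placebo outcomes null"],
    status=Status.ACCEPT,
    next_tests=["IV using refinery outages"],
)
```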

2) Initial situation

  • Anomaly, association, or policy problem triggers the dialogue.
  • Participants share incomplete knowledge and agree to rules that prioritize discovery over winning.

3) Roles (can be distributed across 2+ parties)

  • Proponent: advances a causal hypothesis and identification strategy.
  • Challenger: raises targeted doubts, counter-hypotheses, and tests.
  • Methodologist (optional role, often shared): vets design/assumptions and proposes diagnostics.
  • Mechanism expert (optional): articulates or tests mechanistic pathways.

4) Key commitments set in the Opening stage

  • Question clarity: state the causal question in counterfactual terms and name the estimand (e.g., ATE/CATE, effect of treatment on the treated, etc.).
  • Target trial/target system: define time-zero, eligibility, treatment strategies, outcomes, follow-up, and causal contrast.
  • Model sketch: present an initial DAG or structured mechanism showing confounders, mediators, colliders.
  • Burden of proof: proponent carries the forward burden (positive case); challengers carry a specific defeater burden (to point to concrete alternative mechanisms, biases, or tests).

5) Stages (specializing pragma-dialectics)

Stage I — Problem Formulation

  • Moves: propose_hypothesis(H), state_estimand(ψ), define_population(P), define_time_zero(T0).
  • Rule CID-1 (Clarity): no arguments about “causation” without a named estimand, population, and time-zero.

Stage II — Modeling & Identification

  • Moves: propose_DAG(G), justify_identification(I) (randomization, IV, DiD, RD, g-methods, etc.), list_assumptions(A).
  • Rule CID-2 (Identification): proponent must present a legible identification strategy and its assumptions; no hidden estimand drift.
  • Rule CID-3 (Comparative space): acknowledge plausible rival hypotheses and pathways.

Stage III — Evidence & Testing

  • Moves: propose_design(D), present_evidence(E), run_diagnostic(QC) (placebo/negative controls, balance checks, pre-trend checks, falsification tests, sensitivity/E-values).
  • Rule CID-4 (Relevance & Quality): evidence must be probative for the stated estimand under G and A (no p-hacking, data dredging, or post-hoc model switching without disclosure).
  • Rule CID-5 (Robustness): report sensitivity to key unverifiable assumptions (unmeasured confounding, measurement error, model misspecification).

Stage IV — Mechanisms & Coherence

  • Moves: propose_mechanism(M), predict_signature(S) (dose-response, lags, subgroup patterns), check_coherence(K) with background knowledge.
  • Rule CID-6 (Mechanistic–probabilistic pairing): difference-making evidence should be paired with mechanistic articulation (even if partial) or the conclusion stays provisional.

Stage V — Comparative Appraisal & Rival Explanations

  • Moves: table_rivals(R1..Rn), compare_fit(C), choose_best_explanation(BE).
  • Rule CID-7 (Best-explanation discipline): address live rivals; “victory by default” is disallowed.

Stage VI — Conclusion & Reporting

  • Moves: accept / reject / suspend, qualify_scope, state_uncertainty, declare_limits & next tests.
  • Rule CID-8 (Transparency & Scope): publish the warrant ledger (what supports the claim) and external validity limits.

6) Permitted dialogue moves (locutions) & their commitments

  • assert_association(A,B) → commit to reproducible measurement and design details.
  • propose_DAG(G) → commit to edges/omissions as working assumptions open to targeted challenge.
  • challenge_edge(X→Y) → must specify the basis (confounder, collider, measurement, selection).
  • propose_test(T) / request_sensitivity(S) → the other side must either perform it (if feasible) or justify non-feasibility.
  • propose_trial_or_quasi(E) → moves the dialogue toward experimentation when observational inference stalls.
  • assert_extrapolation(EXT) → requires a bridging argument for transportability (similarity of mechanisms/distributions).
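
These locutions and their commitment rules amount to a protocol, so they can be sketched as a tiny commitment-store machine. A minimal Python sketch (the rule strings paraphrase the list above; everything else is hypothetical), enforcing the requirement that challenge_edge name its basis:

```python
from dataclasses import dataclass, field

# Commitment each locution incurs, paraphrasing the list above
COMMITMENTS = {
    "assert_association": "reproducible measurement and design details",
    "propose_DAG": "edges/omissions as working assumptions, open to challenge",
    "challenge_edge": "a named basis: confounder, collider, measurement, selection",
    "propose_test": "other side performs it if feasible, or justifies non-feasibility",
    "propose_trial_or_quasi": "moving the dialogue toward experimentation",
    "assert_extrapolation": "a bridging argument for transportability",
}

@dataclass
class Dialogue:
    commitments: dict[str, list[str]] = field(default_factory=dict)

    def move(self, speaker: str, locution: str, content: str, basis: str | None = None):
        if locution == "challenge_edge" and basis is None:
            raise ValueError("challenge_edge must specify its basis")
        entry = f"{locution}({content}) -> committed to {COMMITMENTS[locution]}"
        self.commitments.setdefault(speaker, []).append(entry)

d = Dialogue()
d.move("proponent", "propose_DAG", "weather -> pm25 -> er_visits")
d.move("challenger", "challenge_edge", "pm25 -> er_visits", basis="confounder: mobility")
print(*d.commitments["challenger"], sep="\n")
```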

7) Evaluation grid: Critical Questions specialized for causality

Attach these CQs at the points where they’re most discriminating (they operationalize Bradford Hill-style concerns without score-keeping):

  1. Temporality CQ: Is time-zero defined, and does exposure precede effect with plausible lags?
  2. Identification CQ: Under G and A, is ψ identified? What assumptions are unverifiable?
  3. Confounding CQ: Which unmeasured factors could generate the association, and how strong must they be (sensitivity/E-value)?
  4. Design CQ: Do design diagnostics pass (balance, parallel trends, bandwidth/placebo tests, instrument strength/exclusion)?
  5. Robustness CQ: Do results survive alternative specifications, samples, and measures?
  6. Mechanism CQ: Is there an articulated pathway predicting distinctive signatures (dose–response, subgroup, temporal pattern) that we observe?
  7. Coherence CQ: Are implications consistent with other data/modalities (lab, quasi-experiments, natural history)?
  8. Rivals CQ: What are the best rival explanations and how do they fare on fit and testability?
  9. Transportability CQ: What changes if we move population/context; which assumptions support extrapolation? (A toy reweighting example follows this list.)
  10. Decision CQ (if policy-relevant): Given current warrant and uncertainty, what are the consequences of acting vs. waiting?
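
The Transportability CQ can sometimes be answered quantitatively: if the effect varies across an effect-modifier whose distribution differs between source and target, reweight the subgroup effects to the target's distribution. A toy post-stratification sketch (all numbers hypothetical):

```python
# Hypothetical subgroup (CATE) estimates from the source study
cate = {"high_exposure": -0.10, "low_exposure": -0.02}

source_shares = {"high_exposure": 0.3, "low_exposure": 0.7}
target_shares = {"high_exposure": 0.6, "low_exposure": 0.4}

ate_source = sum(cate[g] * source_shares[g] for g in cate)
ate_target = sum(cate[g] * target_shares[g] for g in cate)   # transported estimate

print(f"source ATE {ate_source:+.3f} -> transported ATE {ate_target:+.3f}")
# The bridge assumption doing the work: the subgroup effects themselves
# must carry over to the target population.
```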

8) Typical derailments (fallacies-as-violations)

  • Post hoc ergo propter hoc (violates CID-1 & Temporality CQ).
  • Confounding by indication / selection bias (violates CID-2/3).
  • Collider conditioning (violates CID-2 by corrupting identification).
  • P-hacking / garden of forking paths (violates CID-4).
  • Mechanism hand-waving (violates CID-6; claim remains provisional).
  • Rival neglect (violates CID-7).
  • Target drift (quietly changing estimand/design mid-stream; violates CID-2 & CID-8).
  • Illicit dialogue shift (sliding from inquiry into eristic or policy advocacy without declaring a shift and its rules).

9) Stopping rules (how to close the dialogue)

  • Accept (provisional): ψ is identified; diagnostics satisfactory; rivals addressed; mechanism articulated (or explicitly limited); uncertainty quantified; scope stated.
  • Suspend: open defeaters remain or diagnostics fail; specify the next most probative test.
  • Reject: identification breaks, key diagnostics fail, or a rival clearly dominates.

10) How this extends Walton + pragma-dialectics

  • Keeps inquiry’s discovery telos but adds causal-specific rules (estimand clarity, DAG/identification discipline, sensitivity obligations, rival management, and mechanism pairing).
  • Preserves the pragma-dialectical spirit (burden of proof, relevance, clarity, closure) with domain-specific instantiations that make “rationality” depend on good causal practice in a cooperative exchange.

11) Minimal “protocol card” you can use

  1. State ψ, P, T0 → what effect, on whom, from when?
  2. Show G, name I → DAG + identification strategy.
  3. Run QC → diagnostics + sensitivity/negative controls.
  4. Predict signatures → dose/lag/subgroups; check them.
  5. Confront rivals → articulate, test, compare.
  6. Conclude with scope & uncertainty → accept/suspend/reject; next test.

Below is a more detailed description of the function declarations listed above: moves 1-4 cover problem formulation, 5-7 modeling & identification, 8-10 evidence & testing, 11-13 mechanisms & coherence, 14-16 comparative appraisal, 17-20 conclusions & reporting, and 21-26 cross-cutting challenge/response moves. How to use each in practice:
  1. propose_hypothesis(H)
    • Purpose: Put a causal claim on the table in plain language.
    • When: Opening of inquiry.
    • Inputs: Cause, effect, context (e.g., “X increases Y among P during T”).
    • Output/Commitments: Commit to making the claim testable (estimand, design, diagnostics).
    • Acceptance checks: Clear, falsifiable, time-anchored.
    • CQs: Directional? Time-ordered? Contextualized?
    • Pitfalls: Vagueness (“X affects Y somehow”).
    • Example: “Traffic-related PM2.5 increases asthma ER visits among adults in County Z, 2016–2020.”
  2. state_estimand(Ψ)
    • Purpose: Turn the hypothesis into a named causal quantity.
    • When: Immediately after H.
    • Inputs: Target population, exposure contrast, outcome, time-zero, follow-up, summary (ATE/CATE/ATT).
    • Output/Commitments: A precise counterfactual query you will identify/estimate.
    • Acceptance checks: Formally writable (e.g., E[Y(1)−Y(0)] over window W).
    • CQs: Is the contrast precise? Are time & population fixed?
    • Pitfalls: Estimand drift later.
    • Example: “Ψ = ATT of odd-even traffic restrictions on ER visits within 7 days among registered drivers.”
  3. define_population(P)
    • Purpose: Lock in who Ψ is about.
    • When: With Ψ.
    • Inputs: Inclusion/exclusion, geography, time, eligibility.
    • Output/Commitments: Sampling frame; transportability boundaries.
    • Acceptance checks: Replicable inclusion criteria.
    • CQs: Any selection mechanisms tied to exposure/outcome?
    • Pitfalls: Convenience samples inducing bias.
    • Example: “Adults 18–65 in County Z, continuously insured, 2016–2020.”
  4. define_time_zero(T0)
    • Purpose: Fix the start of risk/measurement (avoid immortal-time bias).
    • When: With Ψ/P.
    • Inputs: Operational timestamp for exposure assignment & follow-up.
    • Output/Commitments: All timing claims become checkable.
    • Acceptance checks: Exposure precedes outcome; lags specified.
    • CQs: Is temporality satisfied for everyone in P?
    • Pitfalls: Post-exposure covariates treated as baseline.
    • Example: “T0 = 00:00 on restriction day; outcomes over next 7 days.”
  5. propose_DAG(G)
    • Purpose: Externalize assumed causal structure.
    • When: Early modeling.
    • Inputs: Nodes (exposure, outcome, covariates), edges (assumed arrows).
    • Output/Commitments: You own edges & omissions; open to targeted challenge.
    • Acceptance checks: G justifies a concrete adjustment/strategy.
    • CQs: Confounders, mediators, colliders correctly classified?
    • Pitfalls: Post-treatment adjustment; omitted common causes.
    • Example: Weather & mobility → PM2.5 & ER; PM2.5 → ER; no ER → PM2.5.
  6. justify_identification(I)
    • Purpose: Show how Ψ is identified under G + assumptions.
    • When: After G.
    • Inputs: Strategy (RCT, IV, RD, DiD, panel FE, g-methods, matching, TMLE, etc.) with identification conditions.
    • Output/Commitments: Map assumptions → estimand; diagnostic plan.
    • Acceptance checks: Clear conditions (exchangeability, positivity/SUTVA; IV relevance/exclusion; RD continuity; DiD parallel trends).
    • CQs: Are conditions plausible? How will you check them?
    • Pitfalls: “Black-box” ML with no identification story.
    • Example: DiD with city×week FE; 24-week pre-trends; matched control cities.
  7. list_assumptions(A)
    • Purpose: Make hidden levers visible.
    • When: With I.
    • Inputs: Testable & untestable assumptions; measurement & linkage assumptions.
    • Output/Commitments: Each assumption gets a diagnostic or sensitivity plan.
    • Acceptance checks: Feasible tests or reasoned defense for untestables.
    • CQs: Which assumption, if broken, flips the conclusion?
    • Pitfalls: Hand-waving (“no unmeasured confounding”).
    • Example: No cross-border migration this week; wind-shift IV affects ER only via PM2.5.
  8. propose_design(D)
    • Purpose: Commit to a concrete empirical design before results.
    • When: Pre-analysis / design registration.
    • Inputs: Data sources, inclusion rules, variables, transformations, windows, bandwidths, models.
    • Output/Commitments: A design others can reproduce.
    • Acceptance checks: Pre-specification or justified deviations.
    • CQs: Is D aligned with Ψ and I?
    • Pitfalls: Garden of forking paths; hidden post-hoc tweaks.
    • Example: Synthetic control; donor pool 20 counties; outcome = daily ER rate; covariates = weather, holidays.
  9. present_evidence(E)
    • Purpose: Put results on the table (estimates + uncertainty).
    • When: After D executed.
    • Inputs: Point estimates, intervals, diagnostics, robustness tables/figures.
    • Output/Commitments: Accept scrutiny relative to D, I, A, G, Ψ.
    • Acceptance checks: Traceable to design; uncertainty quantified; code/metadata if practical.
    • CQs: Consistent with identification diagnostics?
    • Pitfalls: Reporting only favorable specs; unit confusion.
    • Example: ATT = −6.2% (95% CI −9.8, −2.5); pre-trend p=0.62; placebo policies null.
  10. run_diagnostic(QC)
    • Purpose: Execute validity checks specific to I.
    • When: Alongside E.
    • Inputs: Balance/pre-trends, negative controls, IV F-stat, RD density, sensitivity (E-values), etc.
    • Output/Commitments: Abide by diagnostic implications (revise/suspend if they fail).
    • Acceptance checks: Pre-specified or justified; adequate power.
    • CQs: Do diagnostics support key assumptions?
    • Pitfalls: Underpowered/irrelevant tests; ignoring failures.
    • Example: RD McCrary p=0.47; IV first-stage F=28; negative-control outcome (fractures) null. (A toy simulation combining present_evidence and run_diagnostic appears after this list.)
  11. propose_mechanism(M)
    • Purpose: Articulate a causal pathway (even partial) that makes predictions.
    • When: After initial E (or earlier if known).
    • Inputs: Biological/behavioral/economic mechanism; intermediates.
    • Output/Commitments: Mechanism-linked, testable implications.
    • Acceptance checks: Compatible with G and E.
    • CQs: Intermediates measurable? Implied lags/dose/subgroups?
    • Pitfalls: Vague “plausibility” with no predictions.
    • Example: Inflammatory pathways; expect lag 0–2 days; stronger in COPD subgroup.
  12. predict_signature(SIG)
    • Purpose: Turn M into observable signatures.
    • When: With M.
    • Inputs: A-priori patterns (dose–response, lags, subgroup/geo gradients).
    • Output/Commitments: Agree that non-appearance weakens the claim.
    • Acceptance checks: Predictions precise enough to test.
    • CQs: Are signatures unique to M or shared with rivals?
    • Pitfalls: Post-hoc signature invention.
    • Example: Stronger effects on high-exposure commuting days; no effect on fractures.
  13. check_coherence(K)
    • Purpose: Integrate with external strands of evidence.
    • When: After E & SIG.
    • Inputs: Lab/toxicology, quasi-experiments, history, mechanistic literature.
    • Output/Commitments: Place finding in broader web; explain inconsistencies.
    • Acceptance checks: Citations & comparability discussed; conflicts acknowledged.
    • CQs: Any discordant high-quality results? Why?
    • Pitfalls: Cherry-picking supportive studies.
    • Example: Animal models show inflammation within 24h; UK congestion-charge study shows similar ER reduction.
  14. table_rivals(R₁…Rₙ)
    • Purpose: Lay out live alternative explanations side-by-side.
    • When: Before concluding.
    • Inputs: Confounding, measurement error, selection, alternative causes, reverse causation.
    • Output/Commitments: Each rival gets proposed tests/diagnostics.
    • Acceptance checks: Rivals are plausible (not straw versions).
    • CQs: Which rival best fits residual patterns?
    • Pitfalls: Ignoring the serious rival.
    • Example: R1: heat waves; R2: care-seeking changes; R3: coding changes.
  15. compare_fit(COMP)
    • Purpose: Evaluate main hypothesis vs. rivals on fit & testability.
    • When: After R₁…Rₙ.
    • Inputs: Likelihood/posteriors, predictive checks, out-of-sample performance, qualitative pattern match.
    • Output/Commitments: Transparent scoring or narrative with criteria.
    • Acceptance checks: Uses pre-agreed criteria or justified ex post.
    • CQs: Does any rival explain signatures better?
    • Pitfalls: Changing metrics mid-stream.
    • Example: Heat waves fail (effects persist in cool weeks); care-seeking fails placebo tests.
  16. choose_best_explanation(BE)
    • Purpose: Make the abductive choice, or suspend.
    • When: End of appraisal.
    • Inputs: COMP outcome; decision threshold reflecting stakes.
    • Output/Commitments: Reasoned, defeasible selection; or suspension with next tests.
    • Acceptance checks: Clear rationale tied to diagnostics & signatures.
    • CQs: Risk of premature closure?
    • Pitfalls: Victory by default (rivals not addressed).
    • Example: Adopt main hypothesis provisionally; rivals underperform on pre-specified diagnostics.
  17. accept / reject / suspend
    • Purpose: Close the dialogue honestly.
    • When: After BE.
    • Inputs: Evidence grade, diagnostics, rival status.
    • Output/Commitments: If accept: provisional & scoped; if suspend: name next test; if reject: explain failure.
    • Acceptance checks: Closure matches warrant.
    • CQs: Are you over-claiming?
    • Pitfalls: Treating “accept” as certainty; or never closing.
    • Example: Accept (provisional): −6% effect; next: mechanism sub-study.
  18. qualify_scope(SCOPE)
    • Purpose: State where the claim applies.
    • When: With closure.
    • Inputs: Population, time, setting; transportability limits.
    • Output/Commitments: Boundaries for reuse/policy.
    • Acceptance checks: Matches P and data support.
    • CQs: Any reason scope would shrink/expand?
    • Pitfalls: Over-generalization.
    • Example: Urban counties with similar traffic mix; 2010s vehicle fleet.
  19. state_uncertainty(UQ)
    • Purpose: Make uncertainty first-class (statistical + structural).
    • When: With closure.
    • Inputs: CIs/posteriors; sensitivity ranges; model dependence; unknowns.
    • Output/Commitments: Honest map of confidence and fragility.
    • Acceptance checks: Quantified where possible; qualitative where needed.
    • CQs: What single assumption, if wrong, flips the sign?
    • Pitfalls: Reporting only sampling error.
    • Example: If unmeasured confounder RR 2.0 with 15% prevalence difference exists, effect may vanish.
  20. declare_limits_and_next_tests(NEXT)
    • Purpose: Record what remains uncertain and the next best test.
    • When: Final step.
    • Inputs: Open defeaters; feasible designs.
    • Output/Commitments: Concrete plan (experiment, new data, quasi-design).
    • Acceptance checks: Next test would truly discriminate.
    • CQs: Is the next step proportionate to stakes?
    • Pitfalls: Vague “more research needed.”
    • Example: IV using refinery outages; mechanistic biomarker panel in COPD clinic.
  21. assert_association(A,B)
    • Purpose: Put a descriptive association on record (not yet causal).
    • When: Early evidence marshalling.
    • Inputs: Estimand-free correlation/regression with proper denominators/weights.
    • Output/Commitments: Full description of how measured; ready for stress-tests.
    • Acceptance checks: Replicability; robustness to basic spec changes.
    • CQs: Is it real (not artifact)?
    • Pitfalls: Implicit causal spin.
    • Example: Pearson r=0.31 across 200 days; Spearman 0.29.
  22. challenge_edge(X→Y)
    • Purpose: Target a specific arrow in G.
    • When: After DAG proposed.
    • Inputs: Missing confounder, wrong direction, collider path, measurement error.
    • Output/Commitments: Challenger offers a concrete alternative or test.
    • Acceptance checks: Connects to data/design or literature.
    • CQs: Would change alter identification?
    • Pitfalls: Generic skepticism.
    • Example: Mobility is a common cause of PM2.5 and ER; omitting it biases ATT.
  23. propose_test(T)
    • Purpose: Add a discriminating check.
    • When: Any time a live uncertainty is identified.
    • Inputs: Diagnostic, required data, expected pattern if H vs. rival.
    • Output/Commitments: Other party runs it if feasible, or justifies infeasibility.
    • Acceptance checks: Test truly discriminates; adequate power.
    • CQs: What result would falsify?
    • Pitfalls: Non-diagnostic tests.
    • Example: Add placebo outcome (appendicitis). Effect should be null.
  24. request_sensitivity(S)
    • Purpose: Quantify robustness to unmeasured threats.
    • When: After initial E.
    • Inputs: E-values, Rosenbaum bounds, bias-factor grids; parameter ranges.
    • Output/Commitments: Run & report; interpret.
    • Acceptance checks: Transparent parameterization; realistic ranges.
    • CQs: Are requested ranges realistic?
    • Pitfalls: Cherry-picking benign ranges.
    • Example: Report E-value for point estimate and CI bound.
  25. propose_trial_or_quasi(XP)
    • Purpose: Escalate to intervention (RCT) or strong quasi-experiment if feasible.
    • When: When observational designs plateau.
    • Inputs: Sketch of randomization or natural experiment; ethics/logistics.
    • Output/Commitments: Consider seriously (or justify why not).
    • Acceptance checks: Would answer Ψ with fewer assumptions.
    • CQs: Is XP ethical, timely, powered?
    • Pitfalls: Dismissing feasible experiments.
    • Example: Randomize congestion-pricing start across districts.
  26. assert_extrapolation(EXT)
    • Purpose: Argue that findings transport to new settings.
    • When: After acceptance/scope.
    • Inputs: Bridge assumptions; similarities/differences; reweighting/transport formulas if used.
    • Output/Commitments: State what must hold in the target to carry over Ψ.
    • Acceptance checks: Structural & distributional alignment argued or shown.
    • CQs: Which differences would break transport?
    • Pitfalls: Hand-wavy generalization.
    • Example: Effect transports to City Q because fleet mix, baseline and compliance are similar; reweighted estimate shown.
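
To see present_evidence and run_diagnostic working together, here is a toy simulation of the DiD design sketched in moves 6-10: estimate a known effect from simulated panel data, then run a placebo check on the pre-period. Everything here is simulated; the -0.06 effect merely echoes the running example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods, t_star = 200, 20, 10
treated = np.arange(n_units) < n_units // 2        # half the units treated at t_star
post = np.arange(n_periods) >= t_star
true_att = -0.06                                   # hypothetical effect

unit_fe = rng.normal(0.0, 0.5, n_units)[:, None]     # unit fixed effects
time_fe = rng.normal(0.0, 0.2, n_periods)[None, :]   # common shocks
y = (unit_fe + time_fe
     + true_att * (treated[:, None] & post[None, :])
     + rng.normal(0.0, 0.3, (n_units, n_periods)))

def did(y, treated, post):
    """Difference-in-differences on group-by-period means."""
    d_treated = y[treated][:, post].mean() - y[treated][:, ~post].mean()
    d_control = y[~treated][:, post].mean() - y[~treated][:, ~post].mean()
    return d_treated - d_control

print("ATT estimate:", round(did(y, treated, post), 3))   # close to -0.06
# run_diagnostic: placebo "treatment" at period 5, using only pre-period data
placebo_post = np.arange(t_star) >= 5
print("placebo estimate:", round(did(y[:, :t_star], treated, placebo_post), 3))  # ~0
```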

Common Assumptions

  1. Consistency (Well-defined Interventions)
    • The observed outcome under the treatment actually received equals the potential outcome for that treatment.
    • Requires treatments to be clearly defined and consistently applied.
  2. Exchangeability (No Unmeasured Confounding / Ignorability)
    • Given observed covariates, treatment assignment is independent of potential outcomes.
    • After adjusting for covariates, treated and untreated groups are comparable as if randomized.
  3. Positivity (Overlap / Common Support)
    • Every individual has a positive probability of receiving each treatment level, given their covariates.
    • Without overlap, causal effects cannot be compared across certain subgroups (see the propensity-overlap sketch after this list).
  4. Stable Unit Treatment Value Assumption (SUTVA)
    • Each individual’s potential outcome depends only on their own treatment, not on others’ treatments (no interference).
    • No hidden versions of the treatment.
  5. Correct Model Specification (when using parametric models)
    • The statistical model is correctly specified (e.g., functional form, distributional assumptions).
    • Nonparametric approaches rely less on this assumption.
  6. Additional Method-Specific Assumptions
    • Instrumental Variables (IV): instrument relevance, exclusion restriction, and monotonicity.
    • Difference-in-Differences (DiD): parallel trends assumption.
    • Regression Discontinuity (RD): continuity of potential outcomes at the cutoff.
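
Positivity is directly checkable: estimate propensity scores and look for regions of the covariate space with little or no overlap. A minimal sketch with scikit-learn (data simulated; the 0.02 trimming threshold is a common convention, not a rule):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))                             # observed covariates
p = 1 / (1 + np.exp(-(2.5 * x[:, 0] - 0.5 * x[:, 1])))  # strong covariate -> poor overlap
a = rng.binomial(1, p)                                  # treatment assignment

ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]

eps = 0.02
flagged = (ps < eps) | (ps > 1 - eps)
print(f"propensity range: [{ps.min():.3f}, {ps.max():.3f}]")
print(f"share with near-violations of positivity: {flagged.mean():.1%}")
```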

Concluding Remarks

In practice, a causal inquiry dialogue generally unfolds as follows:
  1. Formulate: propose_hypothesis → state_estimand → define_population/define_time_zero.
  2. Model/Identify: propose_DAG → justify_identification → list_assumptions.
  3. Design/Estimate: propose_design → run → present_evidence + run_diagnostic.
  4. Mechanize & Compare: propose_mechanism → predict_signature → check_coherence → table_rivals → compare_fit → choose_best_explanation.
  5. Close: accept/reject/suspend + qualify_scope + state_uncertainty + declare_limits_and_next_tests.
  6. At any point: opponents deploy challenge_edge, propose_test, request_sensitivity, propose_trial_or_quasi, assert_extrapolation.


