Clarifying Scientific Concepts Part 2: Theories

I am not going to focus on any particular theory. I just want to consider, in general, what it means to theorize in a scientific setting, how this activity differs from something like philosophical theorizing, and how both activities differ from the public's understanding of the term.

In the broadest sense of the term, a theory is a structured way of understanding, interpreting, or explaining phenomena. It provides a conceptual framework, a network of ideas that helps us make sense of observations, connect patterns, and predict or interpret outcomes. Theorizing is something humans do all the time; often when you are trying to explain something, you are assuming some underlying theory (although it's normally implicit and not fully structured). Theorizing, in the broadest sense, is any process of pattern finding, meaning making, or framework building. It’s the creative and interpretive act of connecting ideas into a coherent picture, whether the “data” are experiments, emotions, social behaviors, or symbols.

In science, theorizing takes on specific methodological and epistemic constraints. Scientific theories must be testable, falsifiable, and consistent with empirical data. They are often formalized, expressed mathematically, and aimed at predictive power. So while all scientific theories are theories, not all theories are scientific. Science narrows the broader act of theorizing into a disciplined method: empirical, systematic, and verifiable.

In philosophy, theorizing is often about conceptual analysis rather than empirical testing. Philosophical theories often deal with abstractions more removed from empirical reality; they are not tied to experimental methods but instead focus on logical entailment. They might deal with concepts like possibility and necessity. You might be eager to claim that science deals with these concepts as well, and you'd be correct: certain scientific theories entail the possibility and impossibility of various empirical outcomes. Philosophical possibility, however, is much broader, consisting of what is logically possible; in other words, its theories are "metaphysical". So you can think of scientific and philosophical theorizing as specialized, formalized subsets of the larger, more universal human capacity to theorize, just as poetry and mathematics are specialized ways of using language.

There are common components to all theories, regardless of how fleshed out the theoretical details are.

Concepts: These are the basic building blocks of a theory; they name and define the phenomena being discussed. For example, "gravity" in physics, or "motivation" in psychology. Concepts are abstractions; they simplify reality so we can think systematically about it.
Constructs: These are concepts that have been deliberately defined for a specific theoretical purpose. Constructs often can’t be directly observed but are inferred (e.g. “intelligence,” “social capital,” “self-esteem”).
Propositions: These are statements of the relationships between concepts, that is, how one thing affects or relates to another. In formal sciences, these are hypotheses; in philosophy or critical theory, they may be argumentative claims. A well-defined scientific theory generates testable hypotheses amenable to falsification.
Assumptions: These are the underlying ideas or conditions taken for granted for the theory to work. For example, in economics we often assume humans are rational decision-makers. Making assumptions explicit is key to understanding the scope and limits of a theory.
Boundaries and Scope Conditions: This is the "where and when" of a theory, what domain or context it applies to. For example, a psychological theory may explain individual behavior, not group dynamics.
Logical Structure: This is the theory's internal organization, how its pieces fit together coherently and systematically. A good theory has internal consistency and avoids contradictions.
Empirical Linkages: This is how the theory connects to observation or experience. A theory entails certain observations; these are its predictions. In science, this means operational definitions and testability.

Theorizing isn’t just coming up with ideas — it’s a disciplined, iterative process of moving between observation, abstraction, and synthesis. Across domains, you can think of it as a cycle:

Observation or Problem Identification: It starts with noticing a phenomenon, inconsistency, or puzzle. “Something interesting is happening here — why?”
Conceptualization: Identify key elements and name them. Define concepts clearly and delimit what you’re focusing on.
Relationship Mapping: Propose how these elements relate. In science, this becomes hypotheses or models. In philosophy or social theory, this becomes conceptual arguments or dialectical relations.
Integration and Abstraction: Bring multiple relationships together into a systematic framework. The theory begins to generalize — it becomes more than a list of observations.
Validation or Evaluation: In science → testing with data, replication, falsification. In interpretive or critical theory → coherence, explanatory depth, ethical and practical adequacy.
Refinement and Extension: Theories evolve as new evidence or perspectives emerge. This is the “living” nature of theory — it’s continuously reshaped.

It's better to think of a "theory" as a living organism that changes, because we are always theorizing. This involves abstracting, synthesizing, questioning, and reframing. Our theories change in light of this process.

I've been reading a lot from Paul Smaldino recently, and think his description of theory is incredibly useful. Paul Smaldino doesn’t offer a single, neat “textbook” definition of theory in the way a philosophy-of-science treatise might, but across his writings we can reconstruct how he treats and uses theories. From his published work (on modeling, methodology, philosophy of science), Smaldino’s view of theory includes the following aspects:

Decomposition into parts, properties, relationships, and dynamics: In “How to Build a Strong Theoretical Foundation,” Smaldino urges that to develop a theory of some phenomenon, one must decompose the system into relevant parts, specify the properties of those parts, articulate the relationships among them, and define how these can change over time. Thus, theory is not just a verbal or narrative statement, but a structural decomposition plus a specification of dynamics and interactions.
Theories are tools (not “Truth”): Smaldino is explicit that there is (in his view) no one “true” theory; rather, theories are evaluated by how useful they are for understanding, prediction, generalizability, and refinement. In other words, theory is pragmatic: it is judged by its capacity to guide thinking, to generate falsifiable hypotheses, to clarify assumptions, and to integrate with empirical work.
Verbal vs. formal theories / role of models: Smaldino repeatedly distinguishes verbal theories (narrative descriptions, “story-like”) from formal theories (mathematical or computational models). He argues that verbal theories are often vague, underdetermined, and thus resist strong testing or falsification. Formal models serve as instantiations of theory—they force explicit specification of assumptions, highlight omitted aspects, and allow rigorous exploration of consequences. In this view, a “good” theory is one that can be (or already is) translated into a formal model (or a family of models) that sharpen and test its claims.
Iterative and reflexive process: Smaldino sees theory construction as iterative: empirical work should refine the theory, and theory should shape what empirical questions get asked. He warns against treating data merely as support for a verbal theory; rather, data should prompt refinement, specification, or rejection of theoretical assumptions. Also, theory-building is reflexive: one must be conscious of which assumptions are built in (implicitly or explicitly), what is omitted for simplicity, and the “violence” (i.e., distortion) done to reality in modeling.
Theoretical foundation and training: Smaldino laments that many social scientists lack training in theory construction and formal modeling. In “How to Build a Strong Theoretical Foundation,” he argues for greater methodological and conceptual training so that theory is not just received (from canonical frameworks) but actively constructed. His emphasis is that theory is not peripheral—it is central. Without robust theory, methods (however sophisticated) may produce results without insight. (“Better methods can’t make up for mediocre theory.”)

So we could encapsulate his ideas with the following definition:

A theory is a deliberately constructed specification of (i) entities or components of a system, (ii) the properties and possible states of those components, (iii) the relationships and rules by which those components interact, and (iv) the temporal dynamics of how those states and relationships evolve. A strong theory is one that (a) can be formalized in mathematical or computational models, (b) offers testable predictions or counterfactuals, (c) is subject to empirical refinement, and (d) is judged not by an abstract “Truth” but by its utility in explaining, predicting, generalizing, and guiding further inquiry.
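To make the four parts of this definition concrete, here is a minimal sketch of what it looks like to turn a small verbal theory into a formal model. The toy theory ("people adopt a behavior after meeting someone who already has it") and every name in the code (`Agent`, `step`, `simulate`, `adopt_prob`) are my own illustrative assumptions, not anything taken from Smaldino's work.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    # (i) component: one person; (ii) property/state: adopted or not
    adopted: bool = False


def step(agents, adopt_prob, rng):
    """(iii) interaction rule and (iv) dynamics: one synchronous update."""
    # Read everyone's state before anyone changes, so update order is irrelevant.
    snapshot = [a.adopted for a in agents]
    for i, agent in enumerate(agents):
        if snapshot[i]:
            continue  # assumption: adoption is permanent
        partner = rng.randrange(len(agents))  # assumption: random mixing
        if snapshot[partner] and rng.random() < adopt_prob:
            agent.adopted = True


def simulate(n_agents=100, n_seed=5, adopt_prob=0.5, n_steps=30, seed=0):
    """Return the number of adopters at each time step, starting at step 0."""
    rng = random.Random(seed)
    agents = [Agent(adopted=(i < n_seed)) for i in range(n_agents)]
    history = [sum(a.adopted for a in agents)]
    for _ in range(n_steps):
        step(agents, adopt_prob, rng)
        history.append(sum(a.adopted for a in agents))
    return history


if __name__ == "__main__":
    print(simulate())
```

Notice how writing it down forces choices the verbal version never mentioned: whether adoption is permanent, who meets whom, and whether updates happen all at once. Swapping any of those assumptions (say, meetings restricted to neighbors on a network) gives a different model of the same underlying theory.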

In his book “Modeling Social Behavior: Mathematical and Agent-Based Models of Social Dynamics and Cultural Evolution”, he defines theory as:

"... a set of assumptions upon which hypotheses derived from that theory must depend. Strong theories allow us to generate clear and falsifiable hypotheses."

Distinguishing it from a theoretical framework:

“A theoretical framework is a broad collection of related theories that all share a common set of core assumptions.”

A theory guides inquiry and the modeling process. It frames which phenomena we pay attention to, what questions we ask, and how we model:

“Each [model] decomposes a system in a particular way … What questions does your theory address? What parts do you need to include to answer those questions? … Is your model a satisfying representation of your theory?”

That is, a theory is more than just a verbal narrative: it's the background of assumptions that define how one decomposes the phenomena, and from which hypotheses or models are generated. Formal models are instantiations or precise expressions of the theory, and are used as a way to stress test or refine the theory. There is a one-to-many relationship between theories and models; one theory can be expressed with many different models. This is what I take to be the scientific notion of theory, how I see it applied, and how I was trained to apply the term (within the context of economic theory).

Theoretical Virtues

What counts as a "good" theory? How do we compare two theories explaining the same data? Why is simplicity considered desirable? Theoretical virtues are the criteria by which we compare competing theories. In addition to simplicity, there are other common virtues such as elegance (symmetry), explanatory power (unifying phenomena under one framework), fruitfulness (good at generating testable predictions), and coherence (with itself and other theories). Scientists often invoke these when deciding between theories that fit data equally well.

The weight given to each theoretical virtue varies across fields and context. Empirical adequacy is typically non-negotiable. In practice, scientists do appeal to simplicity, elegance, and explanatory depth, even if they don’t always articulate these as “philosophical criteria.” Generally, theoretical scientists (e.g., theoretical physicists, cosmologists, or mathematicians) care more explicitly about theoretical virtues because their work often advances ahead of decisive empirical data. For example, a string theorist might emphasize mathematical beauty and unification, even though direct empirical tests might be lacking. Empiricists, on the other hand, tend to prioritize measurable success and predictive reliability. The line dividing the two is by no means sharp.

We will look at a paper by Michael Keas called "Systematizing the Theoretical Virtues". It provides a fairly comprehensive and structured account of the major theoretical virtues, and how they constitute a "logic of theory choice".

Evidential Virtues

Evidential accuracy: “A theory fits the empirical evidence well (regardless of causal claims).” Does the theory fit the data? This is the baseline virtue: the observable world looks the way the theory says it should. It’s neutral about causes; it’s just “getting the facts right.” Use it when comparing rivals that speak to the same dataset; watch for overfitting (a theory can “fit” because it has too much wiggle room). Evidential accuracy underwrites the other two evidential virtues: typically you assess causal adequacy and depth after you’ve seen solid fit.
Causal adequacy: “T’s causal factors plausibly produce the effects (evidence) in need of explanation.” Does the posited mechanism really have the oomph? Beyond fit, we ask whether the causes would in fact yield the observed effects (often many causes in interaction). Robustness analysis across heterogeneous models can support this by showing the same core causal structure yields the phenomenon across variations. Beware “dormant” causes that are merely named, not shown to operate at the required scale.
Explanatory depth: “Excels in causal history depth or in other depth measures such as the range of counterfactual questions that its law-like generalizations answer.” How far and how flexibly does the explanation reach? Depth comes in two flavors: (i) event-focused “how far back” causal history, and (ii) law-focused counterfactual range (how much would still hold under interventions or changed background conditions). It’s different from unification: depth concerns the same target system under varying conditions, not explaining more kinds of facts. Measure it by the breadth of stable “what-if” answers your laws support.

Coherential Virtues

Internal consistency: “T’s components are not contradictory.” No contradictions inside the theory. A minimal bar: if it derives P and ¬P, something must give. Subtle inconsistencies can hide in idealizations; don’t set the bar so high that all idealized modeling looks “inconsistent,” but don’t excuse genuine clashes as “just idealization,” either. Think formal coherence first, before aesthetic “niceness.”
Internal coherence: “Components are coordinated into an intuitively plausible whole… T lacks ad hoc hypotheses—components merely tacked on to solve isolated problems.” Parts hang together as an intuitively plausible whole (no ad hoc patches). Different from pure logic: a theory can be consistent yet obviously jury-rigged. Red flags: fixes that are untestable, explain nothing else, or sit awkwardly with the core principles. Use “negative” diagnosis (ad hocness) to pressure-test coherence.
Universal coherence: “T sits well with (or is not obviously contrary to) other warranted beliefs.” Fits with the rest of what we’re warranted to believe. This is external fit: harmony with well-established results and background commitments (including conservation principles, etc.). Clash here doesn’t instantly falsify, but it raises costs you must repay with exceptional evidential gains. Distinguish healthy tension (pushes progress) from outright conflict with robust knowledge.

Aesthetic Virtues

Beauty: “Evokes aesthetic pleasure in properly functioning and sufficiently informed persons.” The theory evokes aesthetic pleasure in appropriately situated observers. Beauty shows up as symmetry, aptness, “surprising inevitability,” etc. On Keas’s account, beauty may have extrinsic epistemic value (it can guide us toward other, more tightly connected virtues like simplicity and unification). Use with humility: beauty can inspire, but by itself it doesn’t guarantee truth.
Simplicity: “Explains the same facts as rivals, but with less theoretical content.” Same explananda, less theory. Think fewer entities (parsimony) and/or more concise principles (elegance). Practically, count independent parameters, primitive postulates, or distinct assumptions. Simplicity often correlates with better predictive performance in model selection, but it also interacts with coherence (ad hoc add-ons usually bloat a theory).
Unification: “Explains more kinds of facts than rivals with the same amount of theoretical content.” Same resources, more kinds of facts explained. Unification and simplicity are complementary “styles of informativeness”: simplicity reduces content for the same domain; unification expands domain for the same content. Use it to prefer frameworks that tie disparate phenomena together (Maxwell’s electrodynamics-light, plate tectonics, etc.). Keep distinct the diachronic notion (“consilience” gained over time) from this aesthetic one present at introduction.
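The remark under Simplicity about counting parameters and model selection can be made concrete with a small sketch. It uses the Akaike information criterion (AIC), which is one common model-selection score and my own choice for illustration; Keas's paper does not prescribe it. The two rival "theories" (a constant versus a straight line), the function names, and the data are all made up for the example. Here `k` counts only the fitted mean parameters, which is enough for comparing these two models.

```python
import math


def rss_constant(ys):
    """Residual sum of squares for the one-parameter model y = c."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def rss_linear(xs, ys):
    """Residual sum of squares for the two-parameter model y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def aic(rss, n, k):
    """Gaussian-error AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k


def preferred_model(xs, ys):
    """Name the model with the lower AIC (lower is better)."""
    n = len(xs)
    scores = {
        "constant": aic(rss_constant(ys), n, k=1),
        "linear": aic(rss_linear(xs, ys), n, k=2),
    }
    return min(scores, key=scores.get)


xs = list(range(8))
flat = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 0.9]   # noise around a constant
trend = [0.1, 1.0, 2.1, 2.9, 4.2, 5.0, 5.9, 7.1]  # clear upward trend

if __name__ == "__main__":
    print(preferred_model(xs, flat), preferred_model(xs, trend))
```

On the flat data the line fits slightly better than the constant (it always will, having an extra parameter), but not by enough to pay for that parameter, so the simpler model wins. On the trending data the extra parameter earns its keep. That is the trade-off between evidential accuracy and simplicity in miniature.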

Diachronic Virtues

Durability: “Has survived testing by successful prediction or plausible accommodation of new data.” Survives testing over time (prediction or plausible accommodation). Durability is not mere popularity or longevity: it is time spent under test. Prediction is often the gold standard; in historical sciences, repeated plausible accommodation of novel data also counts. A newborn theory can’t yet be “durable”; this virtue is inherently time-laden.
Fruitfulness: “Over time, generates additional discovery by means such as successful novel prediction, unification, and non ad hoc theoretical elaboration.” Generates further discovery (incl. novel prediction, non-ad hoc elaboration, added unification). If durability is conservation (passing tests), fruitfulness is innovation (creating new testable strands). Novel prediction here is genuinely new—wasn’t “built in” as a target during construction. Fruitfulness and durability interlock in mature research traditions (e.g., gravitational astronomy from Uranus’s anomaly to Neptune).
Applicability: “Used to guide successful action or to enhance technological control… higher when it enables outcomes otherwise not possible.” Guides successful action or control (science → technology, policy). Distinct from experimental control for testing; this is practical leverage (engineering, medicine, forecasting). It’s confirmatory and arrives only after earlier virtues are in place (you can’t apply what you haven’t yet credibly learned), so it is inherently diachronic.

Keas systematizes these virtues into four classes, arranged roughly from greater to lesser immediate epistemic weight. They work together in a patterned way to guide theory choice and maturation across disciplines (with room for field-specific tweaks). He emphasizes that evidential accuracy (fit with data) isn’t isolated; it’s “entangled” with other virtues in practice. The upshot is a flexible, informal “logic of theory choice” scientists can use beyond any single field.

Scientists rarely pick winners by fit alone. They move from fit → mechanism → depth, then ask whether the theory is internally clean and externally harmonious, while letting aesthetics steer the search when data underdetermine choices. Over time, durability, fruitfulness, and applicability settle the score. Keas’s taxonomy codifies a workflow many labs already follow implicitly.

Physicists and other theorists often use simplicity and unification as discovery heuristics. Keas legitimizes this when it is tethered to evidential and diachronic payoffs (e.g., better predictive success). This clarifies why elegant-but-untestable ideas feel attractive yet remain provisional until durability and fruitfulness show up. By separating out the diachronic virtues, Keas shows why some theories only become “best explanations” after decades: testing, refinement, and practical leverage increase their epistemic standing. This perspective helps empiricists justify patience (or skepticism) toward brand-new frameworks.

In short, Keas doesn’t just list virtues; he systematizes how scientists already (often tacitly) weigh them: early emphasis on evidential and coherential virtues, aesthetic guidance when evidence underdetermines, and eventual elevation of theories that prove themselves durable, fruitful, and applicable over time.
