# Conceptual Empiricism Essay Foundation Image Its Realism Science Science Series

## 1. Historical development in philosophy and science from Greek philosophy to Logical Empiricism in America

### 1.1 From Greek thought to Western science

Unity has a history as well as a logic. Different formulations and debates express intellectual and other resources and interests in different contexts. Questions about unity belong partly in a tradition of thought that can be traced back to pre-Socratic Greek cosmology, in particular to the preoccupation with the question of the One and the Many. In what senses are the world and, as a result, our knowledge of it one? A number of representations of the world in terms of a few simple constituents that were considered fundamental emerged: Parmenides’ static substance, Heraclitus’ flux of becoming, Empedocles’ four elements, Democritus’ atoms, or Pythagoras’ numbers, Plato’s forms, and Aristotle’s categories. The underlying question of the unity of our types of knowledge was explicitly addressed by Plato in the Sophist as follows: “Knowledge also is surely one, but each part of it that commands a certain field is marked off and given a special name proper to itself. Hence language recognizes many arts and many forms of knowledge” (Sophist, 257c). Aristotle asserted in On the Heavens that knowledge concerns what is primary, and different “sciences” know different kinds of causes; it is metaphysics that comes to provide knowledge of the underlying kind.

With the advent and expansion of Christian monotheism, the organization of knowledge reflected the idea of a world governed by the laws dictated by God, its creator and legislator. From this tradition emerged encyclopedic efforts such as the Etymologies, compiled in the sixth century by the Andalusian Isidore, Bishop of Seville, the works of the Catalan Ramon Llull in the Middle Ages and those of the Frenchman Petrus Ramus in the Renaissance. Llull introduced iconic tree-diagrams and forest-encyclopedias representing the organization of different disciplines including law, medicine, theology and logic. He also introduced more abstract diagrams—not unlike some found in Cabbalistic and esoteric traditions—in an attempt to combinatorially encode the knowledge of God’s creation in a universal language of basic symbols. Their combination would be expected to generate knowledge of the secrets of creation and help articulate knowledge of universal order (mathesis universalis), which would, in turn, facilitate communication with different cultures and their conversion to Christianity. Ramus introduced diagrams representing dichotomies and gave prominence to the view that the starting point of all philosophy is the classification of the arts and sciences. The encyclopedia organization of knowledge served the project of its preservation and communication.

The emergence of a distinctive tradition of scientific thought addressed the question of unity through the designation of a privileged method, which involved a privileged language and set of concepts. Formally, at least, it was modeled after the Euclidean ideal of a system of geometry. In the late 16th century, Francis Bacon held that one unity of the sciences was the result of our organization of records of discovered material facts in the form of a pyramid with different levels of generalities. These could be classified in turn according to disciplines linked to human faculties. Concomitantly, the controlled interaction with phenomena of study characterized so-called experimental philosophy. In accordance with at least three traditions (the Pythagorean tradition, the Bible’s dictum in the Book of Wisdom and the Italian commercial tradition of bookkeeping), Galileo proclaimed at the turn of the 17th century that the Book of Nature had been written by God in the language of mathematical symbols and geometrical truths; and that in it, the story of Nature’s laws was told in terms of a reduced set of objective, quantitative primary qualities: extension, quantity of matter and motion. A persisting rhetorical role for some form of theological unity of creation should not be neglected when considering pre-20th-century attempts to account for the possibility and desirability of some form of scientific knowledge. Throughout the 17th century, mechanical philosophy and Descartes’ and Newton’s systematization from basic concepts and first laws of mechanics became the most promising framework for the unification of natural philosophy. After the demise of Laplacian molecular physics in the first half of the 19th century, this role was taken over by ether mechanics and, unifying forces and matter, energy physics.

### 1.2 Rationalism and Enlightenment

Descartes and Leibniz gave this tradition a rationalist twist that was centered on the powers of human reason and the ideal of a system of knowledge resting on a foundation of rational principles. It became the project of a universal framework of exact categories and ideas, a mathesis universalis (Garber 1992 and Gaukroger 2002). Adapting the scholastic image of knowledge, Descartes proposed an image of a tree in which metaphysics is depicted by the roots, physics by the trunk, and the branches depict mechanics, medicine and morals. Leibniz proposed a general science in the form of a demonstrative encyclopedia. This would be based on a “catalogue of simple thoughts” and an algebraic language of symbols, characteristica universalis, which would render all knowledge demonstrative and allow disputes to be resolved by precise calculation. Both defended the program of founding much of physics on metaphysics and ideas from life science (Smith 2011) (Leibniz’s unifying ambitions with symbolic language and physics extended beyond science, to settle religious and political fractures in Europe). By contrast, while sharing a model of geometric axiomatic structure of knowledge, Newton’s project of natural philosophy was meant to be autonomous from a system of philosophy and, in the new context, still endorsed for its model of organization and its empirical reasoning values of formal synthesis and ontological simplicity (see the entry on Newton and Janiak 2008).

Belief in the unity of science or knowledge, along with the universality of rationality, was at its strongest during the European Enlightenment. The most important expression of the encyclopedic tradition came in the mid-eighteenth century from Diderot and D’Alembert, editors of the Encyclopédie, ou dictionnaire raisonné des sciences, des arts et des métiers (1751–1772). Following earlier classifications by Nichols and Bacon, their diagram presenting the classification of intellectual disciplines was organized in terms of a classification of human faculties. Diderot stressed in his own entry, “Encyclopaedia”, that the word signifies the unification of the sciences. The function of the encyclopedia was to exhibit the unity of human knowledge. Diderot and D’Alembert, in contrast with Leibniz, made classification by subject the primary focus, and introduced cross-references instead of logical connections. The Enlightenment tradition in Germany culminated in Kant’s critical philosophy.

### 1.3 German tradition since Kant

Kant saw it as one of the functions of philosophy to determine the precise unifying scope and value of each science. For Kant, the unity of science is not the reflection of a unity found in nature, or, even less, assumed in a real world behind the apparent phenomena. Rather, it has its foundations in the unifying a priori character or function of concepts, principles and of Reason itself. Nature is precisely our experience of the world under the universal laws that include some such concepts. And science, as a system of knowledge, is “a whole of cognition ordered according to principles”, and the principles on which proper science is grounded are a priori (Preface to Metaphysical Foundations of Natural Science). A devoted but not exclusive follower of Newton’s achievements and insights, he maintained through most of his life that mathematization and a priori universal laws given by the understanding were preconditions for genuine scientific character (like Galileo and Descartes earlier, and Carnap later, Kant believed that mathematical exactness constituted the main condition for the possibility of objectivity). Here Kant emphasized the role of mathematics coordinating a priori cognition and its determined objects of experience. Thus, he contrasted the methods employed by the chemist, a “systematic art” organized by empirical regularities, with those employed by the mathematician or physicist, which were organized by a priori laws, and held that biology is not reducible to mechanics, as the former involves explanations in terms of final causes (see Critique of Pure Reason, Critique of Judgment and Metaphysical Foundations of Natural Science). With regard to biology, insufficiently grounded in the fundamental forces of matter, its inclusion requires the introduction of the idea of purposiveness (McLaughlin 1991). More generally, for Kant unity was a regulative principle of reason, namely, an ideal guiding the process of inquiry toward a complete empirical science with its empirical concepts and principles grounded in the so-called concepts and principles of the understanding that constitute and objectify empirical phenomena (on systematicity as a distinctive aspect of this ideal and on its origin in reason, see Kitcher 1986 and Hoyningen-Huene 2013).

Kant’s ideas set the frame of reference for discussions of the unification of the sciences in German thought throughout the nineteenth century (Wood and Hahn 2011). He gave philosophical currency to the notion of worldview (Weltanschauung) and, indirectly, world-picture (Weltbild), establishing among philosophers and scientists the notion of unity of science as an intellectual ideal. From Kant, German-speaking Philosophers of Nature adopted the image of Nature in terms of interacting forces or powers and developed it in different ways; this image found its way to British natural philosophy. In Great Britain this idealist, unifying spirit (and other notions of an idealist and romantic turn) was articulated in William Whewell’s philosophy of science. Two unifying dimensions are these: his notion of mind-constructed fundamental ideas, which form the basis for organizing axioms and phenomena and classifying sciences, and the argument for the reality of explanatory causes in the form of consilience of induction, wherein a single cause is independently arrived at as the hypothesis explaining different kinds of phenomena.

In the face of expanding research, the unifying emphasis on organization, classification and foundation led to exploring differences and rationalizing boundaries. The German intellectual current culminated in the late nineteenth century in the debates among philosophers such as Windelband, Rickert and Dilthey. In their views and those of similar thinkers, a worldview often included elements of evaluation and life meaning. Kant had established the basis for the famous distinction between the natural sciences (Naturwissenschaften) and the cultural, or social, sciences (Geisteswissenschaften) popularized in theory of science by Wilhelm Dilthey and Wilhelm Windelband. Dilthey, Windelband, his student Heinrich Rickert and Max Weber (although the first two preferred Kulturwissenschaften, which excluded psychology) debated over how differences in subject matter between the two kinds of sciences forced a distinctive difference between their respective methods. Their preoccupation with the historical dimension of the human phenomena, along with the Kantian emphasis on the conceptual basis of knowledge, led to the suggestion that the natural sciences aimed at generalizations about abstract types and properties, whereas the human sciences studied concrete individuals and complexes. The human case suggested a different approach based on valuation and personal understanding (Weber’s verstehen). For Rickert, individualized concept formation secured knowledge of historical individuals by establishing connections to recognized values (rather than personal valuations). In biology, Ernst Haeckel defended a monistic worldview (Richards 2008).

This approach stood in opposition to the prevailing empiricist views that, since the time of Hume, Comte and Mill, held that the moral or social sciences (even philosophy) relied on conceptual and methodological analogies with geometry and the natural sciences, not just astronomy and mechanics but also biology. In the Baconian tradition, Comte emphasized a pyramidal hierarchy of disciplines in his “encyclopedic law” or order, from the most general sciences about the simplest phenomena to the most specific sciences about the most complex phenomena, each depending on knowledge from its more general antecedent: from inorganic physical sciences (arithmetic, geometry, mechanics, astronomy, physics and chemistry) to the organic physical ones, such as biology and the new “social physics”, soon to be renamed sociology (Comte 1830–1842). Mill, instead, pointed to the diversity of methodologies for generating, organizing and justifying the knowledge associated with different sciences, natural and human, and the challenges of imposing a single standard (Mill 1843, Book VI). He came to view political economy eventually as an art, a tool for reform more than a system of knowledge (Snyder 2006).

The Weltbild tradition influenced the physicists Max Planck and Ernst Mach, who engaged in a heated debate about the precise character of the unified scientific world-picture. Mach’s more influential view was both phenomenological and Darwinian: the unification of knowledge took the form of an analysis of ideas into biologically embodied elementary sensations (neutral monism) and was ultimately a matter of adaptive economy of thought. Planck adopted a realist view that took science to gradually approach complete truth about the world, and fundamentally adopted the thermodynamical principles of energy and entropy (on the Mach-Planck debate see Toulmin 1970). These world-pictures constituted some of the alternatives to a long-standing mechanistic view that, since the rise of mechanistic philosophy with Descartes and Newton, had informed biology as well as most branches of physics. In the background was the perceived conflict between the so-called mechanical and electromagnetic worldviews, which resulted throughout the first two decades of the twentieth century in the work of Albert Einstein (Holton 1998).

In the same German tradition and amidst the proliferation of work on energy physics and books on unity of science, the German energeticist Wilhelm Ostwald declared the 20th century the “Monistic century”. During the 1904 World’s Fair in St. Louis, the German psychologist and Harvard professor Hugo Münsterberg organized a Congress under the title “Unity of Knowledge”; invited speakers were Ostwald, Ludwig Boltzmann, Ernest Rutherford, Edward Leamington Nichols, Paul Langevin and Henri Poincaré. In 1911 the International Committee of Monism held its first meeting in Hamburg, with Ostwald presiding.[1] Two years later it published Ostwald’s monograph, Monism as the Goal of Civilization. In 1912, Mach, Felix Klein, David Hilbert, Einstein, and others signed a manifesto aiming at the development of a comprehensive world-view. Unification remained a driving scientific ideal. In the same spirit, Mathieu Leclerc du Sablon published his L’Unité de la Science (1919), exploring metaphysical foundations, and Johan Hjort published The Unity of Science (1921), sketching out a history of philosophical systems and unifying scientific hypotheses.

### 1.4 Unity and reductionism in logical empiricism

The question of unity engaged science and philosophy alike. In the 20th century the unity of science became a distinctive theme of the scientific philosophy of logical empiricism. Logical empiricists—known controversially also as logical positivists—and most notably the founding members of the Vienna Circle in their Manifesto adopted the Machian banner of “unity of science without metaphysics”, a normative criterion of unity with a role in social reform based on the demarcation between science and metaphysics: the unity of method and language that included all the sciences, natural and social. A common method did not necessarily imply a more substantive unity of content involving theories and their concepts.

A stronger reductive model within the Vienna Circle was recommended by Rudolf Carnap in his The Logical Construction of the World (1928). While embracing the Kantian connotation of the term “constitutive system”, it was inspired by recent formal standards: Hilbert’s axiomatic approach to formulating theories in the exact sciences and Frege’s and Russell’s logical constructions in mathematics. It was also predicated on the formal values of simplicity, rationality, (philosophical) neutrality and objectivity associated with scientific knowledge. In particular, Carnap tried to explicate such notions in terms of a rational reconstruction of science in terms of a method and a structure based on logical constructions out of (1) basic concepts in axiomatic structures and (2) rigorous, reductive logical connections between concepts at different levels.

Different constitutive systems or logical constructions would serve different (normative) purposes: a theory of science and a theory of knowledge. Both foundations raised the issue of the nature and universality of a physicalist language.

One such system of unified science is the theory of science, in which the construction connects concepts and laws of the different sciences at different levels, with physics (with its genuine laws) as fundamental, lying at the base of the hierarchy. Because of the emphasis on the formal and structural properties of our representations, objectivity, rationality and unity go hand in hand. Carnap’s formal emphasis developed further in Logical Syntax of Language (1934). Alternatively, all scientific concepts could be constituted or constructed in a different system in the protocol language out of classes of elementary complexes of experiences, scientifically understood, representing experiential concepts. Carnap subsequently defended the epistemological and methodological universality of physicalist language and physicalist statements. Unity of science in this context was an epistemological project (for a survey of the epistemological debates, see Uebel 2007; on different strands of the anti-metaphysical normative project of unity see Frost-Arnold 2005).

Whereas Carnap aimed at rational reconstructions, another member of the Vienna Circle, Otto Neurath, favored a more naturalistic and pragmatic approach, with a less idealized and reductive model of unity. His evolving standards of unity were generally motivated by the complexity of empirical reality and the application of empirical knowledge to practical goals. He spoke of an “encyclopedia-model”, opposed to the classic ideal of a pyramidal, reductive “system-model”. The encyclopedia-model took into account the presence within science of uneliminable and imprecise terms from ordinary language and the social sciences and emphasized a unity of language and the local exchanges of scientific tools. Specifically, Neurath stressed the material-thing-language called “physicalism”, not to be confounded with the emphasis on the vocabulary of physics. Its motivation was partly epistemological and Neurath endorsed anti-foundationalism: no unified science, like a boat at sea, would rest on firm foundations. The scientific spirit abhorred dogmatism. This weaker model of unity emphasized empiricism and the normative unity of the natural and the human sciences.

Like Carnap’s unified reconstructions, Neurath’s had pragmatic motivations. Unity without reductionism provided a tool for cooperation and it was motivated by the need for successful treatment—prediction and control—of complex phenomena in the real world that involved properties studied by different theories or sciences (from real forest fires to social policy): unity of science at the point of action (Cat, Cartwright and Chang 1996). It is an argument from holism, the counterpart of Duhem’s claim that only clusters of hypotheses are confronted with experience. Neurath spoke of a “boat”, a “mosaic”, an “orchestration”, and a “universal jargon”. Following institutions such as the International Committee on Monism and the International Council of Scientific Unions, Neurath spearheaded a movement for Unity of Science in 1934 that encouraged international cooperation among scientists and launched the project of an International Encyclopedia of Unity of Science. It expressed the internationalism of his socialist convictions and the international crisis that would lead to the Second World War (Kamminga and Somsen 2016).

At the end of the Eighth International Congress of Philosophy, held in Prague in September of 1934, Neurath proposed a series of International Congresses for the Unity of Science. These took place in Paris, 1935; Copenhagen, 1936; Paris, 1937; Cambridge, England, 1938; Cambridge, Massachusetts, 1939 and Chicago, 1941. For the organization of the congresses and related activities, Neurath founded the Unity of Science Institute in 1936, which was renamed in 1937 as the International Institute for the Unity of Science, alongside the International Foundation for Visual Education, founded in 1933. The Institute’s executive committee was composed of Neurath, Philipp Frank and Charles Morris.

After the Second World War, a discussion of unity engaged philosophers and scientists in the Inter-Scientific Discussion Group in Cambridge, Massachusetts. Initially called the Science of Science Discussion Group, it was founded in October 1940, primarily by Philipp Frank and Carnap (themselves founders of the Vienna Circle), Quine, Feigl, Bridgman, and the psychologists E. Boring and S.S. Stevens, and would later become the Unity of Science Institute (USI). The group was joined by scientists from different disciplines, from quantum mechanics (Kemble and Van Vleck) and cybernetics (Wiener) to economics (Morgenstern), as part of what was both a self-conscious extension of the Vienna Circle and a reflection of local concerns within a technological culture increasingly dominated by the interest in computers and nuclear power. The characteristic features of the new view of unity were the ideas of consensus and, subsequently, especially within the USI, cross-fertilization. These ideas were instantiated in the emphasis on scientific operations (operationalism) and the creation of war-boosted cross-disciplines such as cybernetics, computation, electro-acoustics, psycho-acoustics, neutronics, game theory, and biophysics (Galison 1998 and Hardcastle 2003).

In the late 1960s, Michael Polanyi and Marjorie Grene organized a series of conferences funded by the Ford Foundation on unity of science themes (Grene 1969a, 1969b, 1971). Their general character was interdisciplinary and anti-reductionist. The group was originally called “Study Group on Foundations of Cultural Unity,” but this was later changed to “Study Group on the Unity of Knowledge.” By then, a number of American and international institutions were already promoting interdisciplinary projects in academic areas (Klein 1990). For both Neurath and Polanyi the organization of knowledge and science, the Republic of Science, was inseparable from ideals of political organization.

## 2. Varieties of Unity

The historical introductory sections have aimed to show the intellectual centrality, varying formulations, and significance of the concept of unity. The rest of the entry presents a variety of modern themes and views. It will be helpful to introduce a number of broad categories and distinctions that can sort out different kinds of accounts and track some relations between them as well as additional significant philosophical issues. (The categories are not mutually exclusive, and they sometimes partly overlap; therefore, while they help label and characterize different positions, they cannot provide a simple, easy and neatly ordered conceptual map.)

Connective unity is a weaker notion than the specific ideal of reductive unity; the latter requires asymmetric relations of reduction, with assumptions about hierarchies of levels of description and the primacy (conceptual, ontological, epistemological, and so on) of a fundamental representation. The category of connective unity helps accommodate and bring attention to the diversity of non-reductive accounts.

Another useful distinction is between synchronic and diachronic unity. Synchronic accounts are ahistorical, assuming no meaningful temporal relations. Diachronic accounts, by contrast, introduce genealogical hypotheses involving asymmetric temporal and causal relations between entities or states of the systems described. Evolutionary models are of this kind; they may be reductive to the extent that the posited original entities are simpler and on a lower level of organization and size. Others simply emphasize connection without overall directionality.

In general, it is useful to distinguish between ontological unity and epistemological unity, even if many accounts bear both characteristics and fall under both rubrics. In some cases, one kind supports the other salient kind in the model. Ontological unity is here broadly understood as involving relations between descriptive conceptual elements; in some cases the concepts will describe entities, facts, properties or relations, and descriptive models will focus on metaphysical aspects of the unifying connections such as holism, emergence, or downwards causation. Epistemological unity applies to epistemic relations or goals such as explanation. Methodological connections and formal (logical, mathematical, etc.) models may belong in this kind. I will not draw any strict or explicit distinction between epistemological and methodological dimensions or modes of unity.

Additional categories and distinctions include the following: vertical unity or inter-level unity is unity of elements attached to levels of analysis, composition or organization on a hierarchy, whether for a single science or more, whereas horizontal unity or intra-level unity applies to one single level and to its corresponding kind of system (Wimsatt 2007). Global unity is unity of any other variety with a universal quantifier of all kinds of elements, aspects or descriptions associated with individual sciences as a kind of monism, for instance, taxonomical monism about natural kinds, while local unity applies to a subset (Cartwright has distinguished this same-level global form of reduction, or "imperialism", in Cartwright 1999; see also Mitchell 2003). Obviously, vertical and horizontal accounts of unity can be either global or local. Finally, the rejection of global unity has been associated with isolationism, keeping independent competing alternative representations of the same phenomena or systems, as well as local integration, the local connective unity of the alternative perspectives. A distinction of methodological nature contrasts internal and external perspectives, according to whether the accounts are based naturalistically, on the local contingent practices of certain scientific communities at a given time, or based on universal metaphysical assumptions broadly motivated (Ruphy 2017). (Ruphy has criticized Cartwright and Dupré for having adopted external metaphysical positions and defended the internal perspective, also present in the program of the so-called Minnesota School, i.e., Kellert et al. 2006.)

## 3. Epistemological Unities

### 3.1 Reduction

Philosophy of science became professionally consolidated in the 1950s around a positivist orthodoxy that may be characterized by the following set of commitments: a syntactic formal approach to theories, logical deductions and axiomatic systems, a distinction between theoretical and observational vocabularies, and empirical generalizations. Unity and especially reduction have been understood in those terms; specific elements of the dominating accounts would stand and fall with the attitudes towards the elements of the orthodoxy mentioned above. First, a reminder: Reductionism must be distinguished from reduction. Reductionism is the adoption of reduction as the global ideal of a unified structure of scientific knowledge and a measure of its progress towards that ideal. As before, I will consider methodological aspects of unity as an extension of epistemological matters, insofar as methodology serves epistemology.

Two formulations of unification in the logical positivist tradition of the ideal logical structure of science placed the question of unity at the core of philosophy of science: Carl Hempel’s deductive-nomological model of explanation and Ernest Nagel’s model of reduction. Both are fundamentally epistemological models, and both are specifically explanatory, at least in the sense that explanation serves unification. The emphasis on language and logical structure makes explanatory reduction a form of unity of the synchronic kind. Still, Nagel’s model of reduction is a model of scientific structure and explanation as well as of scientific progress. It is based on the problem of relating different theories as different sets of theoretical predicates.

Reduction requires two conditions: connectability and derivability. Connectability of laws of different theories requires meaning invariance in the form of extensional equivalence between descriptions, with bridge principles between coextensive but distinct terms in different theories.

Nagel’s account distinguishes two kinds of reductions: homogeneous and heterogeneous. When both sets of terms overlap, the reduction is homogeneous. When the related terms are different, the reduction is heterogeneous. Derivability requires a deductive relation between the laws involved. In the quantitative sciences, the derivation often involves taking a limit. In this sense the reduced science is considered an approximation to the new, reducing one.
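
The two conditions can be put schematically. What follows is only an illustrative sketch, not Nagel’s own notation: $$T_1$$ stands for the reducing theory, $$T_2$$ for the reduced theory, and $$B$$ for the set of bridge principles supplied by connectability (all three symbols are labels assumed here for illustration). Derivability then says that the laws of the reduced theory follow deductively from the reducing theory together with the bridge principles:

$$
T_1 \cup B \vdash T_2
$$

On this reading, a homogeneous reduction is the case in which the shared vocabulary leaves little or no work for $$B$$. A standard textbook illustration of derivation by a limit, offered here as an example rather than drawn from the accounts discussed, is the recovery of Newtonian momentum from relativistic momentum at velocities $$v$$ small compared with the speed of light $$c$$:

$$
p = \frac{m v}{\sqrt{1 - v^2/c^2}} \;\longrightarrow\; m v \quad \text{as } v/c \to 0
$$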

Neo-Nagelian accounts have attempted to solve Nagel’s problem of reduction between putatively incompatible theories. Here are a few:

Nagel’s two-term relation account has been modified by weaker conditions of analogy and a role for conventions, requiring it to be satisfied not necessarily by the two original theories, $$T_1$$ and $$T_2$$, which are respectively new and old and more and less general, but by the modified theories $$T'_1$$ and $$T'_2$$. Explanatory reduction is strictly a four-term relation in which $$T'_1$$ is “strongly analogous” to $$T_1$$ and corrects, with the insight that the more fundamental theory can offer, the older theory, $$T_2$$, changing it to $$T'_2$$. Nagel’s account also requires that bridge laws be synthetic identities, in the sense that they be factual, empirically discoverable and testable; in weaker accounts, admissible bridge laws may include elements of convention (Schaffner 1967; Sarkar 1998). The difficulty lay especially with the task of specifying or giving a non-contextual, transitive account of the relations between $$T$$ and $$T'$$ (Wimsatt 1976).
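
Read schematically, and only as a sketch that reuses this paragraph’s labels, the modified account replaces a direct derivation between the original theories with a derivation between their corrected counterparts. The symbol $$B$$ (a set of bridge laws, possibly including conventional elements) and the relation $$\approx$$ for “strongly analogous” are notational assumptions introduced here for illustration:

$$
T'_1 \approx T_1, \qquad T'_1 \cup B \vdash T'_2, \qquad T'_2 \text{ a corrected version of } T_2
$$

The difficulty attributed to Wimsatt concerns the first and third terms: stating, independently of context, when a primed theory counts as close enough to its original.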

An alternative set of semantic and syntactic conditions of reduction bears counterfactual interpretations. For instance, syntactic conditions in the form of limit relations and ceteris paribus assumptions help explain why the reduced theory works where it does and fails where it does not (Glymour 1969).

A different approach to reductionism acknowledges a commitment to providing explanation but rejects the value of a focus on the role of laws. This approach typically draws a distinction between hard sciences such as physics and chemistry and special sciences such as biology and the social sciences. It claims that laws that are in a sense operative in the hard sciences are not available in the special ones, or play a more limited and weaker role, and this on account of historical character, complexity or reduced scope. The rejection of empirical laws in biology, for instance, has been argued on grounds of historical dependence on contingent initial conditions (Beatty 1995), and as a matter of supervenience (see the entry on supervenience) of spatio-temporally restricted functional claims on lower level molecular ones, and the multiple realization (see the entry on multiple realizability) of the former by the latter (Rosenberg 1994; Rosenberg’s argument from supervenience to reduction without laws must be contrasted with Fodor’s physicalism about the special sciences about laws without reduction (see below and the entry on physicalism); for a criticism of these views see Sober 1996). This non-Nagelian approach assumes further that explanation rests on identities between predicates and deductive derivations (reduction and explanation might be said to be justified by derivations, but not constituted by them; see Spector 1978). Explanation is provided by lower-level mechanisms; their explanatory role is to replace final why-necessarily questions (functional) with proximate how-possibly questions (molecular).

One suggestion to make sense of the possibility of the supervening functional explanations without Nagelian reduction is a metaphysical picture of composition of powers in explanatory mechanisms (Gillette 2010). The reductive commitment to the lower level is based on relations of composition, at play in epistemological analysis and metaphysical synthesis, but is not merely formal and derivational. We infer what composes the higher level but we cannot simply get all the relevant knowledge of the higher level from our knowledge of the lower level (see also Auyang 1998).

A more general characterization views reductionism as a research strategy. On this methodological view reductionism can be characterized by a set of so-called heuristics (non-algorithmic, efficient, error-based, purpose-oriented, problem-solving tasks) (Wimsatt 2006): heuristics of conceptualization (e.g., descriptive localization of properties, system-environment interface determinism, level and entity-dependence), heuristics of model-building and theory construction (e.g., model intra-systemic localization with emphasis of structural properties over functional ones, contextual simplification and external generalization) and heuristics of observation and experimental design (e.g., focused observation, environmental control, local scope of testing, abstract shared properties, behavioral regularity and context-independence of results).

### 3.2 Antireductionism

The focus had been since the 1930s on a syntactic approach, with physics as the paradigm of science, deductive logical relations as the form of cognitive or epistemic goals such as explanation and prediction, and theory and empirical laws as paradigmatic units of scientific knowledge (Suppe 1977; Grünbaum and Salmon 1988). The historicist turn in the 1960s, the semantic turn in philosophy of science in the 1970s and a renewed interest in special sciences have changed this focus. The very structure of hierarchy of levels has lost its credibility, even for those who believe in it as a model of autonomy of levels rather than as an image of fundamentalism. The rejection of such models and their emendations have occupied the last four decades of philosophical discussion about unity in and of the sciences (especially in connection to psychology and biology, and more recently chemistry). A valuable consequence has been the strengthening of philosophical projects and communities devoting more sustained and sophisticated attention to special sciences, different from physics.

The first target of antireductionist attacks has been Nagel’s demand of extensional equivalence. It has been dismissed as an inadequate demand of “meaning invariance” and approximation, and with it the possibility of deductive connections. Mocking the positivist legacy of progress through unity, empiricism and anti-dogmatism, these constraints have been decried as intellectually dogmatic, conceptually weak and methodologically overly restrictive (Feyerabend 1962). The emphasis is placed, instead, on the merits of the new theses of incommensurability and methodological pluralism.

A similar criticism of reduction involves a different move: that the deductive connection be guaranteed provided that the old, reduced theory was “corrected” beforehand (Schaffner 1967). The evolution and the structure of scientific knowledge could be neatly captured, using Schaffner’s expression, by “layer-cake reduction.” The terms “length” and “mass” (or the symbols $$l$$ and $$m$$), for instance, may be the same in Newtonian and Relativistic mechanics, or the term “electron” the same in classical physics and quantum mechanics, or the term “atom” the same in quantum mechanics and in chemistry, or “gene” in Mendelian genetics and molecular genetics (see, for instance, Kitcher 1984). But the corresponding concepts, they argued, are not. Concepts or words are to be understood as getting their content or meaning within a holistic or organic structure, even if the organized wholes are the theories that include them. From this point of view, different wholes, whether theories or Kuhnian paradigms, manifest degrees of conceptual incommensurability. As a result, the derived, reducing theories typically are not the allegedly reduced, older ones; and their derivation sheds no relevant insight into the relation between the original, older one and the new (Feyerabend 1962; Sklar 1967).

From a historical standpoint, the positivist model collapsed the distinction between synchronic and diachronic reduction, that is, between reductive models of the structure and the evolution, or succession, of scientific theories. By contrast, historicism, as embraced by Kuhn and Feyerabend, drove a wedge between the two dimensions and rejected the linear model of scientific change in terms of accumulation and replacement. For Kuhn, replacement becomes partly continuous, partly non-cumulative change in which one world—or, less literally, one world-picture, one paradigm—replaces another (after a revolutionary episode of crisis and proliferation of alternative contenders) (Kuhn 1962). This image constitutes a form of pluralism, and, like the reductionism it is meant to replace, it can be either synchronic or diachronic. Here is where Kuhn and Feyerabend parted ways. For Kuhn synchronic pluralism only describes the situation of crisis and revolution between paradigms. For Feyerabend history is less monistic, and pluralism is and should remain a synchronic and diachronic feature of science and culture (Feyerabend, here, thought science and society inseparable, and followed Mill’s philosophy of liberal individualism and democracy).

A different kind of antireductionism addresses a more conceptual dimension, the problem of categorial reduction: Meta-theoretical categories of description and interpretation for mathematical formalisms, e.g., criteria of causality, may block full reduction. Basic interpretative concepts that are not just variables in a theory or model are not reducible to counterparts in fundamental descriptions (Cat 2000 and 2006; the case of individuality in quantum physics has been discussed in Healey 1991; Redhead and Teller 1991 and Auyang 1995; in psychology in Block 2003).

### 3.3 Epistemic roles: from demarcation to explanation and evidence. Varieties of connective unity. Aesthetic value

Unity has been considered an epistemic virtue, with different modes of unification associated with roles such as demarcation, explanation and evidence.

Demarcation. Certain models of unity, which we may call container models, attempt to demarcate science from non-science. The criteria adopted are typically methodological and normative, not descriptive. Unlike connective models, they serve a dual function of drawing up and policing a boundary that (1) encloses and endorses the sciences and (2) excludes other practices. As noted above, some demarcation projects have aimed to distinguish between natural and special sciences. The more notorious ones, however, have aimed to exclude practices and doctrines dismissed under the labels of metaphysics, pseudo-science or popular knowledge. Empirical or not, the applications of standards of epistemic purity are not merely identification or labeling exercises for the sake of carving out scientific inquiry as a natural kind or mapping out intellectual landscapes. The purpose is to establish authority and the stakes involve educational, legal and financial interests. Recent controversies include not just the teaching of creation science, but also polemics over the scientific status of, for instance, homeopathy, vaccination and models of plant neurology and climate change.

The most influential demarcation criterion has been Popper’s original anti-metaphysics barrier: the condition of empirical falsifiability of scientific statements. It required the logically possible relation to basic statements, linked to experience, that can prove general hypotheses to be false with certainty. For this purpose he defended the application of a particular deductive argument, the modus tollens (Popper 1935/1951). Another demarcation criterion is explanatory unity, empirically grounded. Hempel’s deductive-nomological model characterizes the scientific explanation of events as a logical argument that expresses their expectability in terms of their subsumption under an empirically testable generalization. Explanations in the historical sciences too must fit the model if they are to count as scientific. They could then be brought into the fold as bona fide scientific explanations even if they could qualify only as explanation sketches.

Since their introduction, Hempel’s model and its weaker versions have been challenged as neither generally applicable nor appropriate. The demarcation criterion of unity is undermined by criteria of demarcation between natural and historical sciences. For instance, historical explanations have a genealogical or narrative form, or else they require the historian’s engaging problems or issuing a conceptual judgment that brings together meaningfully a set of historical facts (recent versions of such decades-old arguments are in Cleland 2002, Koster 2009, Wise 2011). According to more radical views, natural sciences such as geology and biology are historical in their contextual, causal and narrative forms; moreover, Hempel’s model, especially the requirement of empirically testable strict universal laws, is satisfied by neither the physical sciences nor the historical sciences, including archeology and biology (Ereshefsky 1992).

A number of legal decisions have appealed to Popper’s and Hempel’s criteria, adding the epistemic role of peer review, publication and consensus around the sound application of methodological standards. A more recent criterion has sought a different kind of demarcation: it is comparative rather than absolute; it aims to compare science and popular science; it adopts a broader notion of science in the German tradition of Wissenschaften, that is, roughly of scholarly fields of research that include formal sciences, natural sciences, human sciences and the humanities; and it emphasizes the role of systematicity, with an emphasis on different forms of epistemic connectedness as weak forms of coherence and order (Hoyningen-Huene 2013).

Explanation. Unity has been defended in the wake of authors such as Kant and Whewell as an epistemic criterion of explanation or at least fulfilling an explanatory role. In other words, rather than modeling unification in terms of explanation, explanation is modeled in terms of unification. A number of proposals introduce an explanatory measure in terms of the number of independent explanatory laws or phenomena conjoined in a theoretical structure. On this representation, unity contributes understanding and confirmation from the fewest basic kinds of phenomena, regardless of explanatory power in terms of derivation or argument patterns (Friedman 1974; Kitcher 1981; Kitcher 1989; Wayne 1996; within a probabilistic framework, Myrvold 2003, Sober 2003 and Roche and Sober 2017; see below).

A weaker position argues that unification is not explanation on the grounds that unification is simply systematization of old beliefs and operates as a criterion of theory-choice (Halonen and Hintikka 1999).

The unification account of explanation has been defended within a more detailed cognitive and pragmatist approach. The key is to think of explanations as question-answer episodes involving four elements: the explanation-seeking question about $$P$$, $$P?$$, the cognitive state $$C$$ of the questioner/agent for whom $$P$$ calls for explanation, the answer $$A$$, and the cognitive state $$C+A$$ in which the need for explanation of $$P$$ has disappeared. A related account models unity in the cognitive state in terms of the comparative increase of coherence and elimination of spurious unity, such as circularity or redundancy (Schurz 1999). Unification is also based on information-theoretic transfer or inference relations. Unification of hypotheses is only a virtue if it unifies data. The last two conditions imply that unification yields also empirical confirmation. Explanations are global increases in unification in the cognitive state of the cognitive agent (Schurz 1999; Schurz and Lambert 1994).

The unification-explanation link can be defended on the grounds that laws make unifying similarity expectable (hence Hempel-explanatory) and this similarity becomes the content of a new belief (Weber and Van Dyck 2002 contra Halonen and Hintikka 1999). Unification is not the mere systematization of old beliefs. Contra Schurz they argue that scientific explanation is provided by novel understanding of facts and the satisfaction of our curiosity (Weber and Van Dyck 2002 contra Schurz 1999). In this sense, causal explanations, for instance, are genuinely explanatory and do not require an increase of unification.

A contextualist and pluralist account argues that understanding is a legitimate aim of science that is pragmatic and not necessarily formal, or a subjective psychological by-product of explanation (De Regt and Dieks 2005). In this view explanatory understanding is variable and can have diverse forms, such as causal-mechanical and unification, without conflict (De Regt and Dieks 2005). In the same spirit, Salmon linked unification to the epistemic virtue or goal of explanation and distinguished between unification and causal-mechanical explanation as forms of scientific explanatory understanding (Salmon 1998).

The views on scientific explanation have evolved away from the formal and cognitive accounts of the epistemic categories. Accordingly, the source of understanding provided by scientific explanations has been misidentified according to some (Barnes 1992). The genuine source for important, but not all, cases lies in causal explanation, or causal mechanism (Cartwright 1983; Cartwright 1989; see also Glennan 1996, Cat 2005 and Craver 2007). Mechanistic models of explanation have become entrenched in philosophical accounts of the life sciences (Darden 2006, Craver 2007). As an epistemic virtue, the role of unification has been traced to the causal form of the explanation, for instance, in statistical regularities (Schurz 2015). The challenge extends to the alleged extensional link between explanation on the one hand, and truth and universality on the other (Cartwright 1983, Dupré 1993, Woodward 2003). In this sense, explanatory unity, which rests on metaphysical assumptions about components and their properties, also involves a form of ontological or metaphysical unity (for a methodological criticism of external, metaphysical perspectives, see Ruphy 2016).

Similar criticisms extend to the traditionally formalist arguments in physics about fundamental levels; there unification fails to yield explanation in the formal scheme based on laws and their symmetries (Cat 1998; Cat 2005). Unification and explanation conflict on the grounds that in biology and physics only causal mechanical explanations answering why-questions yield understanding of the connections that contribute to “true unification” (Morrison 2000;[2] Morrison’s choice of standard for evaluating the epistemic accounts of unity and explanation and her focus on systematic theoretical connections without reduction has not been without critics, e.g., Wayne 2002; Plutynski 2005, Karaca 2012).[3]

Methodology. Unity has long been understood as a methodological principle, primarily, but not exclusively, in reductionist versions (Wimsatt 1976 and Wimsatt 2006 for the case of biology and Cat 1998 for physics). This is different from the case of unity through methodological prescriptions. One methodological criterion appeals to the epistemic virtues of simplicity or parsimony, whether epistemological or ontological (Sober 2003). As a formal probabilistic principle of curve-fitting or average predictive accuracy, the relevance of unity is objective. Unity plays the role of an empirical background theory.

Evidence. The probabilistic model dovetails with other recent formal discussions of unity and coherence within the framework of Bayesianism (Forster and Sober 1994, sect. 7; Schurz and Lambert 1994 is also a formal model, with an algebraic approach). More generally, the probabilistic framework articulates formal characterizations of unity and introduces its role in evaluations of evidence. As in the dual relation to explanation, also in this case, unification is not a condition for relevant evidence but a criterion of evidence (for a non-probabilistic account of the relation between unification and confirmation, see Schurz 1999). The evidentiary role of unification of hypotheses or models is related, but not reducible, to the evidentiary role of synthesis of data in statistics.

A criterion of unity defended for its epistemic virtue in relation to evidence is simplicity, or parsimony (Sober 2013 and 2016). Comparatively speaking, simpler hypotheses, models or theories present a higher likelihood of truth, empirical support and accurate prediction. From a methodological standpoint, however, appeals to parsimony might not be sufficient. Moreover, the connection between unity as parsimony and likelihood is not interest-relative, at least in the way that the connection between unity and explanation is (Sober 2003; Forster and Sober 1994 and Sober 2013 and 2016).
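The link between parsimony and predictive accuracy can be made concrete with Akaike's information criterion (AIC), the kind of curve-fitting criterion discussed by Forster and Sober (1994). The following is a minimal sketch in Python; the log-likelihoods and parameter counts are made-up numbers chosen for illustration, not results from any of the cited works.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln L.

    Lower scores indicate better estimated predictive accuracy; the 2k term
    penalizes each additional adjustable parameter.
    """
    return 2 * num_params - 2 * log_likelihood


# Hypothetical fits of two models to the same data (illustrative numbers only).
simple_score = aic(log_likelihood=-52.0, num_params=2)
complex_score = aic(log_likelihood=-50.5, num_params=6)

# The complex model fits slightly better, but not by enough to offset its
# extra parameters, so the simpler model gets the lower (better) score.
preferred = "simple" if simple_score < complex_score else "complex"
print(simple_score, complex_score, preferred)
```

On these assumed numbers the simpler model is preferred even though its fit to the data is slightly worse, which is the sense in which parsimony can bear on expected predictive accuracy.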

On the Bayesian approach, the rational comparison and acceptance of probabilistic beliefs in the light of empirical data is constrained by Bayes’ Theorem for conditional probabilities (where $$h$$ and $$d$$ are the hypothesis and the data respectively):

$$P(h \mid d) = \frac{P(d \mid h) \cdot P(h)}{P(d)}$$
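A small numerical illustration may help fix ideas. The sketch below, in Python, computes the posterior for a single hypothesis $$h$$ and its negation, obtaining $$P(d)$$ by the law of total probability. The function name and all the probability values are assumptions made up for the example.

```python
def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    """Return P(h | d) by Bayes' Theorem for a hypothesis h and its negation."""
    # P(d) = P(d | h) P(h) + P(d | not-h) P(not-h)
    p_d = p_d_given_h * prior_h + p_d_given_not_h * (1 - prior_h)
    return p_d_given_h * prior_h / p_d


# Illustrative values: a prior of 0.2 for h, and data that are three times
# more probable if h is true than if it is false.
p = posterior(prior_h=0.2, p_d_given_h=0.9, p_d_given_not_h=0.3)
print(round(p, 4))
```

With these assumed values the data raise the probability of $$h$$ from 0.2 to 3/7, roughly 0.43.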

One explicit Bayesian account of unification as an epistemic, methodological virtue has introduced the following measure of unity: a hypothesis $$h$$ unifies phenomena $$p$$ and $$q$$ to the degree that, given $$h$$, $$p$$ is statistically/probabilistically relevant to (or correlated with) $$q$$ (Myrvold 2003; a probabilistically equivalent measure of unity in Bayesian terms in McGrew 2003; on the equivalence, Schupbach 2005). This measure of unity has been criticized as neither necessary nor sufficient (Lange 2004; Lange’s criticism assumes the unification-explanation link; in a rebuttal, Schupbach has rejected this and other assumptions behind Lange’s criticism; Schupbach 2005). In a recent development, Myrvold argues for mutual information unification, i.e., that hypotheses are said to be supported by their ability to increase the amount of what he calls the mutual information of the set of evidence statements; see Myrvold 2017. The explanatory unification contributed by hypotheses about common causes is an instance of the information condition.
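The idea that a hypothesis makes one phenomenon probabilistically relevant to another can be sketched as follows. The Python example uses a simple log-ratio relevance measure, $$\log_2 [P(p \wedge q) / (P(p) P(q))]$$, evaluated with and without the hypothesis. Both the choice of this particular measure and the probability values are assumptions made for illustration; the sketch is not offered as the exact formula of any of the cited papers.

```python
import math


def relevance(p_joint, p_p, p_q):
    """log2 of P(p and q) / (P(p) * P(q)).

    Zero when p and q are independent, positive when they are positively
    correlated, negative when they are negatively correlated.
    """
    return math.log2(p_joint / (p_p * p_q))


# Illustrative probabilities only.
# Without h, p and q are independent: 0.25 = 0.5 * 0.5.
without_h = relevance(p_joint=0.25, p_p=0.5, p_q=0.5)
# Conditional on h, p and q tend to occur together.
given_h = relevance(p_joint=0.4, p_p=0.5, p_q=0.5)

# On this sketch, h "unifies" p and q to the extent that it increases
# their mutual relevance.
gain = given_h - without_h
print(without_h, round(given_h, 3), round(gain, 3))
```

On these assumed numbers $$p$$ and $$q$$ are irrelevant to each other in the absence of $$h$$ but positively relevant given $$h$$, so the hypothesis counts as unifying them to a positive degree.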

Finally, another kind of formal model for a different kind of unity straddles the boundary between formal epistemology and ontology: computational models of emergence or complexity. They are based on simulations of chaotic dynamical processes such as cellular automata (Wolfram 1984; Wolfram 2002). Their supposed superiority to combinatorial models based on aggregative functions of parts of wholes does not lack defenders (Crutchfield 1994; Crutchfield and Hanson 1997; Humphreys 2004, 2007 and 2008; Humphreys and Huneman 2008; Huneman 2008a and b and 2010).
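For readers unfamiliar with cellular automata, the following minimal Python sketch simulates an elementary (one-dimensional, two-state, nearest-neighbor) automaton of the kind studied by Wolfram. Rule 30 is chosen here only as a familiar example of a simple local rule with irregular-looking global behavior; the grid width, the number of steps and the wrap-around boundary are assumptions of the sketch, and the models in the works cited may differ.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton (wrap-around edges)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # Encode the three-cell neighborhood as a number from 0 to 7 ...
        index = (left << 2) | (centre << 1) | right
        # ... and read the new state off the corresponding bit of the rule number.
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(width=31, steps=15, rule=30):
    """Evolve a single live cell in the middle of the row; return all generations."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


for row in run():
    print("".join("#" if cell else "." for cell in row))
```

Each cell's next state depends only on itself and its two neighbors, yet the printed pattern as a whole is not obtained by simply summing independent contributions of the parts, which is the contrast with aggregative models drawn above.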

Unification without reduction. Reduction is not the sole standard of unity and models of unification without reduction have proliferated. In addition, such models introduce in turn new units of analysis. An early influential account centers around the notion of interfield theories (Darden and Maull 1977; Darden 2006). The orthodox central place of theories as the unit of scientific knowledge is replaced by that of fields. Examples of such fields are genetics, biochemistry and cytology. Different levels of organization correspond in this view to different fields: Fields are individuated intellectually by a focal problem, a domain of facts related to the problem, explanatory goals, methods and a vocabulary. Fields import and transform terms and concepts from others. The model is based on the idea that theories and disciplines do not match neat levels of organization within a hierarchy; rather, many of them in their scope and development cut across different such levels. Reduction is a relation between theories within a field, not across fields.

Interdependence and hybridity. In general, the higher-level theories (for instance, cell physiology) and the lower-level theories (for instance, biochemistry) are ontologically and epistemologically inter-dependent on matters of informational content and evidential relevance; one cannot be developed without the other (Kincaid 1996; Kincaid 1997; Wimsatt 1976; Spector 1977). The interaction between fields (through researchers’ judgments and borrowings) may provide enabling conditions for subsequent interactions. For instance, Maxwell’s adoption of statistical techniques in color research enabled the introduction of similar ideas from social statistics in his research in reductive molecular theories of gases; the reduction, in turn, enabled experimental evidence from chemistry and acoustics; similarly different chemical and spectroscopic bases for colors provided chemical evidence in color research (Cat 2013 and 2014).

The emergence and development of hybrid disciplines and theories are another instance of non-reductive cooperation or interaction between sciences. I noted, above, the post-war emergence of interdisciplinary areas of research, the so-called hyphenated sciences such as neuro-acoustics, radioastronomy, biophysics, etc. (Klein 1990, Galison 1997). On a smaller scale, in the domain of, for instance, physics, one can find semiclassical models in quantum physics or models developed around phenomena where the limiting reduction relations are singular or catastrophic (caustic optics and quantum chaos) (Cat 1998; Batterman 2002; Belot 2005). Such semiclassical explanatory models have not found successful quantum substitutes and have placed structural explanations at the heart of the relation between classical and quantum physics (Bokulich 2008). The general form of pervasive cases of emergence has been characterized with the notion of contextual emergence (Bishop and Atmanspacher 2006): properties, behaviors and their laws on a restricted, lower-level, single-scale, domain are necessary but not sufficient for the properties, behaviors of another, e.g., higher-level one, not even of itself. The latter are also determined by contingent contexts (contingent features of the state space of the relevant system). The interstitial formation of more or less stable small-scale syntheses and cross-boundary “alliances” has been common in most sciences since the early 20th century. Indeed, it is crucial to development in model building and growing empirical relevance in fields ranging anywhere from biochemistry to cell ecology, or from econophysics to thermodynamical cosmology. Similar cases can be found in chemistry and the biomedical sciences.

Conceptual unity. The conceptual dimension of cross-cutting has been developed in connection with the possibility of cross-cutting natural kinds that challenges taxonomical monism. Categories of taxonomy and domains of description are interest-relative, as are rationality and objectivity (Khalidi 1998; his view shares positions and attitudes with Longino 1989; Elgin 1996 and 1997). Cross-cutting taxonomic systems, then, are not conceptually inconsistent or inapplicable. Both the interest-relativity and hybridity feature prominently in the context of ontological pluralism (see below).

Another, more general, unifying element of this kind is Holton’s notion of themata. Themata are conceptual values that are a priori yet contingent (both individual and social), informing and organizing presuppositions that factor centrally in the evolution of the science: continuity/discontinuity, harmony, quantification, symmetry, conservation, mechanicism, hierarchy, etc. (Holton 1973). Unity of some kind is itself a thematic element. A more complex and comprehensive unit of organized scientific practice is the notion of the various styles of reasoning, such as statistical, analogical modeling, taxonomical, genetic/genealogical or laboratory styles; each is a cluster of epistemic standards, questions, tools, ontology, and self-authenticating or stabilizing protocols (Hacking 1996; see below for the relevance of this account of a priori elements to claims of global disunity; the account shares distinctive features of Kuhn’s notion of paradigm).

Another model of non-reductive unification is historical and diachronic: it emphasizes the genealogical and historical identity of disciplines, which has become complex through interaction. The interaction extends to relations between specific sciences, philosophy and philosophy of science (Hull 1988). Hull has endorsed an image of science as a process, modeling historical unity after a Darwinian-style pattern of evolution (developing an earlier suggestion by Popper). Part of the account is the idea of disciplines as evolutionary historical individuals, which can be revised with the help of more recent ideas of biological individuality: hybrid unity as an external model of unity as integration or coordination of individual disciplines and disciplinary projects, e.g., characterized by a form of occurrence, evolution or development whose tracking and identification involves a conjunction with other disciplines, projects and domains of resources, from within science or outside science. This diachronic perspective can accommodate models of discovery, in which genealogical unity integrates a variety of resources that can be both theoretical and applied, or scientific and non-scientific (an example, from physics, the discovery of superconductivity, can be found in Holton, Chang and Jurkowitz 1996). Some models of unity below provide further examples.

A generalization of the notion of interfield theories is the idea that unity is interconnection: Fields are unified theoretically and practically (Grantham 2004). This is an extension of the original modes of unity or identity that single out individual disciplines. Theoretical unification involves conceptual, ontological and explanatory relations. Practical unification involves heuristic dependence, confirmational dependence and methodological integration. The social dimension of the epistemology of scientific disciplines relies on institutional unity. With regard to disciplines as professions, this kind of unity has rested on institutional arrangements such as professional organizations for self-identification and self-regulation, university mechanisms of growth and reproduction through certification, funding and training, and communication and record through journals.

Many examples of unity without reduction are local rather than global, and are not merely a phase in a global and linear project or tradition of unification (or integration). They are typically focused on science as a human activity. From that standpoint, unification is typically understood or advocated as a piecemeal description and strategy of collaboration (on the distinction between global integration and local interdisciplinarity, see Klein 1990). Cases are restricted to specific models, phenomena or situations.

Material unity. A more recent approach to the connection between different research areas has focused on a material level of scientific practice, with attention to the use of instruments and other material objects (Galison 1997, Bowker and Star 1999). For instance, the material unity of natural philosophy in the 16th and 17th centuries relied on the circulation, transformation and application of objects, in their concrete and abstract representations (Bertoloni-Meli 2006). The latter correspond to the imaginary systems and their representations, which we call models. The evolution of objects and images across different theories and experiments and their developments in 19th-century natural philosophy provide a historical model of scientific development; but the approach is not meant to illustrate reductive materialism, since the same objects and models work and are perceived as vehicles for abstract ideas, institutions, cultures, etc., or prompted by them (Cat 2013). On one view, objects are regarded as elements in so-called trading zones (see below) with shifting meanings in the evolution of 20th-century physics, such as with the cloud chamber which was first relevant to meteorology and next to particle physics (Galison 1997). Alternatively, material objects have been given the status of boundary objects, which provide the opportunity for experts from different fields to collaborate through their respective understanding of the system in question and their respective goals (Bowker and Star 1999).

Graphic unity. At the concrete perceptual level, recent accounts emphasize the role of visual representations in the sciences and suggest what may be called graphic unification of the sciences. Their cognitive roles, methodological and rhetorical, include establishing and disseminating facts and their so-called virtual witnessing, revealing empirical relations, testing their fit with available patterns of more abstract theoretical relations (theoretical integration), suggesting new ones, aiding in computations, serving as aesthetic devices, etc. But these uses are not homogeneous across different sciences and make visible disciplinary differences. We may equally speak of graphic pluralism. The rates in the use of diagrams in research publications appear to vary along the hard-soft axis of pyramidal hierarchy, from physics, chemistry, biology, psychology and economics to sociology and political science (Smith et al. 2000): the highest use can be found in physics, intuitively identified by the highest degree of hardness, understood as consensus, codification, theoretical integration and factual stability, in contrast with high interpretive latitude and instability of results at the soft end. The same variation occurs among sub-disciplines within each discipline. The kinds of images and their contents also vary across disciplines and within disciplines, ranging from hand-made images of particular specimens to hand-made or mechanically generated images of particulars standing in for types, to schematic images of geometric patterns in space or time, or to abstract diagrams representing quantitative relations. Importantly, graphic tools circulate like other cognitive tools between areas of research that they in turn connect (Galison 1997, Daston and Galison 2007, Lopes 2009; see also Lynch and Woolgar 1990; Baigrie 1996; Jones and Galison 1998; Cat 2001, 2013 and 2014; and Kaiser 2005).

Disciplinary unity and collaboration. A field of study has focused on disciplines broadly and their relations. Disciplines constitute a broader unit of analysis of connection in the sciences that is characterized, for instance, by their domain of inquiry, cognitive tools and social structure (Bechtel 1987). Unification of disciplines, in that sense, can be interdisciplinary, multidisciplinary, crossdisciplinary and transdisciplinary (Klein 1990, Kellert 2008, Repko 2012). It might involve a researcher borrowing from different disciplines or the collaboration of different researchers. Neither modality of connection amounts to a straightforward generalization of, or reduction to, any single discipline, theory, etc. In either case, the strategic development is typically defended for its heuristic problem-solving or innovative powers, as it is prompted by a problem considered complex in that it does not arise or cannot be fully treated within the purview of one specific discipline unified or individuated around some potentially non-unique set of elements such as scope of empirical phenomena, rules, standards, techniques, conceptual and material tools, aims, social institutions, etc. Indicators of disciplinary unity may vary (Kuhn 1962, Klein 1990, Kellert 2008). Interdisciplinary research or collaboration creates a new discipline or project, such as interfield research, often leaving the existence of the original ones intact. Multidisciplinary work involves the juxtaposition of the treatments and aims of the different disciplines involved in addressing a common problem. Crossdisciplinary work involves borrowing resources from one discipline to serve the aims of a project in another. Transdisciplinary work is a synthetic creation that encompasses work from different disciplines (Klein 1990, Kellert 2008, Brigandt 2010, Hoffmann, Schmidt and Nersessian 2012, Osbeck et al. 2011, Repko 2012). These different modes of synthesis or connection are not mutually exclusive.

Models of interdisciplinary cooperation and their corresponding outcomes are often described using metaphors of different kinds: cartographic (domains, boundaries, trading zone, etc.), linguistic (pidgin language, communication, translation, etc.), architectural (building blocks, tiles, etc.), socio-political (imperialism, hierarchy, republic, orchestration, negotiation, coordination, cooperation, etc.) or embodied (cross-training). Each selectively highlights and neglects different aspects of scientific practice and properties of scientific products. Cartographic and architectural images, for instance, focus on spatial and static synchronic relations and simply connected, compatible elements. Socio-political and embodied images emphasize activity and non-propositional elements (Kellert 2008 defends the image of cross-training).

In this context, methodological unity often takes the form of borrowing standards and techniques for the application of formal and empirical methods. They range from calculational techniques and tools for theoretical modeling and simulation of phenomena to techniques for modeling of data, use of instruments and conducting experiments (e.g., the culture of field experiments and, more recently, randomized control trials across natural and social sciences). A key element of scientific practice often ignored by philosophical analysis is expertise. As part of different forms of methodological unity, it is key to the acceptance and successful appropriation of techniques. Recent accounts of multidisciplinary collaboration as a human activity have focused on the dynamics of integrating different kinds of expertise around common systems or goals of research (Collins and Evans 2007, Gorman 2002). The same perspective can accommodate the recent interest in so-called mixed methods, e.g., different forms of integration of quantitative and qualitative methods and approaches in the social sciences.

A general model of local interconnection which has acquired widespread attention and application in different sciences is the anthropological model of the trading zone, where hybrid languages and meanings are developed that allow for interaction without straightforward extension of any party’s original language or framework (Galison 1997). Galison has applied this kind of anthropological analysis to the subcultures of experimentation. This strategy aims to explain the strength, coherence and continuity of science in terms of local coordinations of intercalated levels of symbolic procedures and meanings, instruments and arguments.

At the experimental level, instruments, as found objects, acquire new meanings, developments and uses as they bridge over the transitions between theories, observations or theory-laden observations. Instruments and experimental projects in the case of Big Science also bring together, synchronically and interactively, the skills, standards and other resources from different communities, and change each in turn (on interdisciplinary experimentation see also Osbeck et al. 2011). Patterns of laboratory research are shared by the different sciences, not just instruments but general strategies of reconfiguration of human researchers and natural entities researched (Knorr-Cetina 1992), statistical standards (e.g., statistical significance) and ideals of replication. At the same time, attention has been paid to the different ways in which experimental approaches differ among the sciences (Knorr-Cetina 1992, Guala 2005, Weber 2005) but also to how they have been transferred (e.g., field experiments and randomized control trials) or integrated (e.g., mixed methods combining quantitative and qualitative techniques).

Empirical work in sociology and cognitive psychology on scientific collaboration has led to a broader perspective including a number of dimensions of interdisciplinary cooperation, involving identification of conflicts and the setting of sufficient so-called common ground integrators: for instance, shared (pre-existing, revised and newly developed) concepts, terminology, standards, techniques, aims, information, tools, expertise, skills (abstract, dialectical, creative and holistic thinking), cognitive and social ethos (curiosity, tolerance, flexibility, humility, receptivity, reflexivity, honesty, team-play), social interaction, institutional structures and geography (Cummings and Kiesler 2005, Klein 1990, Kockelmans 1979, Repko 2012). Sociological studies of scientific collaboration can in principle place the connective models of unity within the more general scope of social epistemology, for instance, in relation to distributed cognition (beyond the focus on strategies of consensus within communities).

The broad and dynamical approach to processes of interdisciplinary integration may effectively be understood to describe the production of different sorts and degrees of epistemic emergence. The integrated accounts require shared (old or new) assumptions and may involve a case of ontological integration, for instance in causal models. Suggested kinds of interdisciplinary causal-model integration are the following: sequential causal order in a process or mechanism cutting across disciplinary divides; horizontal parallel integration of different causal models of different elements of a complex phenomenon; horizontal joint causal model of the same effect; and vertical or cross-level causal integration (see emergent or top-down causality, below) (Repko 2012, Kockelmans 1979).

Talk of cooperation and coordination for the purpose of forming hybrid cross-disciplines, emergent disciplines or projects and products often revolves around two issues: conflicts and the challenge of striking a balance between cooperation and autonomy. By extension of the discussion of value conflict in moral and political philosophy, one must acknowledge the extent to which scientific practice is based on accepting limited conflict over necessary commitments and making epistemic and/or non-epistemic compromises (a volitional, not just cognitive aspect; on this view against unity as social consensus, see Rescher 1993, Cat 2005 and 2010; van Bouwel 2009; cf. Repko 2012; Hoffmann, Schmidt and Nersessian 2012).

Aesthetic value. Finally, epistemic values of unity may rely on subsidiary considerations of aesthetic value. Nevertheless, consideration of beauty, elegance or harmony may also provide autonomous grounds for adopting or pursuing varieties of unification in terms of simplicity and patterns of order (regularity of specific relations) (McAllister 1996, Glynn 2010 and Orrell 2012). Whether aesthetic judgements have any epistemic import depends on metaphysical, cognitive or pragmatic assumptions.

## 4. Ontological unities

### 4.1 Ontological unities and reduction

Since Nagel’s influential model of reduction by derivation, most discussions of unity of science have been cast in terms of reductions between concepts, the entities they describe, and between theories incorporating the descriptive concepts. Ontological unity is expressed by a preferred set of such ontological units. In terms of concepts featured in preferred descriptions, explanatory or not, reduction endorses taxonomical monism, a privileged set of kinds of things. These privileged kinds are often known as natural kinds, although the notion admits of multiple interpretations, ranging from the more conventionalist to the more essentialistic. Regardless, the fundamental units are ambiguous with respect to their status as either entity or property. Reduction may determine the fundamental kinds or level through the analysis of entities. A distinctive ontological model is this: The hierarchy of levels of reduction is fixed by part-whole relations. The levels of aggregation of entities run all the way down to atomic particles and field parts, rendering microphysics the fundamental science.

A classic reference of this kind, away from the syntactic model, is Oppenheim and Putnam’s “The Unity of Science as a Working Hypothesis” (Oppenheim and Putnam 1958; Oppenheim and Hempel had worked in the 1930s on taxonomy and typology, a question of broad intellectual, social and political relevance in Germany at the time). Oppenheim and Putnam intended to articulate an idea of science as a reductive unity of concepts and laws to those of the most elementary elements. They also defended it as an empirical hypothesis about science, not as an a priori ideal, project or precondition. Moreover, they claimed that its evolution manifested a trend in that unified direction out of the smallest entities and lowest levels of aggregation. In an important sense, the evolution of science recapitulates, in reverse, the evolution of matter, from aggregates of elementary particles to the formation of complex organisms and species (we find a similar assumption in Weinberg’s downward arrow of explanation). Unity, then, is manifested not just in mereological form, but also diachronically, genealogically or historically.

A weaker form of ontological reduction has been advocated for the biomedical sciences with the causal notion of partial reductions: explanations of localized scope (focused on parts of higher-level systems only) laying out a causal mechanism connecting different levels in the hierarchy of composition and organization (Schaffner 1993; Schaffner 2006; Scerri has similarly discussed degrees of reduction in Scerri 1994). An extensional, domain-relative approach introduces the distinction between “domain preserving” and “domain combining” reductions. Domain-preserving reductions are intra-level reductions and occur between $$T_1$$ and its predecessor $$T_2$$. In this parlance, however, $$T_2$$ “reduces” to $$T_1$$. This notion of “reduction” does not refer to any relation of explanation (Nickles 1973).

The claim that reduction, as a relation of explanation, needs to be a relation between theories or even involve any theory has also been challenged. One such challenge focuses on “inter-level” explanations in the form of compositional redescription and causal mechanisms (Wimsatt 1976). The role of biconditionals or even Schaffner-type identities, as factual relations, is heuristic (Wimsatt 1976). The heuristic value extends to the preservation of the higher-level, reduced concepts, especially for cognitive and pragmatic reasons, including reasons of empirical evidence. This amounts to rejecting the structural, formal approach to unity and reductionism favored by the logical-positivist tradition. Reductionism is another example of the functional, purposive nature of scientific practice. The metaphysical view that follows is a pragmatic and non-eliminative realism (Wimsatt 2006). As a heuristic, this kind of non-eliminative pragmatic reductionism is a complex stance. It is, across levels, integrative and intransitive, compositional, mechanistic and functionally localized, approximative and abstractive. It is bound to adopting false idealizations, focusing on regularities and stable common behavior, circumstances and properties. It is also constrained in its rational calculations and methods, tool-binding, and problem-relative. The heuristic value of eliminative inter-level reductions has been defended as well (Poirier 2006).

The appeal to formal laws and deductive relations is dropped for sets of concepts or vocabularies in the replacement analysis (Spector 1978). This approach allows for talk of entity reduction or branch reduction, and even direct theory replacement without the operation of laws, and circumvents vexing difficulties raised by bridge principles and the deductive derivability condition (self-reduction, infinite regress, etc.). Formal relations only guarantee, but do not define, the reduction relation. Replacement functions are meta-linguistic statements. As Sellars had argued in the case of explanation, this account distinguishes between reduction and testing of reduction, and highlights the role of derivations in both. Finally, replacement can be in practice or in theory. Replacement in practice does not advocate elimination of the reduced or replaced entities or concepts (Spector 1978).

Note, however, the following: the compartmentalization of theories and their concepts or vocabulary into levels neglects the existence of empirically meaningful and causally explanatory relations between entities or properties at different levels. If they are neglected as theoretical knowledge and left outside as only bridge principles, the possibility of completeness of knowledge is jeopardized. Maximizing completeness of knowledge here requires a descriptive unity of all phenomena at all levels and anything between these levels. Any bounded region or body of knowledge neglecting such cross-boundary interactions is radically incomplete, and not just confirmationally or evidentially so; we may refer to this problem as the problem of cross-boundary incompleteness as either intra-level or horizontal incompleteness and, on a hierarchy, the problem of inter-level or vertical incompleteness (Kincaid 1997; Cat 1998).

The most radical form of reduction as replacement is often called eliminativism. The position has made a considerable impact in philosophy of psychology and philosophy of mind (Churchland 1981; Churchland 1986). On this view the vocabulary of the reducing theories (neurobiology) eliminates and replaces that of the reduced ones (psychology), leaving no substantive relation between them (which is only a replacement rule) (see also eliminative materialism).

In a general semantic account, Sarkar distinguishes different kinds of reduction in terms of four criteria, two epistemological and two ontological: fundamentalism, approximation, abstract hierarchy and spatial hierarchy. Fundamentalism implies that the features of a system can be explained in terms only of factors and rules from another realm. Abstract hierarchy is the assumption that the representation of a system involves a hierarchy of levels of organization with the explanatory factors being located at the lower levels. Spatial hierarchy is a special case of abstract hierarchy in which the criterion of hierarchical relation is a spatial part-whole or containment relation. Strong reduction satisfies the three “substantive” criteria, whereas weak reduction only satisfies fundamentalism. Approximate reductions—strong and hierarchical—are those which satisfy the criterion of fundamentalism only approximately (Sarkar 1998; the merit of Sarkar’s proposal resides in its systematic attention to hierarchical conditions and, more originally, to different conditions of approximation; see also Ramsey 1995; Lange 1995; Cat 2005).

The semantic turn extends to more recent notions of models that do not fall under the strict semantic or model-theoretic notion of mathematical structures (Giere 1999; Morgan and Morrison 1999; Cat 2005). This is a more flexible framework about relevant formal relations and the scope of relevant empirical situations; and it is implicitly or explicitly adopted by most accounts of unity without reduction. One may add the primacy of temporal representation and temporal parts, temporal hierarchy or temporal compositionality, first emphasized by Oppenheim and Putnam as a model of genealogical or diachronic unity. This framework applies to processes both of evolution and development (a more recent version in McGivern 2008 and Love and Hüttemann 2011).

The shift in the accounts of scientific theory from syntactic to semantic approaches has changed conceptual perspectives and, accordingly, formulations and evaluations of reductive relations and reductionism. However, examples of the semantic approach focusing on mathematical structures and satisfaction of set-theoretic relations have focused on syntactic features—including the axiomatic form of a theory—in the discussion of reduction (Sarkar 1998, da Costa and French 2003). In this sense, the structuralist approach can be construed as a neo-Nagelian account, while an alternative line of research has championed the more traditional structuralist semantic approach (Balzer and Moulines 1996; Moulines 2006; Ruttkamp 2000; Ruttkamp and Heidema 2005).

### 4.2 Ontological unities and antireductionism

Headed in the opposite direction, arguments concerning new concepts such as multiple realizability and supervenience, introduced by Putnam, Kim, Fodor and others, have led to talk of higher-level functionalism, a distinction between type-type and token-token reductions and the examination of its implications. The concepts of emergence, supervenience and downward causation are related metaphysical tools for generating and evaluating proposals about unity and reduction in the sciences. This literature has enjoyed its chief sources and developments in general metaphysics and in philosophy of mind and psychology (Davidson 1969; Putnam 1975; Fodor 1975; Kim 1993).

Supervenience, first introduced by Davidson in discussions of mental properties, is the notion that a system with properties on one level is composed of entities on a lower level and that its properties are determined by the properties of the lower-level entities or states. The relation of determination is that no changes at the higher level occur without changes at the lower level. Like token-reductionism, supervenience has been adopted by many as the poor man’s reductionism (see the entry on supervenience). A different case for the autonomy of the macrolevel is based on the notion of multiple supervenience (Kincaid 1997; Meyering 2000).

The autonomy of the special sciences from physics has been defended in terms of a distinction between type-physicalism and token-physicalism (Fodor 1974; Fodor countered Oppenheim and Putnam’s hypothesis under the rubric “the disunity of science”; see the entry on physicalism). The key logical assumption is the type-token distinction, that types are realized by more specific tokens, e.g., the type animal is instantiated by different species, and the type tiger or electron can be instantiated by multiple individual token tigers and electrons. Type-physicalism is characterized by a type-type identity between the predicates/properties in the laws of the special sciences and those of physics. By contrast, token-physicalism is based on the token-token identity between the predicates/properties of the special sciences and those of physics; every event under a special law falls under a law of physics and bridge laws express contingent token-identities between events. Token-physicalism operates as a demarcation criterion for materialism. Fodor argued that the predicates of the special sciences correspond to infinite or open-ended disjunctions of physical predicates, and these disjunctions do not constitute natural kinds identified by an associated law. Token-physicalism is the only alternative. All special kinds of events are physical but the special sciences are not physics (for criticisms based on the presuppositions in Fodor’s argument, see Sober 1999).
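
Fodor’s contrast can be put schematically. The following is only a sketch, in notation loosely modeled on his 1974 presentation; the predicate letters $$S_i$$, $$P_i$$ and $$Q_j$$ are illustrative labels rather than fixed terminology. Suppose a special science states the law

$$
\forall x\,(S_1 x \rightarrow S_2 x).
$$

Type-physicalism requires bridge laws that pair each special-science predicate with a single physical kind, so that the special law follows from a physical one:

$$
S_1 x \leftrightarrow P_1 x, \qquad S_2 x \leftrightarrow P_2 x, \qquad \forall x\,(P_1 x \rightarrow P_2 x).
$$

If $$S_1$$ is multiply realized, the best available physical correlate is an open-ended disjunction,

$$
S_1 x \leftrightarrow (Q_1 x \lor Q_2 x \lor \dots \lor Q_n x \lor \dots),
$$

and the argument is that the right-hand side need not pick out a natural kind figuring in any law of physics, even though each token event satisfying $$S_1$$ satisfies some $$Q_j$$.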

The denial of remedial, weaker forms of reductionism is the basis for the concept of emergence (Humphreys 1997, Bedau and Humphreys 2008). Different accounts have attempted to articulate the idea of a whole being different from or more than the mere sum of its parts (see the entry on emergent properties). Emergence has been described beyond logical relations, synchronically as an ontological property and diachronically as a material process of fusion, in which the powers of the separate constituents lose their separate existence and effects (Humphreys 1997). This concept has been widely applied in discussions of complexity (see below). Unlike the earliest antireductionist models of complexity in terms of holism and cybernetic properties, more recent approaches track the role of constituent parts (Simon 1996). Weak emergence has been opposed to nominal and strong forms of emergence. The nominal kind simply represents that some macro-properties cannot be properties of micro-constituents. The strong form is based on supervenience and irreducibility, with a role for the occurrence of autonomous downwards causation upon any constituents (see below). Weak emergence is linked to processes stemming from the states and powers of constituents, with a reductive notion of downwards causation of the system as a resultant of constituents’ effects; yet the connection is not a matter of Nagelian formal derivation, but of implementation through, for instance, computational aggregation and iteration. Weak emergence, then, can be defined in terms of simulation: a macro-property, state or fact is weakly emergent if and only if it can be derived from its micro-constituents only by simulation (Bedau 2008) (see entry on simulations in science).
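
What derivation by simulation amounts to can be illustrated with a cellular automaton. The following is a minimal sketch in Python, assuming Conway’s Game of Life as the micro-level rule; the function names (`step`, `run`, `shifted`) and the choice of the glider pattern are illustrative assumptions, not part of the definition above. The macro-level fact that the pattern travels diagonally is obtained here by iterating the local update rule generation by generation.

```python
from collections import Counter

def step(live):
    """Apply the micro-rule once; `live` is a set of (row, col) live cells."""
    # Count, for every cell, how many live neighbours it has.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, generations):
    """Derive a later state by iterating the micro-rule (i.e. by simulation)."""
    for _ in range(generations):
        live = step(live)
    return live

def shifted(cells, dr, dc):
    """Translate a pattern by (dr, dc)."""
    return {(r + dr, c + dc) for (r, c) in cells}

# Illustrative initial micro-state: the five-cell "glider".
GLIDER = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

# Macro-level regularity: after four generations the same shape reappears,
# displaced one cell down and one cell to the right.
print(run(GLIDER, 4) == shifted(GLIDER, 1, 1))
```

The sketch shows only what it is to reach a macro-level fact by aggregating and iterating micro-level updates. Whether a given macro-property can be reached in no other way, as the definition requires, is a further question that the code does not settle.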

Connected to the concept of emergence is top-down or downward causation. It captures the autonomous and genuine causal power of higher-level entities or states, especially upon lower-level ones. The most extreme and most controversial version includes a violation of laws that regulate the lower level (Meehl and Sellars 1956; Campbell 1974). Weaker forms require compatibility with the microlaws (for a brief survey and discussion see Robinson 2005; on downward causation without top-down causes, see Craver and Bechtel 2007, Bishop 2012). The very concept has become the subject of some interdisciplinary interest in the sciences (Ellis, Noble and O’Connor 2012).

Another general argument for the autonomy of the macrolevel in the form of non-reductive materialism has been a cognitive type of functionalism, namely, cognitive pragmatism (Van Gulick 1992). This account links ontology to epistemology. It discusses four pragmatic dimensions of representations: the nature of the causal interaction between theory-user and the theory, the nature of the goals to whose realization the theory can contribute, the role of indexical elements in fixing representational content, and differences in the individuating principles applied by the theory to its types (Wimsatt and Spector’s arguments above are of this kind). A more ontologically substantive account of functional reduction is Ramsey’s bottom-up construction by reduction: transformation reductions streamline formulations of theories in such a way that they extend basic theories upwards by engineering their application to specific contexts or phenomena. As a consequence, they reveal, by construction, new relations and systems that are antecedently absent from a scientist’s understanding of the theory, independently of a top or reduced theory (Ramsey 1995). A weaker framework of ontological unification is categorial unity, wherein abstract categories such as causality, information, etc., are attached to the interpretation of the specific variables and properties in models of phenomena (see Cat 2000, 2001 and 2006).

## 5. Disunity

A more radical departure from logical-positivist standards of unity is the recent criticism of the methodological values of reductionism and unification in the sciences and also its position in culture and society. From the descriptive standpoint, many views under the rubric of disunity are versions of positions mentioned above. The difference is mainly normative and a matter of emphasis, perspective, and stance. This view argues for the replacement of the emphasis on global unity—including unity of method—by emphasizing disunity and epistemological and ontological pluralism.

### 5.1 The Stanford School

An influential picture of disunity comes from related works by the members of the so-called Stanford School, e.g., John Dupré, Ian Hacking, Peter Galison, Patrick Suppes and Nancy Cartwright. Disunity is, in general terms, a rejection of universalism and uniformity both methodological and metaphysical. While the view can be constructed in terms of specific anti-reductionistic claims and positions, they share an emphasis on the rejection of restrictive accounts of unity. Through their work, the rubric of disunity has acquired a visibility parallel to the one once acquired by unity, as an inspiring philosophical rallying cry.

## 1. Introduction

Reasoning from observations has been important to scientific practice at least since the time of Aristotle who mentions a number of sources of observational evidence including animal dissection (Aristotle(a) 763a/30–b/15, Aristotle(b) 511b/20–25). But philosophers didn’t talk about observation as extensively, in as much detail, or in the way we have become accustomed to, until the 20th century when logical empiricists transformed philosophical thinking about it.

The first transformation was accomplished by ignoring the implications of a long standing distinction between observing and experimenting. To experiment is to isolate, prepare, and manipulate things in hopes of producing epistemically useful evidence. It had been customary to think of observing as noticing and attending to interesting details of things perceived under more or less natural conditions, or by extension, things perceived during the course of an experiment. To look at a berry on a vine and attend to its color and shape would be to observe it. To extract its juice and apply reagents to test for the presence of copper compounds would be to perform an experiment. Contrivance and manipulation influence epistemically significant features of observable experimental results to such an extent that epistemologists ignore them at their peril. Robert Boyle (1661), John Herschel (1830), Bruno Latour and Steve Woolgar (1979), Ian Hacking (1983), Harry Collins (1985), Allan Franklin (1986), Peter Galison (1987), Jim Bogen and Jim Woodward (1988), and Hans-Jörg Rheinberger (1997) are some of the philosophers and philosophically minded scientists, historians, and sociologists of science who gave serious consideration to the distinction between observing and experimenting. The logical empiricists tended to ignore it.

A second transformation, characteristic of the linguistic turn in philosophy, was to shift attention away from things observed in natural or experimental settings and concentrate instead on the logic of observation reports. The shift developed from the assumption that a scientific theory is a system of sentences or sentence-like structures (propositions, statements, claims, and so on) to be tested by comparison to observational evidence. It was further assumed that the comparisons must be understood in terms of inferential relations. If inferential relations hold only between sentence-like structures, it follows that theories must be tested, not against observations or things observed, but against sentences, propositions, etc. used to report observations (Hempel 1935, 50–51; Schlick 1935).

Friends of this line of thought theorized about the syntax, semantics, and pragmatics of observation sentences, and inferential connections between observation and theoretical sentences. In doing so they hoped to articulate and explain the authoritativeness widely conceded to the best natural, social and behavioral scientific theories. Some pronouncements from astrologers, medical quacks, and other pseudo scientists gain wide acceptance, as do those of religious leaders who rest their cases on faith or personal revelation, and rulers and governmental officials who use their political power to secure assent. But such claims do not enjoy the kind of credibility that scientific theories can attain. The logical empiricists tried to account for this by appeal to the objectivity and accessibility of observation reports, and the logic of theory testing.

Part of what they meant by calling observational evidence objective was that cultural and ethnic factors have no bearing on what can validly be inferred about the merits of a theory from observation reports. So conceived, objectivity was important to the logical empiricists’ criticism of the Nazi idea that Jews and Aryans have fundamentally different thought processes such that physical theories suitable for Einstein and his kind should not be inflicted on German students. In response to this rationale for ethnic and cultural purging of the German educational system, the logical empiricists argued that because of its objectivity, observational evidence, rather than ethnic and cultural factors, should be used to evaluate scientific theories (Galison 1990). Less dramatically, the efforts working scientists put into producing objective evidence attest to the importance they attach to objectivity. Furthermore it is possible, in principle at least, to make observation reports and the reasoning used to draw conclusions from them available for public scrutiny. If observational evidence is objective in this sense, it can provide people with what they need to decide for themselves which theories to accept without having to rely unquestioningly on authorities.

Francis Bacon argued long ago that the best way to discover things about nature is to use experiences (his term for observations as well as experimental results) to develop and improve scientific theories (Bacon 1620, 49ff). The role of observational evidence in scientific discovery was an important topic for Whewell (1858) and Mill (1872) among others in the 19th century. Recently, Judea Pearl, Clark Glymour, and their students and associates addressed it rigorously in the course of developing techniques for inferring claims about causal structures from statistical features of the data they give rise to (Pearl 2000; Spirtes, Glymour, and Scheines 2000). But such work is exceptional. For the most part, philosophers followed Karl Popper who maintained, contrary to the title of one of his best known books, that there is no such thing as a ‘logic of discovery’ (Popper 1959, 31). Drawing a sharp distinction between discovery and justification, the standard philosophical literature devotes most of its attention to the latter.

Although theory testing dominates much of the standard philosophical literature on observation, much of what this entry says about the role of observation in theory testing applies also to its role in inventing and modifying theories, and in applying them to tasks in engineering, medicine, and other practical enterprises.

Theories are customarily represented as collections of sentences, propositions, statements or beliefs, etc., and their logical consequences. Among these are maximally general explanatory and predictive laws (Coulomb’s law of electrical attraction and repulsion, and Maxwellian electromagnetism equations for example), along with lesser generalizations that describe more limited natural and experimental phenomena (e.g., the ideal gas equations describing relations between temperatures and pressures of enclosed gasses, and general descriptions of positional astronomical regularities). Observations are used in testing generalizations of both kinds.

Some philosophers prefer to represent theories as collections of ‘states of physical or phenomenal systems’ and laws. The laws for any given theory are

…relations over states which determine…possible behaviors of phenomenal systems within the theory’s scope. (Suppe 1977, 710)

So conceived, a theory can be adequately represented by more than one linguistic formulation because it is not a system of sentences or propositions. Instead, it is a non-linguistic structure which can function as a semantic model of its sentential or propositional representations. (Suppe 1977, 221–230) This entry treats theories as collections of sentences or sentential structures with or without deductive closure. But the questions it takes up arise in pretty much the same way when theories are represented in accordance with this semantic account.

## 2. What do observation reports describe?

One answer to this question assumes that observation is a perceptual process so that to observe is to look at, listen to, touch, taste, or smell something, attending to details of the resulting perceptual experience. Observers may have the good fortune to obtain useful perceptual evidence simply by noticing what’s going on around them, but in many cases they must arrange and manipulate things to produce informative perceptible results. In either case, observation sentences describe perceptions or things perceived.

Observers use magnifying glasses, microscopes, or telescopes to see things that are too small or far away to be seen, or seen clearly enough, without them. Similarly, amplification devices are used to hear faint sounds. But if to observe something is to perceive it, not every use of instruments to augment the senses qualifies as observational. Philosophers agree that you can observe the moons of Jupiter with a telescope, or a heartbeat with a stethoscope. But minimalist empiricists like Bas van Fraassen (1980, 16–17) deny that one can observe things that can be visualized only by using electron (and perhaps even light) microscopes. Many philosophers don’t mind microscopes but find it unnatural to say that high energy physicists observe particles or particle interactions when they look at bubble chamber photographs. Their intuitions come from the plausible assumption that one can observe only what one can see by looking, hear by listening, feel by touching, and so on. Investigators can neither look at (direct their gazes toward and attend to) nor visually experience charged particles moving through a bubble chamber. Instead they can look at and see tracks in the chamber, or in bubble chamber photographs.

The identification of observation and perceptual experience persisted well into the 20th century—so much so that Carl Hempel could characterize the scientific enterprise as an attempt to predict and explain the deliverances of the senses (Hempel 1952, 653). This was to be accomplished by using laws or lawlike generalizations along with descriptions of initial conditions, correspondence rules, and auxiliary hypotheses to derive observation sentences describing the sensory deliverances of interest. Theory testing was treated as a matter of comparing observation sentences describing observations made in natural or laboratory settings to observation sentences that should be true according to the theory to be tested. This makes it imperative to ask what observation sentences report. Even though scientists often record their evidence non-sententially, e.g., in the form of pictures, graphs, and tables of numbers, some of what Hempel says about the meanings of observation sentences applies to non-sentential observational records as well.

According to what Hempel called the phenomenalist account, observation reports describe the observer’s subjective perceptual experiences.

…Such experiential data might be conceived of as being sensations, perceptions, and similar phenomena of immediate experience. (Hempel 1952, 674)

This view is motivated by the assumption that the epistemic value of an observation report depends upon its truth or accuracy, and that with regard to perception, the only thing observers can know with certainty to be true or accurate is how things appear to them. This means that we can’t be confident that observation reports are true or accurate if they describe anything beyond the observer’s own perceptual experience. Presumably one’s confidence in a conclusion should not exceed one’s confidence in one’s best reasons to believe it. For the phenomenalist it follows that reports of subjective experience can provide better reasons to believe claims they support than reports of other kinds of evidence. Furthermore, if C.I. Lewis had been right to think that probabilities cannot be established on the basis of dubitable evidence (Lewis 1950, 182), observation sentences would have no evidential value unless they report the observer’s subjective experiences.[1]

But given the expressive limitations of the language available for reporting subjective experiences, we can’t expect phenomenalistic reports to be precise and unambiguous enough to test theoretical claims whose evaluation requires accurate, fine-grained perceptual discriminations. Worse yet, if experiences are directly available only to those who have them, there is room to doubt whether different people can understand the same observation sentence in the same way. Suppose you had to evaluate a claim on the basis of someone else’s subjective report of how a litmus solution looked to her when she dripped a liquid of unknown acidity into it. How could you decide whether her visual experience was the same as the one you would use her words to report?

Such considerations led Hempel to propose, contrary to the phenomenalists, that observation sentences report ‘directly observable’, ‘intersubjectively ascertainable’ facts about physical objects

…such as the coincidence of the pointer of an instrument with a numbered mark on a dial; a change of color in a test substance or in the skin of a patient; the clicking of an amplifier connected with a Geiger counter; etc. (ibid.)

Observers do sometimes have trouble making fine pointer position and color discriminations, but such things are more susceptible to precise, intersubjectively understandable descriptions than subjective experiences. How much precision and what degree of intersubjective agreement are required in any given case depends on what is being tested and how the observation sentence is used to evaluate it. But all things being equal, we can’t expect data whose acceptability depends upon delicate subjective discriminations to be as reliable as data whose acceptability depends upon facts that can be ascertained intersubjectively. And similarly for non-sentential records; a drawing of what the observer takes to be the position of a pointer can be more reliable and easier to assess than a drawing that purports to capture her subjective visual experience of the pointer.

The fact that science is seldom a solitary pursuit suggests that one might be able to use pragmatic considerations to finesse questions about what observation reports express. Scientific claims, especially those with practical and policy applications, are typically used for purposes that are best served by public evaluation. Furthermore, the development and application of a scientific theory typically requires collaboration and in many cases is promoted by competition. This, together with the fact that investigators must agree to accept putative evidence before they use it to test a theoretical claim, imposes a pragmatic condition on observation reports: an observation report must be such that investigators can reach agreement relatively quickly and relatively easily about whether it provides good evidence with which to test a theory (cf. Neurath 1913). Feyerabend took this requirement seriously enough to characterize observation sentences pragmatically in terms of widespread decidability. In order to be an observation sentence, he said, a sentence must be contingently true or false, and such that competent speakers of the relevant language can quickly and unanimously decide whether to accept or reject it on the basis of what happens when they look, listen, etc. in the appropriate way under the appropriate observation conditions (Feyerabend 1959, 18ff).

The requirement of quick, easy decidability and general agreement favors Hempel’s account of what observation sentences report over the phenomenalist’s. But one shouldn’t rely on data whose only virtue is widespread acceptance. Presumably the data must possess additional features by virtue of which it can serve as an epistemically trustworthy guide to a theory’s acceptability. If epistemic trustworthiness requires certainty, this requirement favors the phenomenalists. Even if trustworthiness doesn’t require certainty, it is not the same thing as quick and easy decidability. Philosophers need to address the question of how these two requirements can be mutually satisfied.

## 3. Is observation an exclusively perceptual process?

Many of the things scientists investigate do not interact with human perceptual systems as required to produce perceptual experiences of them. The methods investigators use to study such things argue against the idea—however plausible it may once have seemed—that scientists do or should rely exclusively on their perceptual systems to obtain the evidence they need. Thus Feyerabend proposed as a thought experiment that if measuring equipment was rigged up to register the magnitude of a quantity of interest, a theory could be tested just as well against its outputs as against records of human perceptions (Feyerabend 1969, 132–137).

Feyerabend could have made his point with historical examples instead of thought experiments. A century earlier Helmholtz estimated the speed of excitatory impulses traveling through a motor nerve. To initiate impulses whose speed could be estimated, he implanted an electrode into one end of a nerve fiber and ran a current into it from a coil. The other end was attached to a bit of muscle whose contraction signaled the arrival of the impulse. To find out how long it took the impulse to reach the muscle he had to know when the stimulating current reached the nerve. But

[o]ur senses are not capable of directly perceiving an individual moment of time with such small duration…

and so Helmholtz had to resort to what he called ‘artificial methods of observation’ (Olesko and Holmes 1994, 84). This meant arranging things so that current from the coil could deflect a galvanometer needle. Assuming that the magnitude of the deflection is proportional to the duration of current passing from the coil, Helmholtz could use the deflection to estimate the duration he could not see (ibid.). This ‘artificial observation’ is not to be confused with, e.g., using magnifying glasses or telescopes to see tiny or distant objects. Such devices enable the observer to scrutinize visible objects. The minuscule duration of the current flow is not a visible object. Helmholtz studied it by looking at and seeing something else. (Hooke (1705, 16–17) argued for and designed instruments to execute the same kind of strategy in the 17th century.) The moral of Feyerabend’s thought experiment and Helmholtz’s distinction between perception and artificial observation is that working scientists are happy to call things that register on their experimental equipment observables even if they don’t or can’t register on their senses.

Some evidence is produced by processes so convoluted that it’s hard to decide what, if anything, has been observed. Consider functional magnetic resonance images (fMRI) of the brain decorated with colors to indicate magnitudes of electrical activity in different regions during the performance of a cognitive task. To produce these images, brief magnetic pulses are applied to the subject’s brain. The magnetic force coordinates the precessions of protons in hemoglobin and other bodily stuffs to make them emit radio signals strong enough for the equipment to respond to. When the magnetic force is relaxed, the signals from protons in highly oxygenated hemoglobin deteriorate at a detectably different rate than signals from blood that carries less oxygen. Elaborate algorithms are applied to radio signal records to estimate blood oxygen levels at the places from which the signals are calculated to have originated. There is good reason to believe that blood flowing just downstream from spiking neurons carries appreciably more oxygen than blood in the vicinity of resting neurons. Assumptions about the relevant spatial and temporal relations are used to estimate levels of electrical activity in small regions of the brain corresponding to pixels in the finished image. The results of all of these computations are used to assign the appropriate colors to pixels in a computer generated image of the brain. The role of the senses in fMRI data production is limited to such things as monitoring the equipment and keeping an eye on the subject. Their epistemic role is limited to discriminating the colors in the finished image, reading tables of numbers the computer used to assign them, and so on.

If fMRI images record observations, it’s hard to say what was observed: neuronal activity, blood oxygen levels, proton precessions, radio signals, or something else. (If anything is observed, the radio signals that interact directly with the equipment would seem to be better candidates than blood oxygen levels or neuronal activity.) Furthermore, it’s hard to reconcile the idea that fMRI images record observations with the traditional empiricist notion that, much as they may be needed to draw conclusions from observational evidence, calculations involving theoretical assumptions and background beliefs must not be allowed (on pain of loss of objectivity) to intrude into the process of data production. The production of fMRI images requires extensive statistical manipulation based on theories about the radio signals, and a variety of factors having to do with their detection, along with beliefs about relations between blood oxygen levels and neuronal activity, sources of systematic error, and so on.

In view of all of this, functional brain imaging differs, e.g., from looking and seeing, photographing, and measuring with a thermometer or a galvanometer in ways that make it uninformative to call it observation at all. And similarly for many other methods scientists use to produce non-perceptual evidence.

Terms like ‘observation’ and ‘observation reports’ don’t occur nearly as much in scientific as in philosophical writings. In their place, working scientists tend to talk about data. Philosophers who adopt this usage are free to think about standard examples of observation as members of a large, diverse, and growing family of data production methods. Instead of trying to decide which methods to classify as observational and which things qualify as observables, philosophers can then concentrate on the epistemic influence of the factors that differentiate members of the family. In particular, they can focus their attention on what questions data produced by a given method can be used to answer, what must be done to use that data fruitfully, and the credibility of the answers they afford (Bogen 2016).

It is of interest that records of perceptual observation are not always epistemically superior to data from experimental equipment. Indeed it is not unusual for investigators to use non-perceptual evidence to evaluate perceptual data and correct for its errors. For example, Rutherford and Pettersson conducted similar experiments to find out if certain elements disintegrated to emit charged particles under radioactive bombardment. To detect emissions, observers watched a scintillation screen for faint flashes produced by particle strikes. Pettersson’s assistants reported seeing flashes from silicon and certain other elements. Rutherford’s did not. Rutherford’s colleague, James Chadwick, visited Pettersson’s laboratory to evaluate his data. Instead of watching the screen and checking Pettersson’s data against what he saw, Chadwick arranged to have Pettersson’s assistants watch the screen while, unbeknownst to them, he manipulated the equipment, alternating normal operating conditions with a condition in which particles, if any, could not hit the screen. Pettersson’s data were discredited by the fact that his assistants reported flashes at close to the same rate in both conditions (Stuewer 1985, 284–288).

Related considerations apply to the distinction between observable and unobservable objects of investigation. Some data are produced to help answer questions about things that do not themselves register on the senses or experimental equipment. Solar neutrino fluxes are a frequently discussed case in point. Neutrinos cannot interact directly with the senses or measuring equipment to produce recordable effects. Fluxes in their emission were studied by trapping the neutrinos and allowing them to interact with chlorine to produce a radioactive argon isotope. Experimentalists could then calculate fluxes in solar neutrino emission from Geiger counter measurements of radiation from the isotope. The epistemic significance of the neutrinos’ unobservability depends upon factors having to do with the reliability of the data the investigators managed to produce, and its validity as a source of information about the fluxes. Its validity will depend, among many other things, on the correctness of the investigators’ ideas about how neutrinos interact with chlorine (Pinch 1985). But there are also unobservables that cannot be detected, and whose features cannot be inferred from data of any kind. These are the only unobservables that are epistemically unavailable. Whether they remain so depends upon whether scientists can figure out how to produce data to study them. The history of particle physics (see e.g., Morrison 2015) and of neuroscience (see e.g., Valenstein 2005) offers relevant cases.

## 4. How observational evidence might be theory laden

Empirically minded philosophers assume that the evidential value of an observation or observational process depends on how sensitive it is to whatever it is used to study. But this in turn depends on the adequacy of any theoretical claims its sensitivity may depend on. For example, we can challenge the use of a thermometer reading, e, to support a description, prediction, or explanation of a patient’s temperature, t, by challenging theoretical claims, C, having to do with whether a reading from a thermometer like this one, applied in the same way under similar conditions, should indicate the patient’s temperature well enough to count in favor of or against t. At least some of the C will be such that regardless of whether an investigator explicitly endorses, or is even aware of them, her use of e would be undermined by their falsity. All observations and uses of observational evidence are theory laden in this sense (cf. Chang 2005, Azzouni 2004). As the example of the thermometer illustrates, analogues of Norwood Hanson’s claim that seeing is a theory laden undertaking apply just as well to equipment generated observations (Hanson 1958, 19). But if all observations and observational processes are theory laden, how can they provide reality based, objective epistemic constraints on scientific reasoning? One thing to say about this is that the theoretical claims the epistemic value of a parcel of observational evidence depends on may be quite correct. If so, even if we don’t know, or have no way to establish their correctness, the evidence may be good enough for the uses to which we put it. But this is cold comfort for investigators who can’t establish it.

The next thing to say is that scientific investigation is an ongoing process during the course of which theoretical claims whose unacceptability would reduce the epistemic value of a parcel of evidence can be challenged and defended in different ways at different times as new considerations and investigative techniques are introduced. We can hope that the acceptability of the evidence can be established relative to one or more stretches of time even though success in dealing with challenges at one time is no guarantee that all future challenges can be satisfactorily dealt with. Thus as long as scientists continue their work there need be no time at which the epistemic value of a parcel of evidence can be established once and for all. This should come as no surprise to anyone who is aware that science is fallible. But it is no grounds for skepticism. It can be perfectly reasonable to trust the evidence available at present even though it is logically possible for epistemic troubles to arise in the future.

Thomas Kuhn (1962), Norwood Hanson (1958), Paul Feyerabend (1959) and others cast suspicion on the objectivity of observational evidence in another way by arguing that one can’t use empirical evidence to test a theory without committing oneself to that very theory. Although some of the examples they use to present their case feature equipment generated evidence, they tend to talk about observation as a perceptual process. Kuhn’s writings contain three different versions of this idea.

K1. Perceptual Theory Loading: Perceptual psychologists Bruner and Postman found that subjects who were briefly shown anomalous playing cards, e.g., a black four of hearts, reported having seen their normal counterparts, e.g., a red four of hearts. It took repeated exposures to get subjects to say the anomalous cards didn’t look right, and eventually, to describe them correctly (Kuhn 1962, 63). Kuhn took such studies to indicate that things don’t look the same to observers with different conceptual resources. (For a more up-to-date discussion of theory and conceptual perceptual loading see Lupyan 2015.) If so, black hearts didn’t look like black hearts until repeated exposures somehow allowed subjects to acquire the concept of a black heart. By analogy, Kuhn supposed, when observers working in conflicting paradigms look at the same thing, their conceptual limitations should keep them from having the same visual experiences (Kuhn 1962, 111, 113–114, 115, 120–1). This would mean, for example, that when Priestley and Lavoisier watched the same experiment, Lavoisier should have seen what accorded with his theory that combustion and respiration are oxidation processes, while Priestley’s visual experiences should have agreed with his theory that burning and respiration are processes of phlogiston release.

K2. Semantical Theory Loading: Kuhn argued that theoretical commitments exert a strong influence on observation descriptions, and what they are understood to mean (Kuhn 1962, 127ff; Longino 1979, 38–42). If so, proponents of a caloric account of heat won’t describe or understand descriptions of observed results of heat experiments in the same way as investigators who think of heat in terms of mean kinetic energy or radiation. They might all use the same words (e.g., ‘temperature’) to report an observation without understanding them in the same way.

K3. Salience: Kuhn claimed that if Galileo and an Aristotelian physicist had watched the same pendulum experiment, they would not have looked at or attended to the same things. The Aristotelian’s paradigm would have required the experimenter to measure

…the weight of the stone, the vertical height to which it had been raised, and the time required for it to achieve rest (Kuhn 1962, 123)

and ignore radius, angular displacement, and time per swing (Kuhn 1962, 124).

These last were salient to Galileo because he treated pendulum swings as constrained circular motions. The Galilean quantities would be of no interest to an Aristotelian who treats the stone as falling under constraint toward the center of the earth (Kuhn 1962, 123). Thus Galileo and the Aristotelian would not have collected the same data. (Absent records of Aristotelian pendulum experiments we can think of this as a thought experiment.)

## 5. Salience and theoretical stance

Taking K1, K2, and K3 in order of plausibility, K3 points to an important fact about scientific practice. Data production (including experimental design and execution) is heavily influenced by investigators’ background assumptions. Sometimes these include theoretical commitments that lead experimentalists to produce non-illuminating or misleading evidence. In other cases they may lead experimentalists to ignore, or even fail to produce useful evidence. For example, in order to obtain data on orgasms in female stumptail macaques, one researcher wired up females to produce radio records of orgasmic muscle contractions, heart rate increases, etc. But as Elisabeth Lloyd reports, “… the researcher … wired up the heart rate of the male macaques as the signal to start recording the female orgasms. When I pointed out that the vast majority of female stumptail orgasms occurred during sex among the females alone, he replied that yes he knew that, but he was only interested in important orgasms” (Lloyd 1993, 142). Although female stumptail orgasms occurring during sex with males are atypical, the experimental design was driven by the assumption that what makes features of female sexuality worth studying is their contribution to reproduction (Lloyd 1993, 139).

Fortunately, such things don’t always happen. When they do, investigators are often able eventually to make corrections, and come to appreciate the significance of data that had not originally been salient to them. Thus paradigms and theoretical commitments actually do influence saliency, but their influence is neither inevitable nor irremediable.

## 6. Semantic theory loading

With regard to semantic theory loading (K2), it’s important to bear in mind that observers don’t always use declarative sentences to report observational and experimental results. They often draw, photograph, make audio recordings, etc. instead, or set up their experimental devices to generate graphs, pictorial images, tables of numbers, and other non-sentential records. Obviously investigators’ conceptual resources and theoretical biases can exert epistemically significant influences on what they record (or set their equipment to record), which details they include or emphasize, and which forms of representation they choose (Daston and Galison 2007, 115–190, 309–361). But disagreements about the epistemic import of a graph, picture or other non-sentential bit of data often turn on causal rather than semantical considerations. Anatomists may have to decide whether a dark spot in a micrograph was caused by a staining artifact or by light reflected from an anatomically significant structure. Physicists may wonder whether a blip in a Geiger counter record reflects the causal influence of the radiation they wanted to monitor, or a surge in ambient radiation. Chemists may worry about the purity of samples used to obtain data. Such questions are not, and are not well represented as, semantic questions to which K2 is relevant. Late 20th century philosophers may have ignored such cases and exaggerated the influence of semantic theory loading because they thought of theory testing in terms of inferential relations between observation and theoretical sentences.

With regard to sentential observation reports, semantic theory loading is less pervasive than one might expect. The interpretation of verbal reports often depends on ideas about causal structure rather than the meanings of signs. Rather than worrying about the meaning of words used to describe their observations, scientists are more likely to wonder whether the observers made up or withheld information, whether one or more details were artifacts of observation conditions, whether the specimens were atypical, and so on.

Kuhnian paradigms are heterogeneous collections of experimental practices, theoretical principles, problems selected for investigation, approaches to their solution, etc. Connections between components are loose enough to allow investigators who disagree profoundly over one or more theoretical claims to agree about how to design, execute, and record the results of their experiments. That is why neuroscientists who disagreed about whether nerve impulses consisted of electrical currents could measure the same electrical quantities, and agree on the linguistic meaning and the accuracy of observation reports including such terms as ‘potential’, ‘resistance’, ‘voltage’ and ‘current’.

## 7. Operationalization and observation reports

The issues this section touches on are distant, linguistic descendants of issues that arose in connection with Locke’s view that mundane and scientific concepts (the empiricists called them ideas) derive their contents from experience (Locke 1700, 104–121, 162–164, 404–408).

Looking at a patient with red spots and a fever, an investigator might report having seen the spots, or measles symptoms, or a patient with measles. Watching an unknown liquid dripping into a litmus solution, an observer might report seeing a change in color, a liquid with a pH of less than 7, or an acid. The appropriateness of a description of a test outcome depends on how the relevant concepts are operationalized. What justifies an observer in reporting a case of measles according to one operationalization might require her to say no more than that she had observed measles symptoms, or just red spots, according to another.

In keeping with Percy Bridgman’s view that

…in general, we mean by a concept nothing more than a set of operations; the concept is synonymous with the corresponding sets of operations. (Bridgman 1927, 5)

one might suppose that operationalizations are definitions or meaning rules such that it is analytically true, e.g., that every liquid that turns litmus red in a properly conducted test is acidic. But it is more faithful to actual scientific practice to think of operationalizations as defeasible rules for the application of a concept such that both the rules and their applications are subject to revision on the basis of new empirical or theoretical developments. So understood, to operationalize is to adopt verbal and related practices for the purpose of enabling scientists to do their work. Operationalizations are thus sensitive and subject to change on the basis of findings that influence their usefulness (Feest 2005).

Definitional or not, investigators in different research traditions may be trained to report their observations in conformity with conflicting operationalizations. Thus instead of training observers to describe what they see in a bubble chamber as a whitish streak or a trail, one might train them to say they see a particle track or even a particle. This may reflect what Kuhn meant by suggesting that some observers might be justified or even required to describe themselves as having seen oxygen, transparent and colorless though it is, or atoms, invisible though they are (Kuhn 1962, 127ff). To the contrary, one might object that what one sees should not be confused with what one is trained to say when one sees it, and therefore that talking about seeing a colorless gas or an invisible particle may be nothing more than a picturesque way of talking about what certain operationalizations entitle observers to say. Strictly speaking, the objection concludes, the term ‘observation report’ should be reserved for descriptions that are neutral with respect to conflicting operationalizations.

If observational data are just those utterances that meet Feyerabend’s decidability and agreeability conditions, the import of semantic theory loading depends upon how quickly, and for which sentences reasonably sophisticated language users who stand in different paradigms can non-inferentially reach the same decisions about what to assert or deny. Some would expect enough agreement to secure the objectivity of observational data. Others would not. Still others would try to supply different standards for objectivity.

## 8. Is perception theory laden?

The example of Pettersson’s and Rutherford’s scintillation screen evidence (above) attests to the fact that observers working in different laboratories sometimes report seeing different things under similar conditions. It’s plausible that their expectations influence their reports. It’s plausible that their expectations are shaped by their training and by their supervisors’ and associates’ theory driven behavior. But as happens in other cases as well, all parties to the dispute agreed to reject Pettersson’s data by appeal to results of mechanical manipulations both laboratories could obtain and interpret in the same way without compromising their theoretical commitments.

Furthermore proponents of incompatible theories often produce impressively similar observational data. Much as they disagreed about the nature of respiration and combustion, Priestley and Lavoisier gave quantitatively similar reports of how long their mice stayed alive and their candles kept burning in closed bell jars. Priestley taught Lavoisier how to obtain what he took to be measurements of the phlogiston content of an unknown gas. A sample of the gas to be tested is run into a graduated tube filled with water and inverted over a water bath. After noting the height of the water remaining in the tube, the observer adds “nitrous air” (we call it nitric oxide) and checks the water level again. Priestley, who thought there was no such thing as oxygen, believed the change in water level indicated how much phlogiston the gas contained. Lavoisier reported observing the same water levels as Priestley even after he abandoned phlogiston theory and became convinced that changes in water level indicated free oxygen content (Conant 1957, 74–109).

The moral of these examples is that although paradigms or theoretical commitments sometimes have an epistemically significant influence on what observers perceive, it can be relatively easy to nullify or correct for their effects.

## 9. How do observational data bear on the acceptability of theoretical claims?

Typical responses to this question maintain that the acceptability of theoretical claims depends upon whether they are true (approximately true, probable, or significantly more probable than their competitors) or whether they “save” observable phenomena. They then try to explain how observational data argue for or against the possession of one or more of these virtues.

Truth. It’s natural to think that computability, range of application, and other things being equal, true theories are better than false ones, good approximations are better than bad ones, and highly probable theoretical claims are better than less probable ones. One way to decide whether a theory or a theoretical claim is true, close to the truth, or acceptably probable is to derive predictions from it and use observational data to evaluate them. Hypothetico-Deductive (HD) confirmation theorists propose that observational evidence argues for the truth of theories whose deductive consequences it verifies, and against those whose consequences it falsifies (Popper 1959, 32–34). But laws and theoretical generalizations seldom if ever entail observational predictions unless they are conjoined with one or more auxiliary hypotheses taken from the theory they belong to. When the prediction turns out to be false, HD has trouble explaining which of the conjuncts is to blame. If a theory entails a true prediction, it will continue to do so in conjunction with arbitrarily selected irrelevant claims. HD has trouble explaining why the prediction doesn’t confirm the irrelevancies along with the theory of interest.
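The two difficulties can be put schematically. The notation is an illustrative assumption, not drawn from the works cited: let $T$ be the theory of interest, $A$ the conjunction of auxiliary hypotheses, $O$ the observational prediction, and $X$ an arbitrary irrelevant claim.

$$
(T \land A) \vdash O, \qquad \neg O \;\Rightarrow\; \neg (T \land A) \;\equiv\; \neg T \lor \neg A
$$

$$
T \vdash O \;\Rightarrow\; (T \land X) \vdash O
$$

The first line shows that a failed prediction refutes only the conjunction, so deductive logic alone does not say whether $T$ or $A$ is at fault. The second shows that whatever $T$ entails is also entailed by $T$ together with $X$, so if entailing a true prediction were enough for confirmation, $O$ would confirm $T \land X$ as well.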

Ignoring details, large and small, bootstrapping confirmation theories hold that an observation report confirms a theoretical generalization if an instance of the generalization follows from the observation report conjoined with auxiliary hypotheses from the theory the generalization belongs to. Observation counts against a theoretical claim if the conjunction entails a counter-instance. Here, as with HD, an observation argues for or against a theoretical claim only on the assumption that the auxiliary hypotheses are true (Glymour 1980, 110–175).

Bayesians hold that the evidential bearing of observational evidence on a theoretical claim is to be understood in terms of likelihood or conditional probability. For example, whether observational evidence argues for a theoretical claim might be thought to depend upon whether it is more probable (and if so how much more probable) than its denial conditional on a description of the evidence together with background beliefs, including theoretical commitments. But by Bayes’ theorem, the conditional probability of the claim of interest will depend in part upon that claim’s prior probability. Once again, one’s use of evidence to evaluate a theory depends in part upon one’s theoretical commitments (Earman 1992, 33–86; Roush 2005, 149–186).
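The dependence on priors can be made concrete with a small calculation. The following is a minimal sketch, not a reconstruction of any cited author's formalism: the function name `posterior`, the two likelihoods, and the two prior values are all hypothetical numbers chosen for illustration.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a hypothesis H and its denial, given evidence E."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence

# Hypothetical likelihoods: E is four times as probable if H is true.
P_E_GIVEN_H = 0.8
P_E_GIVEN_NOT_H = 0.2

# Two investigators who agree on the likelihoods but not on the prior.
sceptic = posterior(0.05, P_E_GIVEN_H, P_E_GIVEN_NOT_H)
adherent = posterior(0.60, P_E_GIVEN_H, P_E_GIVEN_NOT_H)

print(f"sceptic's posterior for H:  {sceptic:.3f}")
print(f"adherent's posterior for H: {adherent:.3f}")
```

With these invented inputs the same evidence leaves H less probable than its denial for the sceptic and more probable than its denial for the adherent, even though both agree on how strongly the evidence favors H.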

Francis Bacon (Bacon 1620, 70) said that allowing one’s commitment to a theory to determine what one takes to be the epistemic bearing of observational evidence on that very theory is, if anything, even worse than ignoring the evidence altogether. HD, Bootstrap, Bayesian, and related accounts of confirmation run the risk of earning Bacon’s disapproval. According to all of them it can be reasonable for adherents of competing theories to disagree about how observational data bear on the same claims. As a matter of historical fact, such disagreements do occur. The moral of this fact depends upon whether and how such disagreements can be resolved. Because some of the components of a theory are logically and more or less probabilistically independent of one another, adherents of competing theories can often find ways to bring themselves into close enough agreement about auxiliary hypotheses or prior probabilities to draw the same conclusions from the evidence.

Saving observable phenomena. Theories are said to save observable phenomena if they satisfactorily predict, describe, or systematize them. How well a theory performs any of these tasks need not depend upon the truth or accuracy of its basic principles. Thus according to Osiander’s preface to Copernicus’ On the Revolutions, a locus classicus, astronomers ‘…cannot in any way attain to true causes’ of the regularities among observable astronomical events, and must content themselves with saving the phenomena in the sense of using

…whatever suppositions enable …[them] to be computed correctly from the principles of geometry for the future as well as the past…(Osiander 1543, XX)

Theorists are to use those assumptions as calculating tools without committing themselves to their truth. In particular, the assumption that the planets rotate around the sun must be evaluated solely in terms of how useful it is in calculating their observable relative positions to a satisfactory approximation.

Pierre Duhem’s Aim and Structure of Physical Theory articulates a related conception. For Duhem a physical theory

…is a system of mathematical propositions, deduced from a small number of principles, which aim to represent as simply, as completely, and as exactly as possible a set of experimental laws. (Duhem 1906, 19)

‘Experimental laws’ are general, mathematical descriptions of observable experimental results. Investigators produce them by performing measuring and other experimental operations and assigning symbols to perceptible results according to pre-established operational definitions (Duhem 1906, 19). For Duhem, the main function of a physical theory is to help us store and retrieve information about observables we would not otherwise be able to keep track of. If that’s what a theory is supposed to accomplish, its main virtue should be intellectual economy. Theorists are to replace reports of individual observations with experimental laws and devise higher level laws (the fewer, the better) from which experimental laws (the more, the better) can be mathematically derived (Duhem 1906, 21ff).

A theory’s experimental laws can be tested for accuracy and comprehensiveness by comparing them to observational data. Let EL be one or more experimental laws that perform acceptably well on such tests. Higher level laws can then be evaluated on the basis of how well they integrate EL into the rest of the theory. Some data that don’t fit integrated experimental laws won’t be interesting enough to worry about. Other data may need to be accommodated by replacing or modifying one or more experimental laws or adding new ones. If the required additions, modifications or replacements deliver experimental laws that are harder to integrate, the data count against the theory. If the required changes are conducive to improved systematization the data count in favor of it. If the required changes make no difference, the data don’t argue for or against the theory.

## 10. Data and phenomena

It is an unwelcome fact for all of these ideas about theory testing that data are typically produced in ways that make it impossible to predict them from the generalizations they are used to test, or to derive instances of those generalizations from data and non ad hoc auxiliary hypotheses. Indeed, it’s unusual for many members of a set of reasonably precise quantitative data to agree with one another, let alone with a quantitative prediction. That is because precise, publicly accessible data typically cannot be produced except through processes whose results reflect the influence of causal factors that are too numerous, too different in kind, and too irregular in behavior for any single theory to account for them. When Bernard Katz recorded electrical activity in nerve fiber preparations, the numerical values of his data were influenced by factors peculiar to the operation of his galvanometers and other pieces of equipment, variations among the positions of the stimulating and recording electrodes that had to be inserted into the nerve, the physiological effects of their insertion, and changes in the condition of the nerve as it deteriorated during the course of the experiment. There were variations in the investigators’ handling of the equipment. Vibrations shook the equipment in response to a variety of irregularly occurring causes ranging from random error sources to the heavy tread of Katz’s teacher, A.V. Hill, walking up and down the stairs outside of the laboratory. That’s a short list. To make matters worse, many of these factors influenced the data as parts of irregularly occurring, transient, and shifting assemblies of causal influences.

With regard to kinds of data that should be of interest to philosophers of physics, consider how many extraneous causes influenced radiation data in solar neutrino detection experiments, or spark chamber photographs produced to detect particle interactions. The effects of systematic and random sources of error are typically such that considerable analysis and interpretation are required to take investigators from data sets to conclusions that can be used to evaluate theoretical claims.

This applies as much to clear cases of perceptual data as to machine produced records. When 19th and early 20th century astronomers looked through telescopes and pushed buttons to record the time at which they saw a moon pass a crosshair, the values of their data points depended, not only upon light reflected from the moon, but also upon features of perceptual processes, reaction times, and other psychological factors that varied non-systematically from time to time and observer to observer. No astronomical theory has the resources to take such things into account. Similar considerations apply to the probabilities of specific data points conditional on theoretical principles, and the probabilities of confirming or disconfirming instances of theoretical claims conditional on the values of specific data points.

Instead of testing theoretical claims by direct comparison to raw data, investigators use data to infer facts about phenomena, i.e., events, regularities, processes, etc. whose instances are uniform and uncomplicated enough to make them susceptible to systematic prediction and explanation (Bogen and Woodward 1988, 317). The fact that lead melts at temperatures at or close to 327.5 °C is an example of a phenomenon, as are widespread regularities among electrical quantities involved in the action potential, the periods and orbital paths of the planets, etc. Theories that cannot be expected to predict or explain such things as individual temperature readings can nevertheless be evaluated on the basis of how useful they are in predicting or explaining phenomena they are used to detect. The same holds for the action potential as opposed to the electrical data from which its features are calculated, and the orbits of the planets in contrast to the data of positional astronomy. It’s reasonable to ask a genetic theory how probable it is (given similar upbringings in similar environments) that the offspring of a schizophrenic parent or parents will develop one or more symptoms the DSM classifies as indicative of schizophrenia. But it would be quite unreasonable to ask it to predict or explain one patient’s numerical score on one trial of a particular diagnostic test, or why a diagnostician wrote a particular entry in her report of an interview with an offspring of a schizophrenic parent (Bogen and Woodward 1988, 319–326).

The fact that theories are better at predicting and explaining facts about or features of phenomena than data isn’t such a bad thing. For many purposes, theories that predict and explain phenomena would be more illuminating, and more useful for practical purposes than theories (if there were any) that predicted or explained members of a data set. Suppose you could choose between a theory that predicted or explained the way in which neurotransmitter release relates to neuronal spiking (e.g., the fact that on average, transmitters are released roughly once for every 10 spikes) and a theory which explained or predicted the numbers displayed on the relevant experimental equipment in one, or a few single cases. For most purposes, the former theory would be preferable to the latter at the very least because it applies to so many more cases. And similarly for theories that predict or explain something about the probability of schizophrenia conditional on some genetic factor or a theory that predicted or explained the probability of faulty diagnoses of schizophrenia conditional on facts about the psychiatrist’s training. For most purposes, these would be preferable to a theory that predicted specific descriptions in a case history.

In view of all of this, together with the fact that a great many theoretical claims can only be tested directly against facts about phenomena, it behooves epistemologists to think about how data are used to answer questions about phenomena. Lacking space for a detailed discussion, the most this entry can do is to mention two main kinds of things investigators do in order to draw conclusions from data. The first is causal analysis carried out with or without the use of statistical techniques. The second is non-causal statistical analysis.

First, investigators must distinguish features of the data that are indicative of facts about the phenomenon of interest from those which can safely be ignored, and those which must be corrected for. Sometimes background knowledge makes this easy. Under normal circumstances investigators know that their thermometers are sensitive to temperature, and their pressure gauges, to pressure. An astronomer or a chemist who knows what spectrographic equipment does, and what she has applied it to, will know what her data indicate. Sometimes it’s less obvious. When Ramón y Cajal looked through his microscope at a thin slice of stained nerve tissue, he had to figure out which, if any, of the fibers he could see at one focal length connected to or extended from things he could see only at another focal length, or in another slice.

Analogous considerations apply to quantitative data. It was easy for Katz to tell when his equipment was responding more to Hill’s footfalls on the stairs than to the electrical quantities it was set up to measure. It can be harder to tell whether an abrupt jump in the amplitude of a high frequency EEG oscillation was due to a feature of the subject’s brain activity or an artifact of extraneous electrical activity in the laboratory or operating room where the measurements were made. The answers to questions about which features of numerical and non-numerical data are indicative of a phenomenon of interest typically depend at least in part on what is known about the causes that conspire to produce the data.

Statistical arguments are often used to deal with questions about the influence of epistemically relevant causal factors. For example, when it is known that similar data can be produced by factors that have nothing to do with the phenomenon of interest, Monte Carlo simulations, regression analyses of sample data, and a variety of other statistical techniques sometimes provide investigators with their best chance of deciding how seriously to take a putatively illuminating feature of their data.
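One way such a Monte Carlo argument can run is sketched below. It is a toy illustration under loudly stated assumptions: the extraneous influences are modeled as independent Gaussian noise, the function names `max_jump` and `noise_only_rate` are hypothetical, and the jump sizes are invented. The idea is to simulate many recordings that contain nothing but noise and ask how often they show a jump at least as large as the one observed.

```python
import random

def max_jump(series):
    """Largest absolute change between successive samples."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))

def noise_only_rate(observed_jump, n_samples, noise_sd, n_trials=2000, seed=0):
    """Fraction of simulated noise-only recordings whose largest jump
    is at least as big as the observed one."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    hits = 0
    for _ in range(n_trials):
        trace = [rng.gauss(0.0, noise_sd) for _ in range(n_samples)]
        if max_jump(trace) >= observed_jump:
            hits += 1
    return hits / n_trials

# Invented numbers: jumps measured in units of the assumed noise SD.
small = noise_only_rate(observed_jump=2.0, n_samples=200, noise_sd=1.0)
large = noise_only_rate(observed_jump=8.0, n_samples=200, noise_sd=1.0)

print(f"noise alone matches a jump of 2.0 in {small:.1%} of simulated runs")
print(f"noise alone matches a jump of 8.0 in {large:.1%} of simulated runs")
```

A feature that noise reproduces in most simulated runs gives little reason to posit anything beyond noise, while one that noise almost never reproduces deserves to be taken more seriously. The verdict is only as good as the assumed noise model, which is itself a piece of causal background knowledge.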

But statistical techniques are also required for purposes other than causal analysis. To calculate the magnitude of a quantity like the melting point of lead from a scatter of numerical data, investigators throw out outliers, calculate the mean and the standard deviation, etc., and establish confidence and significance levels. Regression and other techniques are applied to the results to estimate how far from the mean the magnitude of interest can be expected to fall in the population of interest (e.g., the range of temperatures at which pure samples of lead can be expected to melt).
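The first steps of such a calculation can be sketched as follows. The readings are invented for illustration and are not measurements; the function name `estimate`, the two-standard-deviation outlier rule, and the normal-approximation multiplier 1.96 are assumptions chosen for simplicity, and a real analysis would justify its own rejection rule and interval method.

```python
import statistics

def estimate(readings, z_cut=2.0, z_conf=1.96):
    """Drop readings more than z_cut sample SDs from the raw mean, then
    return (mean, sd, (low, high), kept) for the remaining readings, where
    the interval is mean +/- z_conf * sd / sqrt(n)."""
    raw_mean = statistics.mean(readings)
    raw_sd = statistics.stdev(readings)
    kept = [r for r in readings if abs(r - raw_mean) <= z_cut * raw_sd]
    mean = statistics.mean(kept)
    sd = statistics.stdev(kept)
    half_width = z_conf * sd / len(kept) ** 0.5
    return mean, sd, (mean - half_width, mean + half_width), kept

# Invented melting-point readings in degrees Celsius (illustration only).
readings = [327.1, 327.9, 327.4, 327.6, 327.3, 327.8, 327.5, 331.0]

mean, sd, (low, high), kept = estimate(readings)
print(f"kept {len(kept)} of {len(readings)} readings")
print(f"mean = {mean:.2f}, sd = {sd:.2f}, interval = ({low:.2f}, {high:.2f})")
```

No single reading in the scatter need equal the resulting estimate, which is one way of seeing that the figure describes the phenomenon rather than any individual datum.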

The fact that little can be learned from data without causal, statistical, and related argumentation has interesting consequences for received ideas about how the use of observational evidence distinguishes science from pseudo science, religion, and other non-scientific cognitive endeavors. First, scientists aren’t the only ones who use observational evidence to support their claims; astrologers and medical quacks use it too. To find epistemically significant differences, one must carefully consider what sorts of data they use, where the data come from, and how they are employed. The virtues of scientific as opposed to non-scientific theory evaluations depend not only on their reliance on empirical data, but also on how the data are produced, analyzed and interpreted to draw conclusions against which theories can be evaluated. Secondly, it doesn’t take many examples to refute the notion that adherence to a single, universally applicable “scientific method” differentiates the sciences from the non-sciences. Data are produced and used in far too many different ways to be treated informatively as instances of any single method. Thirdly, it is usually, if not always, impossible for investigators to draw conclusions to test theories against observational data without explicit or implicit reliance on theoretical principles. This means that when counterparts to Kuhnian questions about theory loading and its epistemic significance arise in connection with the analysis and interpretation of observational evidence, such questions must be answered by appeal to details that vary from case to case.

## 11. Conclusion

Grammatical variants of the term ‘observation’ have been applied to impressively different perceptual and non-perceptual processes and to records of the results they produce. Their diversity is a reason to doubt whether general philosophical accounts of observation, observables, and observational data can tell epistemologists as much as local accounts grounded in close studies of specific kinds of cases. Furthermore, scientists continue to find ways to produce data that can’t be called observational without stretching the term to the point of vagueness.

It’s plausible that philosophers who value the kind of rigor, precision, and generality to which logical empiricists and other exact philosophers aspired could do better by examining and developing techniques and results from logic, probability theory, statistics, machine learning, computer modeling, etc. than by trying to construct highly general theories of observation and its role in science. Logic and the rest seem unable to deliver satisfactory, universally applicable accounts of scientific reasoning. But they have illuminating local applications, some of which can be of use to scientists as well as philosophers.

## Bibliography

• Aristotle(a), Generation of Animals in Complete Works of Aristotle (Volume 1), J. Barnes (ed.), Princeton: Princeton University Press, 1995, pp. 774–993.
• Aristotle(b), History of Animals in Complete Works of Aristotle (Volume 1), J. Barnes (ed.), Princeton: Princeton University Press, 1995, pp. 1111–1228.
• Azzouni, J., 2004, “Theory, Observation, and Scientific Realism,” British Journal for the Philosophy of Science, 55(3): 371–392.
• Bacon, Francis, 1620, Novum Organum with other parts of the Great Instauration, P. Urbach and J. Gibson (eds. and trans.), La Salle: Open Court, 1994.
• Bogen, J., 2016, “Empiricism and After,” in P. Humphreys (ed.), Oxford Handbook of Philosophy of Science, Oxford: Oxford University Press, pp. 779–795.
• Bogen, J., and Woodward, J., 1988, “Saving the Phenomena,” Philosophical Review, XCVII (3): 303–352.
• Boyle, R., 1661, The Sceptical Chymist, Montana: Kessinger (reprint of 1661 edition).
• Bridgman, P., 1927, The Logic of Modern Physics, New York: Macmillan.
• Chang, H., 2005, “A Case for Old-fashioned Observability, and a Reconstructive Empiricism,” Philosophy of Science, 72(5): 876–887.
• Collins, H. M., 1985, Changing Order, Chicago: University of Chicago Press.
• Conant, J.B. (ed.), 1957, “The Overthrow of the Phlogiston Theory: The Chemical Revolution of 1775–1789,” in J.B. Conant and L.K. Nash (eds.), Harvard Studies in Experimental Science, Volume I, Cambridge: Harvard University Press, pp. 65–116.
• Duhem, P., 1906, The Aim and Structure of Physical Theory, P. Wiener (tr.), Princeton: Princeton University Press, 1991.
• Earman, J., 1992, Bayes or Bust?, Cambridge: MIT Press.
• Feest, U., 2005, “Operationism in psychology: what the debate is about, what the debate should be about,” Journal of the History of the Behavioral Sciences, 41(2): 131–149.
• Feyerabend, P.K., 1959, “An Attempt at a Realistic Interpretation of Experience,” in P.K. Feyerabend, Realism, Rationalism, and Scientific Method (Philosophical Papers I), Cambridge: Cambridge University Press, 1985, pp. 17–36.
• Feyerabend, P.K., 1969, “Science Without Experience,” in P.K. Feyerabend, Realism, Rationalism, and Scientific Method (Philosophical Papers I), Cambridge: Cambridge University Press, 1985, pp. 132–136.
• Franklin, A., 1986, The Neglect of Experiment, Cambridge: Cambridge University Press.
• Galison, P., 1987, How Experiments End, Chicago: University of Chicago Press.
• Galison, P., 1990, “Aufbau/Bauhaus: logical positivism and architectural modernism,” Critical Inquiry, 16 (4): 709–753.
• Galison, P., and Daston, L., 2007, Objectivity, Brooklyn: Zone Books.
• Glymour, C., 1980, Theory and Evidence, Princeton: Princeton University Press.
• Hacking, I., 1983, Representing and Intervening, Cambridge: Cambridge University Press.
• Hanson, N.R., 1958, Patterns of Discovery, Cambridge: Cambridge University Press.
• Hempel, C.G., 1935, “On the Logical Positivists’ Theory of Truth,” Analysis, 2 (4): 50–59.
• Hempel, C.G., 1952, “Fundamentals of Concept Formation in Empirical Science,” in Foundations of the Unity of Science, Volume 2, O. Neurath, R. Carnap, C. Morris (eds.), Chicago: University of Chicago Press, 1970, pp. 651–746.
• Herschel, J. F. W., 1830, Preliminary Discourse on the Study of Natural Philosophy, New York: Johnson Reprint Corp., 1966.
• Hooke, R., 1705, “The Method of Improving Natural Philosophy,” in R. Waller (ed.), The Posthumous Works of Robert Hooke, London: Frank Cass and Company, 1971.
• Jeffrey, R.C., 1983, The Logic of Decision, Chicago: University of Chicago Press.
• Kuhn, T.S., 1962, The Structure of Scientific Revolutions, Chicago: University of Chicago Press, reprinted 1996.
• Latour, B., and Woolgar, S., 1979, Laboratory Life, The Construction of Scientific Facts, Princeton: Princeton University Press, 1986.
• Lewis, C.I., 1950, Analysis of Knowledge and Valuation, La Salle: Open Court.
• Lloyd, E.A., 1993, “Pre-theoretical Assumptions In Evolutionary Explanations of Female Sexuality,” Philosophical Studies, 69: 139–153.
• Longino, H., 1979, “Evidence and Hypothesis: An Analysis of Evidential Relations,” Philosophy of Science, 46(1): 35–56.
• Lupyan, G., 2015, “Cognitive Penetrability of Perception in the Age of Prediction – Predictive Systems are Penetrable Systems,” Review of Philosophy and Psychology, 6(4): 547–569. doi:10.1007/s13164-015-0253-4
• Morrison, M., 2015, Reconstructing Reality, New York: Oxford University Press.
• Neurath, O., 1913, “The Lost Wanderers of Descartes and the Auxiliary Motive,” in O. Neurath, Philosophical Papers, Dordrecht: D. Reidel, 1983, pp. 1–12.
• Olesko, K.M. and Holmes, F.L., 1994, “Experiment, Quantification and Discovery: Helmholtz’s Early Physiological Researches, 1843–50,” in D. Cahan (ed.), Hermann Helmholtz and the Foundations of Nineteenth Century Science, Berkeley: University of California Press, pp. 50–108.
• Osiander, A., 1543, “To the Reader Concerning the Hypothesis of this Work,” in N. Copernicus On the Revolutions, E. Rosen (tr., ed.), Baltimore: Johns Hopkins University Press, 1978, p. XX.
• Pearl, J., 2000, Causality, Cambridge: Cambridge University Press.
• Pinch, T., 1985, “Towards an Analysis of Scientific Observation: The Externality and Evidential Significance of Observation Reports in Physics,” Social Studies of Science, 15: 3–36.
• Popper, K.R., 1959, The Logic of Scientific Discovery, K.R. Popper (tr.), New York: Basic Books.
• Rheinberger, H. J., 1997, Towards a History of Epistemic Things: Synthesizing Proteins in the Test Tube, Stanford: Stanford University Press.
• Roush, S., 2005, Tracking Truth, Cambridge: Cambridge University Press.
• Schlick, M., 1935, “Facts and Propositions,” in Philosophy and Analysis, M. Macdonald (ed.), New York: Philosophical Library, 1954, pp. 232–236.
• Spirtes, P., Glymour, C., and Scheines, R., 2000, Causation, Prediction, and Search, Cambridge: MIT Press.
• Steuer, R.H., 1985, “Artificial Disintegration and the Cambridge-Vienna Controversy,” in P. Achinstein and O. Hannaway (eds.), Observation, Experiment, and Hypothesis in Modern Physical Science, Cambridge: MIT Press, pp. 239–307.
• Suppe, F., 1977, in F. Suppe (ed.), The Structure of Scientific Theories, Urbana: University of Illinois Press.
• Valenstein, E.S., 2005, The War of the Soups and the Sparks, New York: Columbia University Press.
• Van Fraassen, B.C, 1980, The Scientific Image, Oxford: Clarendon Press.
• Whewell, W., 1858, Novum Organon Renovatum, Book II, in William Whewell Theory of Scientific Method, R.E. Butts (ed.), Indianapolis: Hackett Publishing Company, 1989, pp. 103–249.