Two

On the road to Planck 1900

“We consider, however – this is the most essential point of the whole calculation – [the energy] E to be composed of a very definite number of equal parts and use thereto the constant of nature h = 6.55 × 10⁻²⁷ erg-sec. This constant multiplied by the frequency ν … gives us the energy element, ε”.

Max Planck, lecture of December 14, 19001

Introduction

At least by 1850, nearly all physicists had abandoned the theory of Lavoisier, Laplace, and Gay-Lussac that heat was a special type of substance, caloric, an imponderable fluid passing from one body to another. In its place came a revival of the ancient atomist idea that heat is attributable to the motions of very small bodies. Though the doctrine that all matter is composed of tiny indivisible material (“ponderable”) particles remained controversial, in the late 1850s the concept of the kinetic atom gained acceptance. These invisible “spheres of action” were regarded in various ways: as mental fictions or heuristic devices, as centers of repulsive forces, or as physically real tiny indivisible particles. Possessing minimal material properties of size and motion, the kinetic atom became the central concept of the kinetic theory of heat, the theory that a body’s temperature is a measure of the kinetic energy, or energy of motion, of its hypothetical microconstituents. Emil Wiechert’s and J.J. Thomson’s 1894–1897 discovery of electrons, “atoms of charge”, led by the turn of the century to the first models of an electrically neutral atom, a composite structure whose stability was explained by the attraction of oppositely charged constituent particles. Maxwell’s electromagnetic theory, spectacularly confirmed by Hertz’s wireless experiments in 1887, implied that accelerated charged bodies emit electromagnetic radiation. Lying at the intersection of the kinetic theory of heat and Maxwell’s theory are the phenomena of thermal radiation, electromagnetic radiation generated by the vibrational motions of charged particles in bulk matter. Investigations of thermal radiation in the last decades of the 19th century led to the quantum theory.

On the joint basis of Maxwell’s theory and the kinetic theory, in the autumn of 1900 Max Planck attempted to derive the recently established empirical frequency distribution law of thermal radiation for a so-called blackbody. In so doing, he found he was compelled to introduce the notion of a discrete minimal unit of energy. It is customary to say that Planck thereby “discovered” the quantum. Planck later modestly described his brainchild as “an act of desperation”, a mathematical stratagem needed to obtain what he knew to be the correct answer. He proved to be a highly conservative revolutionary – so much so that Thomas Kuhn’s 1978 study of Planck’s reluctant embrace of the physical reality of energy quanta credits Einstein, not Planck, with the birth of the quantum theory. Kuhn argued that the quantum revolution really began with Einstein’s paper of March 1906, which pointed out that Planck’s derivation, based on classical theory yet incorporating the above stratagem, was strictly speaking inconsistent. Nonetheless, Einstein observed, the derivation succeeded through implicit use of the non-classical light-quantum hypothesis he himself had introduced in 1905. While historians of physics can disagree about the respective merits of Planck or Einstein as father of the quantum revolution (in no small measure due to genuine obscurity concerning what Planck actually believed), there is general agreement that Einstein played a unique and revolutionary role in advancing and extending the quantum hypothesis, first in the theory of radiation and then in the theory of matter.

To appreciate Einstein’s role in creating the “old quantum theory”, taken up in the next chapter, it will be useful first to sketch the situation of late-19th-century physical theory that Einstein inherited. This chapter briefly surveys the rise of thermodynamics, the kinetic theory, its application to gases and statistical generalization by Boltzmann, and the developments leading up to Planck’s radiation law. The next chapter concerns how Einstein formulated and then wielded what he termed “Boltzmann’s principle” in demonstrating that the quantum hypothesis was here to stay. The story here begins with the rise of thermodynamics in the 19th century and its subsequent statistical foundation by Boltzmann.

Historical overview of the concept of energy

The concept of energy has a long history; precursors appear in Greek antiquity and again in the Middle Ages. But it was Leibniz who foreshadowed the modern concept of energy in 1686 with his notion of “living force” (vis viva). In the process he gave rise to a century-long dispute about whether vis viva (mv²) or linear momentum (mv) is the appropriate conserved quantity in mechanics.2 The concept of energy is present in all but name in the 18th-century dynamics of d’Alembert and Lagrange, although for them it was not a generally conserved quantity. The general doctrine of energy conservation then emerged in the early 19th century from a combination of factors, among the most important being French engineers’ interest in the theory of machines, the idea that all physical phenomena might be reduced to conservative mechanical processes, and the discovery of the interconvertibility of various forms of energy.

The advent of steam engines in the Industrial Revolution brought a generalization of the energy concept from mechanics to the phenomena of heat and the definition of energy as “the capacity of a physical system to do work”. At the same time heat came to be regarded as a form of energy convertible into work. In a notable 1959 paper on simultaneous discovery in science, Thomas Kuhn pointed to the principle of energy conservation as a prime example.3 Between 1842 and 1847, a group of researchers, each working largely in ignorance of the others, proposed that within an isolated system different forms of energy might be transformed from one kind to another, but that total energy is conserved by these processes. Synthesizing these results in 1847, Hermann von Helmholtz formulated a law of conservation of force (“Kraft”), a term not yet possessing a univocal meaning though soon to be understood as “energy”. Shortly afterwards, in 1851, William Thomson, the future Lord Kelvin, coined the term “thermo-dynamics” for the new science of the effects of work, heat, and energy on a physical system.

The idea of transformational invariance appears central to establishing the concept of energy; that heat is a form of energy and that the internal energy of a system cannot be created or destroyed but only transformed from one form to another led to the first law of thermodynamics. On the other hand, Ernst Mach, in an influential 1872 monograph on the history of the principle of conservation of energy,4 argued that the actual root of the energy concept lay, as Helmholtz had maintained, in the historically attested record of failures to produce a perpetuum mobile, a machine that, once started, would continue to run indefinitely without any external input. While this apparently restricts the concept of energy to mechanics, Mach insisted that the prohibition of perpetual motion arose in the course of human experience long antedating mechanics and hence is logically distinct from it. Einstein followed Mach in similarly regarding the impossibility of perpetual motion as a “principle”, the indisputable empirical fact underlying thermodynamics.5

Kinetic theory

Thermodynamics, a science of the macroscopic properties of matter, developed rapidly in the first half of the 19th century. Its results are largely independent of hypotheses about underlying dynamics and microstructural composition, among which the most influential was physical atomism, a speculative hypothesis pertaining to invisible, indivisible corpuscles.6 But in the early 19th century atomism of another type, the “chemical atomism” (1808–1810) of John Dalton (1766–1844), was quickly accepted as it proved successful in explaining the respective weights of chemical elements in compounds. If chemistry suggested the existence of atoms of the chemical elements, the experimental data represented in the gas laws due to Boyle (1627–1691) and Mariotte (1620–1684), showing that the pressure of air is proportional to its density, suggested that the elasticity (“spring”) of air is due to the motions of tiny bodies, although Boyle himself was critical of Greek atomism and Mariotte disliked atomism altogether. Molecules of air were assumed in the prototype kinetic theory of Daniel Bernoulli (1700–1782) in 1738. Bernoulli correctly understood that air pressure within a closed cylinder supporting a weighted piston is due to the repeated impacts of vast numbers of rapidly moving “very minute corpuscles”. However, despite Bernoulli’s derivation of Boyle’s law relating the pressure and volume of a gas, there were competing theories of gas pressure and, for a considerable time, no agreement on the nature of heat and temperature; only in 1854 did Kelvin propose a generally accepted absolute temperature scale.7 The kinetic-molecular theory remained incomplete, and for almost a century after Bernoulli further progress in developing the concept of heat as molecular motion was delayed.

Bonn physicist Rudolf Clausius (1822–1888) formulated the two initial laws of thermodynamics. The first is the principle of conservation of energy, which Clausius applied to heat and thermodynamic processes. Expressed in the form of a prohibition, the first law of thermodynamics states the impossibility of a machine that can perform work without a corresponding input of energy, a so-called perpetuum mobile of the first kind. Clausius’s initial version of the second law affirmed that heat cannot pass from a colder to a warmer body without compensation in the environment, as for example the work supplied by a refrigerator’s compressor. Another formulation, deriving from Sadi Carnot’s 1824 theory of the limits of efficiency of heat engines, is that there is a fixed upper limit to the amount of work obtainable from a given amount of heat. This effectively states the impossibility of a perpetuum mobile of the second kind; namely, the impossibility of building a machine that produces work in a cycle of operation by borrowing heat from a single source (in Kelvin’s formulation). Clausius subsequently gave the second law quantitative expression, in the process introducing the notion of entropy, the most characteristically thermodynamic concept, a more abstract notion lying further from sense experience than volume or pressure or temperature. Designating entropy by the letter S, Clausius stated the second law quantitatively as an inequality: the incremental change in entropy of a closed system cannot be negative.8

The development of the concept of energy in the 1830s and 1840s and the new thermodynamics around 1850 set the stage for the kinetic theory of gases by the end of the 1850s. In the famous characterization of Clausius, molecular motion is “the kind of motion that we call heat”. Between 1855 and 1865, Clausius and Maxwell were able to derive the law of perfect gases by assuming molecules fly freely about within a containing vessel except for occasional elastic collisions among themselves and with the walls of the container. Temperature was thereby interpreted as a measure of the average or mean kinetic energy of the particles of the gas. In this way the kinetic theory explained the established empirical regularities between the macroscopic thermodynamic quantities in terms of the random motions of molecules.

The next step was to similarly account in kinetic terms for the two laws of thermodynamics, both regarded as exceptionless absolute laws by Clausius. Kinetically, entropy is a measure of the state of molecular disorder in a macroscopic system: e.g., the molecules of a gas within a container may be mostly bunched in one corner (low entropy) or distributed more or less uniformly throughout the container’s volume (high entropy). But the statement that entropy never decreases posed a difficult philosophical problem for the kinetic theory, according to which matter is composed of atoms in continual motion and subject to the laws of Newtonian mechanics. The basic equations of mechanics are both exceptionless and time-reversible; they do not change their form in the least if the sign of the time variable is reversed (making the substitution t → −t). As Newton’s second law,

F = ma = m d²x/dt²

contains only the second derivative with respect to the time variable, any solution of the Newtonian equations can be transformed into another solution with the time variable reversed and the system represented as going “backwards in time” (as in heat flowing from a cold body to a hot one). How then can time-reversible laws governing the motions of the particles of a gas be reconciled with the apparently irreversible macroscopic increase of entropy posited by the second law? Could any reconciliation be made? Initially, only the far-sighted Maxwell recognized the fundamentally statistical aspect of the second law;9 in a letter to Peter Guthrie Tait of 1867, Maxwell explained how the second law could hypothetically be violated by a “demon”, a being with finite attributes that, without expending any work at all, could selectively segregate the “sufficiently fast” molecules of a gas from all others, so that a hot region of the gas grew hotter and the cold region colder, in violation of the second law. A less fanciful illustration of Maxwell’s understanding was given in an 1871 letter to John Strutt, the future Lord Rayleigh, in which Maxwell remarked that the second law had “the same degree of truth” as the statement that a glass full of water thrown into the sea cannot be recovered. However, it was the Austrian Ludwig Boltzmann (1844–1906) who formulated the statistical explanation of the second law that remains standard today. Boltzmann’s interpretation of the second law will be essential to Einstein’s contributions to the early quantum theory.

Enter Boltzmann

Clausius, then Maxwell, introduced statistics into the study of the properties of a multi-particle system, in particular, a dilute ideal gas at thermal equilibrium in a container. From statistical arguments, Clausius deduced the average distance, the “mean free path”, that a gas molecule travels without collision. To do this, he assumed for simplicity that the molecules of the gas moved in random directions but with the same average velocity.

In his first paper on kinetic theory in 1860, Maxwell introduced the idea that molecular velocities are distributed according to a well-defined law, today known as the Maxwell-Boltzmann distribution. Maxwell treated velocity as a variable whose values are distributed randomly, attaching probability not to “mean free path” but to every molecular state of motion. The probability that a molecule of the gas possessed a definite kinetic property, e.g., a velocity between v and v + dv or energy between E and E + dE, is then a relative probability, expressed as the ratio of the average number of particles of the gas with this property at a given time to the total number of particles. This information is summarized by a distribution function, indicating the velocity spread of the molecules of a gas and their most probable speeds at various equilibrium temperatures. The distribution is identical in form with the “normal distribution” bell curve of the Laplacian theory of errors and is known in statistics as a Gaussian. It is the basis of the kinetic theory and the first statistical law in physics.

Strictly speaking, Maxwell’s distribution applies to an ideal gas of fictional particles moving freely within a closed container, without interactions other than brief collisions that elastically exchange energy and momentum. More generally, the Maxwell distribution function is applicable wherever a system of classical particles has established thermal equilibrium at absolute temperature T as the result of collisions among themselves and with the walls of the container. Boltzmann subsequently sought to understand how the gas approached this equilibrium distribution, and in 1872 used kinetic theory together with Maxwell’s implicit assumption that each pair of colliding molecules is uncorrelated prior to collision (the so-called “Stoßzahlansatz”) to prove that the Maxwellian velocity distribution is the unique stationary (equilibrium) distribution to which the molecules of a more realistic model gas (polyatomic molecules, subject to external force fields like gravity) would tend over time due to the laws of mechanics. As the Maxwell-Boltzmann distribution function occupies an important place in Einstein’s contributions to statistical physics, note that it contains the exponential function exp(−E/kT), where E is the energy of a molecule, T the temperature, and k a bridging constant between microphysics (molecular mechanics) and macrophysics (thermodynamics) now named for Boltzmann.10 A key contribution of Einstein to the emerging quantum theory will be to show the physical significance of the new constant.
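For readers who want to see the distribution concretely, here is a minimal numerical sketch, not from Maxwell or Boltzmann, using modern constant values assumed only for illustration: it evaluates the Maxwell speed distribution for nitrogen molecules at room temperature and locates the most probable speed both by a direct scan and by the closed-form expression sqrt(2kT/m).

```python
import math

# Minimal sketch (not from the text): evaluate the Maxwell speed distribution
#   f(v) = 4*pi*(m/(2*pi*k*T))**1.5 * v**2 * exp(-m*v**2/(2*k*T))
# for nitrogen (N2) at room temperature, using modern constant values.

k = 1.380649e-23          # Boltzmann's constant, J/K (modern value)
m = 28.014 * 1.66054e-27  # mass of one N2 molecule, kg (assumed for illustration)
T = 300.0                 # absolute temperature, K

def maxwell_speed_density(v, m, T):
    """Probability density of molecular speed v at temperature T."""
    a = m / (2.0 * k * T)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v ** 2 * math.exp(-a * v ** 2)

# Scan speeds to find the most probable speed; analytically it is sqrt(2kT/m).
speeds = [float(i) for i in range(1, 2001)]            # 1 ... 2000 m/s
v_mode = max(speeds, key=lambda v: maxwell_speed_density(v, m, T))
print(f"most probable speed (scan): {v_mode:.0f} m/s")
print(f"most probable speed (formula sqrt(2kT/m)): {math.sqrt(2 * k * T / m):.0f} m/s")
```

Both estimates agree at roughly 420 m/s, a reminder of how fast the “very minute corpuscles” of the kinetic theory were taken to move even at ordinary temperatures.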

From the year he received his PhD (1866), Boltzmann set out to interpret the second law in the context of the kinetic theory. In 1872 he argued that the Newtonian laws of motion applied to the particles of a dilute gas, together with the Stoßzahlansatz, suffice to show that in finite time every non-equilibrium distribution of molecules will approach the Maxwell-Boltzmann equilibrium distribution where entropy is a maximum. To do this, Boltzmann constructed a mechanical function H of the velocity distribution, to be interpreted as the negative of the thermodynamic entropy function S. Boltzmann’s “H theorem” is then the inequality dH/dt ≤ 0. It is allegedly a demonstration that a gas beginning in any non-equilibrium velocity distribution monotonically moves closer and closer to equilibrium and the state of highest entropy: an irreversible process obtained from the time-reversible laws of mechanics. Objections to the H theorem from Boltzmann’s former teacher and colleague Josef Loschmidt (1821–1895) and from the British kinetic theorists over the course of the ensuing decade pushed Boltzmann to increasingly appreciate the important role that statistical concepts play in the theory of gases, as Maxwell had emphasized.

The turning point came in 1876 when Loschmidt pointed to an insurmountable difficulty: from fully specified initial conditions, an increase of entropy could be derived according to Newton’s laws (applicable to the motions and collisions of molecules of a gas), but equally a decrease of entropy could be derived by those same laws from the final state by simply reversing each molecule’s velocity. An irreversible approach to equilibrium and associated monotonic increase of entropy cannot be derived from the time-reversible laws of mechanics without introducing additional probabilistic considerations. Boltzmann subsequently sought to show how macroscopic irreversibility could be understood statistically in a gas comprised of an enormous number of molecules, even though the motions of individual molecules follow the time-reversible laws of mechanics. He interpreted the empirical fact that thermodynamic systems evolve towards thermal equilibrium as the overwhelming tendency of such systems to evolve from less probable low entropy states towards the most probable state, the state of maximum entropy, i.e., thermal equilibrium. As applied to the universe as a whole, this account required the assumption of a very improbable universal initial condition, a problem that still remains in cosmology. For particular isolated systems, however, Boltzmann’s understanding of the second law became a cornerstone of statistical mechanics.

Statistical mechanics

In everything but name, statistical mechanics began in the late 1860s with a series of memoirs in which Boltzmann introduced various relations between entropy and probability. The most famous of these is from 1877, entitled “On the Relation Between the Second Law of Thermodynamics and Probability Calculus”. Published, as were many of Boltzmann’s papers, as a memoir in the Proceedings of the Viennese Academy of Sciences, it radically departs from Boltzmann’s earlier kinetic approach; it opens with the statement that the probability of a molecular distribution could be determined in a way “completely independent of whether or how that distribution came about”. Without recourse to kinetic hypotheses or assumptions about molecular collisions, the new method derived a probability distribution for the different thermodynamic states of a gas by simply counting the distinct number of ways each such macrostate might be realized by the particles of the gas. Each distinct way is called a “complexion”, and a new complexion (a microstate specifying at a given instant the positions and velocities of each of the N particles of the gas) arises by as simple a process as merely permuting the speeds of two of the gas’s N molecules (e.g., give B’s velocity to particle A, and vice versa), leaving the overall thermodynamic state unchanged. The paper marked a considerable departure from Maxwell’s understanding of the use of probability considerations in physics. Whereas Maxwell justified the introduction of probability methods into physics by the human (i.e., non-demonic) impossibility of specifying the exact state of each molecule of a gas, Boltzmann essentially regarded such detailed information as unnecessary for the purpose of providing a kinetic foundation for thermodynamics. What mattered in accounting for the particular values of thermodynamic variables was simply the sheer number of distinct “complexions” associated with a given value of a thermodynamic variable: that vast number is directly proportional to the probability that the system is in that thermodynamic state.

Although later in his 1877 paper Boltzmann introduced more realistic systems (again, polyatomic molecules possessing more degrees of freedom and subject to external forces), he used a simple model, an “unrealizable fiction”, to bring out the fundamental idea. Consider a system consisting of a large but finite number N of identical and distinguishable “rigid absolutely elastic spherical molecules trapped in a container with perfectly elastic walls”. Taking into account only the total energy E of this “gas” of fictional molecules isolated in a container, the next step was to distribute or partition E, a thermodynamic observable, over the N molecules. Energy is classically a continuous variable, so this meant distributing a continuous quantity over an enormous yet finite number of objects. Boltzmann accordingly stipulated that the kinetic energy of each particle had to be one of the discrete energy values 0, ε, 2ε, 3ε, …. The total energy E of the gas is then an integral multiple of ε distributed over the N particles by an energy-partition (Energievertheilung), i.e., numbers w0, w1, …, wr, such that w0 particles have energy 0, w1 particles have energy ε, w2 particles have energy 2ε, and so on, where Σ wi = N. An energy-partition is thus a particular state description, a distribution of molecules according to the coarse scale of discrete energies. The energy element ε was given no physical meaning but merely a computational one (Boltzmann chose it small enough that the end result would not depend on its value) in determining the number of complexions corresponding to the given state description. Max Planck will borrow this method of discretizing a continuous variable in 1900 when using Boltzmann’s statistical method to derive the radiation law that bears his name.

A given thermodynamic state (that is, for fixed E) then corresponds to a vast number of distinct complexions. Since two distinct complexions arise merely by permuting molecules, Boltzmann’s method of counting complexions implicitly assumes that the molecules, though identical in all respects, nonetheless can be distinguished from one another. It may well be asked – and this will prove crucial later – how particles can be identical yet distinct from one another. Suppose one goes into a room where there is only a table on which sit two boxes, each containing a small black ball. The balls and boxes are identical in every respect but can be given different names since their placement allows them to be perceptually distinguished: A is in the box to the observer’s left, B in the box to the right. Suppose one leaves the room and returns at a later time. Again two boxes, each containing a black ball. But does one really know that A remains in the box on the left, or might it be B? Unless one has supplementary information, say, by continuously monitoring the interlude with a CCTV camera, the latter has to be considered as not impossible. Hence two balls in two boxes at two different times give rise to two situations, identical yet distinct. It is not one situation encountered twice. The assumption of identical but distinguishable particles is a key implicit premise of Boltzmann’s 1877 statistical method. Later, in his 1897 lectures on the mechanics underlying the theory of gases, Boltzmann made this assumption explicit, calling it the “first assumption of mechanics”, a consequence of “the law of continuity of motion”: a mass point is identifiable as the same point at two different places at two distinct times by the continuous trajectory connecting them. As we shall see, according to standard quantum mechanics, quantum particles do not have determinate trajectories, and in 1924 Einstein will show that the statistics of one species of quantum particles (bosons) accordingly require a different method of counting, one corresponding to the fact that bosons are identical but indistinguishable particles.

On the assumption of identical and distinguishable particles, a huge yet definite number of distinct complexions corresponds to each thermodynamic state of the gas. The total number P of complexions associated with a given state description is computed using combinatorics. As P is also the number of permutations of the N molecules associated with that description, Boltzmann termed P the “permutability”, computing it by the factorial formula,

P = N!/(w0! w1! w2! …)

On the assumption that each distinct complexion is equally probable, Boltzmann set the probability W (Wahrscheinlichkeit) of finding the gas in a given thermodynamic state as proportional to P. The most probable distribution for fixed E and fixed N will be those values w0, w1, w2, … for which P is a maximum. Probability here means relative probability, the ratio of the number of distinct possible complexions corresponding to a definite state description to the vast total number of all complexions possible within the energy constraints on the system. As the numbers involved are so enormous, Boltzmann could use a well-known mathematical approximation (Stirling’s formula) for the factorials, allowing the most probable distribution to be represented by an exponential function.11
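To make Boltzmann’s counting concrete, the following small sketch (a toy calculation, not Boltzmann’s own, with molecule and energy numbers chosen only for illustration) enumerates every energy-partition of a “gas” of N = 5 molecules sharing a total energy of 5ε, computes the permutability N!/(w0! w1! …) of each, and reports a partition of maximal permutability together with its relative probability.

```python
from math import factorial
from itertools import product

# Toy illustration (not Boltzmann's own computation): N identical but
# distinguishable molecules share a total energy of P_TOTAL energy elements
# epsilon.  A partition (w0, w1, w2, ...) records how many molecules carry
# 0, 1*epsilon, 2*epsilon, ...; its permutability is N!/(w0! w1! w2! ...).

N = 5          # number of molecules (kept tiny so everything can be enumerated)
P_TOTAL = 5    # total energy of the "gas", in units of epsilon

def permutability(ws):
    """Number of complexions (microstates) realizing the partition ws."""
    count = factorial(N)
    for w in ws:
        count //= factorial(w)
    return count

partitions = []
# w_i = number of molecules with energy i*epsilon, for i = 0 .. P_TOTAL
for ws in product(range(N + 1), repeat=P_TOTAL + 1):
    if sum(ws) == N and sum(i * w for i, w in enumerate(ws)) == P_TOTAL:
        partitions.append((permutability(ws), ws))

total_complexions = sum(p for p, _ in partitions)
best_p, best_ws = max(partitions)
print(f"number of distinct energy-partitions: {len(partitions)}")
print(f"total number of complexions: {total_complexions}")
print(f"a most probable partition: {best_ws} with {best_p} complexions "
      f"(relative probability {best_p / total_complexions:.3f})")
```

Even in this miniature case the partitions of maximal permutability spread the energy over the molecules rather than concentrating it; with realistic molecule numbers that preponderance becomes overwhelming, which is the whole point of Boltzmann’s method.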

The relationship between the entropy S of a system and its probability (W) is then as follows. Entropy is an additive quantity: the entropy S of two separate systems A and B that are brought into contact is the sum of their entropies, SAB = SA + SB. But probability is not additive: two systems that have not yet interacted must be considered independent, and therefore the probability W of occurrence of the joint AB system is the product of the probabilities of the two systems separately, WAB = WA WB. The mismatch requires the proportionality between entropy and probability to be logarithmic.12 First written down in this form by Planck, who named the constant k of proportionality Boltzmann’s constant, the statistical form of the second law ornaments Boltzmann’s tombstone in Vienna,

S = k log W.

The expression essentially states that a system with relatively fewer equivalent ways to arrange its components has a relatively smaller probability W (and so less entropy); conversely, one with relatively many equivalent ways of doing so (larger W) has relatively more entropy.
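The reason the proportionality must be logarithmic can be stated compactly; the lines below merely restate in symbols the argument just given (a standard observation, included only as an illustration). Writing S = f(W) for some function f,

```latex
S_{AB} = S_A + S_B , \qquad W_{AB} = W_A\, W_B
\quad \Longrightarrow \quad f(W_A W_B) = f(W_A) + f(W_B).
```

The only continuous, monotonically increasing solutions of this functional equation are f(W) = k log W for some constant k, which is precisely the relation on Boltzmann’s tombstone.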

In defining the entropy of a thermodynamic state by the measure of its probability, Boltzmann extended the second law beyond the domain of pure thermodynamics into statistical mechanics. Boltzmann’s “statistical mechanics” (a term coined by Yale physicist Willard Gibbs in 1902) thereby created the modern understanding that the second law is not an absolute law but is merely statistical: for a given time lapse, entropy increase in closed systems occurs in the vast majority of initial conditions, but not all of them. The use of the logarithm of P rather than P had another significant consequence. It enabled Boltzmann to show that in the simple case of a gas in equilibrium satisfying the Maxwell distribution, the entropy of the gas is essentially that calculated directly using only thermodynamic assumptions. That the Maxwell-Boltzmann velocity distribution for the equilibrium state could be derived from this new point of view showed not only that it is the unique stationary distribution, but that it is also the most probable.13 Still, in the first decade of the 20th century, the statistical meaning Boltzmann accorded the second law was highly controversial. Planck, then a rising young theorist who had already built an international reputation on the basis of his lectures on thermodynamics, would not accept Boltzmann’s statistical interpretation until 1914.

Blackbody radiation

According to Maxwell’s dynamical theory of the electromagnetic field (1865), electromagnetic radiation is the emission of energy in the form of waves; up until the special theory of relativity in 1905, it was widely believed that these waves propagated through space in a material medium known since early in the 19th century as the luminiferous ether, a medium that the wave theory of light regarded necessary for the propagation of light (see Chapter 5). Many physicists on the continent, however, were skeptical of a theory that predicted effects of electromagnetic waves years before they were demonstrated in experiment. Only after Heinrich Hertz’s experiments at Karlsruhe beginning in 1885 did Maxwell’s theory gain widespread acceptance. Hertz demonstrated propagation of electric waves through space by fabricating an electrical oscillator and analyzing the surrounding field by means of small resonant circuits. In his original device, Hertz measured the wavelength (from crest to crest) as 66 cm; these are radio waves or what is now called the microwave spectrum. It was the first detection of electromagnetic waves with properties of light: frequency, refrangibility, interference, polarization, etc. Further experiments in England, Italy, and India were successful in generating waves of shorter wavelengths. By the end of the 19th century, the known spectrum of electromagnetic radiation had greatly expanded, including wavelengths of invisible infrared and ultraviolet rays adjacent to visible light but also much longer radio waves and much shorter X-rays. By 1900, the theory of radiation was fairly well encompassed by Maxwell’s theory, though doubts remained about the nature of X-rays.

As noted above, quantum theory originated through the investigation of thermal radiation, radiation emitted by a hot object. When heated, a body’s constituent particles vibrate more rapidly; if electrically charged, they emit radiation, electromagnetic waves carrying energy away from the body at a rate that increases with its temperature: the hotter the object, the brighter it appears. A heated iron bar, for example, will initially glow with a dull reddish light; as it becomes hotter, it will shine more and more brightly, emitting yellowish and eventually bluish-white light, spanning the visible part of the electromagnetic spectrum. All matter above absolute zero on the Kelvin scale (−273.15 °C) continually emits and absorbs thermal radiation. Beginning around 1860 the new science of thermodynamics was employed to investigate how the character of emitted thermal radiation depends on a body’s temperature, i.e., on the motions of the charged particles of the body’s constituent atoms and molecules.

Two initial theoretical results are important for what follows, the first by Gustav Kirchhoff and the second by Boltzmann. In Heidelberg, Gustav Kirchhoff (1824–1887) and his colleague Robert Bunsen (1811–1899) pioneered the spectroscopic method of chemical analysis. By refracting visible light emitted by a glowing object into its spectrum, the chemical composition of the object could be inferred. In 1859, Kirchhoff discovered the presence of sodium in the sun and within a few years had shown that the solar spectrum reveals the possible presence of at least nine more chemical elements. Those who, like Charles S. Peirce, scorn the erection of “barriers to inquiry” can take comfort in the fact that Kirchhoff’s results belied the 1835 prediction of positivist philosopher Auguste Comte that the chemical composition of the stars could never be known to science.

More significantly for what follows, in 1860 Kirchhoff stated a remarkable theorem despite lacking adequate experimental evidence to back it up. Now known as Kirchhoff’s law of thermal radiation, it states that for any body, the ratio of its emissive power (Eν) to its dimensionless coefficient of absorption (Aν) is given by a function J of the frequency ν of the emitted or absorbed radiation and the body’s temperature,

Eν / Aν = J(ν, T).

Such a function is “universal” as it is independent of every other property of the body including shape or composition. A corollary holds that at constant temperature (i.e., in thermal equilibrium with its surroundings) a body will emit and absorb radiation at the same rate. Kirchhoff furthermore conceived of a theoretically useful fiction, a so-called blackbody that could in principle with perfect efficiency emit and absorb radiation of all possible wavelengths. A body of this kind is ideal for studying thermal radiation, the energy emerging from matter itself, as it absorbs all light of any wavelength incident upon it yet at thermal equilibrium emits radiation whose spectrum, or distribution of frequencies, is characteristic only of its temperature (in this case Aν = 1).

By the end of the 19th century physicists believed that thermal radiation was entirely electromagnetic in character. For this reason any putatively successful theoretical account of blackbody radiation would be fundamental, as it would draw upon two hitherto distinct domains of physics, thermal physics (thermodynamics and the kinetic theory of gases) and Maxwell’s theory of electromagnetism.

A perfect blackbody does not exist in nature (although the spectrum of the cosmic microwave background (CMB), discovered by Penzias and Wilson in 1964, is very close to that of a perfect blackbody). Still, Kirchhoff proposed that radiation emitted from a blackbody at a fixed temperature could be closely approximated by equilibrium thermal radiation within a cavity whose walls were kept at uniform temperature. The emitted radiation could be studied as it exited through a small hole in the cavity. Because blackbody radiation emitted by any substance has the same frequency signature at a given temperature, it is distinct from the line spectrum of the discrete band of frequencies characteristic of particular substances studied in spectroscopy. It is ideally simple – homogeneous, isotropic and unpolarized – and it can be characterized by the function ρ(ν, T) giving the spectral energy density of the radiation, i.e., the energy per unit volume and per unit frequency interval (between ν and ν + dν) of blackbody radiation at temperature T. Determining the graph of this function led to the first quantum concepts.14

Conceived as a theoretical ideal in the 1860s, the blackbody was closely realized by the 1890s, when experimenters in Berlin devised objects whose emitted radiation approximated the theoretical blackbody spectrum to within a few percent. This required, among other challenges, the development of radiation detectors of great sensitivity and of ways of extending measurements over higher and higher frequency domains and temperatures. The first successful design and actual fabrication of a physical body closely approximating a blackbody was made by two German physicists, Wilhelm Wien (1864–1928) and Otto Lummer (1860–1925), who worked at the newly established (1887) Physical-Technological Imperial Institute (Physikalisch-Technische Reichsanstalt, or PTR) in Charlottenburg (after 1920, a district of Berlin). Founded by the German government at the initiative of the inventor and industrialist Werner Siemens (1816–1892), the PTR had as its first director the illustrious Hermann von Helmholtz (1821–1894); later, from 1917 to 1933, Einstein was a member of the PTR’s board of directors. Tasked with setting calibration and metrological standards for German industry, the PTR quickly became the world center for the development and study of high-temperature technologies. Since measurement of high temperatures requires precise measurements of thermal radiation, it also became a natural place for investigation of the theoretical problem of blackbody radiation.

Theorists could build upon several attested experimental laws. First was Austrian physicist Josef Stefan’s 1879 determination that the total energy emitted by a hot body, i.e., the density of thermal radiation of all frequencies at a given temperature E(T), was proportional to the fourth power of the temperature. In 1884 his former student, Boltzmann, then professor of experimental physics in Graz, showed that Stefan’s experimental law is valid only for blackbodies and succeeded in deriving it from thermodynamic and electromagnetic assumptions. The so-called Stefan-Boltzmann law states that the total energy E(T) emitted at all frequencies by a blackbody is proportional to the fourth power of the absolute temperature (in degrees Kelvin),

E(T) = aT⁴. (Stefan-Boltzmann)

Here a is a new constant, the Stefan-Boltzmann constant that, it would turn out, could be expressed in terms of two even newer ones, the one Planck termed Boltzmann’s constant k and the constant h that became associated with Planck’s name.
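That claim can be checked numerically. The short sketch below assumes (a standard result, though not stated in the text) that the Stefan-Boltzmann constant for radiated flux takes the form σ = 2π⁵k⁴/15h³c², which follows from integrating Planck’s law introduced later in this chapter, and that a = 4σ/c is the corresponding energy-density constant of the formula above; the constant values are modern ones.

```python
import math

# Sketch of the claim that the Stefan-Boltzmann constant is fixed by k, h and c.
# sigma = 2*pi**5*k**4 / (15*h**3*c**2) follows from integrating Planck's law
# (introduced later in this chapter); a = 4*sigma/c is the energy-density form.

k = 1.380649e-23      # Boltzmann's constant, J/K
h = 6.62607015e-34    # Planck's constant, J*s
c = 2.99792458e8      # speed of light, m/s

sigma = 2 * math.pi**5 * k**4 / (15 * h**3 * c**2)
a = 4 * sigma / c     # radiation energy-density constant in E(T) = a*T^4

print(f"sigma = {sigma:.4e} W m^-2 K^-4   (accepted value ~5.670e-8)")
print(f"a     = {a:.4e} J m^-3 K^-4")
print(f"energy density of cavity radiation at T = 300 K: {a * 300**4:.3e} J/m^3")
```

The computed σ reproduces the measured Stefan-Boltzmann constant to the precision shown, which is one way of seeing why Planck attached such significance to the constants k and h.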

A similar mixture of thermodynamic and electromagnetic reasoning enabled Wien to derive his displacement law in 1894 showing that as temperature rises, the blackbody spectrum is simply rescaled according to the rule ρ(ν, T) = ν³f(ν/T), with f as a universal function of the ratio ν/T only. In conjunction with the first experimental measures of the spectral distribution of blackbody radiation conducted at the PTR in the early 1890s, and by analogy with Maxwell’s exponential law for the distribution of velocities in a gas, the displacement law allowed Wien to theoretically propose a form for ρ(ν, T) that fit the then-known experimental data quite well:

ρ(ν, T) = αν³ exp(−βν/T), (Wien)

wherein α and β are two constants. Wien’s “discoveries regarding the laws governing the radiation of heat” at the PTR merited the Nobel Physics prize in 1911. As will be seen, however, in autumn 1900 two teams of experimentalists also working at the PTR demonstrated that even though Wien’s law correctly described blackbody radiation of short wavelength and thus high frequencies, it broke down in the low-frequency (infrared) range of the radiation spectrum.

Another proposal for determining the function ρ(ν, T) describing the frequency distribution of blackbody radiation came from Britain in 1900; it is associated with the names of John Strutt, the third Lord Rayleigh (1842–1919), and James Jeans (1877–1946). Rayleigh, formerly Cavendish Professor at Cambridge, was an expert in optics, having explained the blue color of the daytime sky with his eponymous formula for the scattering of sunlight by molecules of the constituent gases of the Earth’s atmosphere.15 He now considered the problem of determining ρ(ν, T) as analogous to that of determining the thermal equilibrium of all modes of vibration of the ether, considered an elastic solid. The analogy appeared to call for a straightforward use of “the Boltzmann-Maxwell doctrine of the partition of energy”. A result of statistical mechanics, the equipartition theorem states that in any system of a large number of particles at equilibrium (constant temperature), the available energy is distributed equally (on average) among all modes of motion or degrees of freedom (say, translational motion along the x-axis, or rotational motion about the z-axis); the energy of each degree of freedom is ½kT, where T is the absolute temperature.16 But Rayleigh presumably recognized that in blackbody radiation there are infinitely many degrees of freedom of high frequency, so unrestricted use of the equipartition theorem would lead to an infinite concentration of energy at the high frequencies. He accordingly restricted its application, proposing a result that applied only to “the graver modes”, i.e., low frequencies, while keeping the total energy finite. The following expression (though with an erroneous factor, corrected by Jeans in 1905) is then valid in the limit of low radiation frequency and high temperature,

ρ(ν, T) = (8πν²/c³) kT, (Rayleigh-Jeans)

where the first factor, 8πν²/c³, is determined by Maxwell’s theory (it counts the modes of the radiation per unit volume and per unit frequency interval), Ē = kT is the mean energy of each mode, and k is again Boltzmann’s constant relating energy (at the individual particle level) to temperature.

It is worth lingering a moment on the form of this expression: the first factor on the right (containing c) is a consequence of Maxwell’s electromagnetism, the second factor (with k) of the kinetic theory of heat. When integrated over all frequencies, this formula evidently leads to an absurd result, infinite energy of the radiation. In Rayleigh’s eyes, this circumstance was not even worth noting, for he expected the equipartition theorem to fail at high frequencies and at low temperatures, analogous to similar violations observed in gases. Of primary interest is the form of the two laws, of Wien and of Rayleigh-Jeans. Wien’s law broke down in the region of low frequencies, Rayleigh-Jeans in the region of high frequencies. Planck’s correct formulation of the radiation law will combine them. Einstein will show in 1909 that Wien’s law has a “particle” character while Rayleigh-Jeans’s has a “wave” character.

Enter Planck

Kirchhoff had moved from Heidelberg to the University of Berlin in 1875 to take up the newly established chair in theoretical physics, the first in Germany. Upon his death in 1887, the prestigious chair was first offered to Boltzmann, who declined, and then to Heinrich Hertz, who also declined. Only in 1889, after both had declined, was the chair offered to Max Planck (1858–1947), who accepted. The advanced experiments on thermal radiation at the PTR in Charlottenburg made Berlin a fertile location for Planck.

Planck has been aptly caricatured as “believing in God, in Germany, and in the absolute validity of the two principles of thermodynamics”.17 Through self-study Planck became a disciple of Clausius, and like Clausius he regarded both laws of thermodynamics as absolute and exceptionless. Indeed, the first law, the principle of conservation of energy, is still generally regarded as absolute today, though its exceptionless status was famously challenged in a 1924 paper of Bohr, Kramers, and Slater (to which Einstein responded critically, see Chapter 4). Already in his doctoral dissertation of 1879, Planck displayed a preoccupation with the second law of thermodynamics. Planck viewed the status of the two laws of thermodynamics quite differently. Whereas the initial and final states of a process in nature are equivalent from the standpoint of the first law, this is not true of the second law, which signals the direction in which processes must take place. Within a few years Planck deliberately reformulated Clausius’s version to bring out the strong presumption of irreversibility: all isolated systems move irreversibly from states of lower to states of higher entropy. Whereas Maxwell and Boltzmann deemed the second law to be merely a statistical law pertaining to unimaginably large numbers of particles and so subject to exceptions, Planck could not accept that the law of entropy increase was not immutably valid. For this reason he opposed the attempt to base the laws of thermodynamic evolution and approach to equilibrium on the time-reversible propositions of mechanics. For the same reason he was a critic of Boltzmann’s atomism and its role in the chain of reasoning leading to the unsavory conclusion that the second law is not absolute.

In 1895, Planck embarked on a highly ambitious program to tackle the riddle of blackbody radiation, the puzzle of why any object, regardless of composition, when maintained at constant temperature emits exactly the same spectrum of radiation. He began with thermodynamics, and in particular the second law, hoping to show that irreversibility could be theoretically derived using only the resources of electrodynamics, i.e., Maxwell’s theory augmented to include interactions between field quantities (electric, magnetic, current) and their material sources. Thermal radiation was to be understood exclusively as an electromagnetic phenomenon. Planck sought to tackle two problems at once: appeal to the electromagnetic interaction between matter and radiation to explain both thermodynamic irreversibility and the growing experimental data on blackbody radiation. His initial idea, roughly stated, was that the particular mechanism producing irreversibility is the transition of an electromagnetic wave from an incident plane wave in the process of absorption to a spherical one in the process of emission. Boltzmann, recalling his lesson from Loschmidt, at once pointed out that Planck’s strategy could not by itself succeed in explaining irreversibility since the laws of electrodynamics also permitted the time reverse of an emitted spherical wave. Like the laws of Newtonian mechanics, those of electrodynamics are completely time-reversible.

Chastened by this exchange with Boltzmann, Planck over the next decade or so adopted various versions of a hypothesis of “elementary disorder” to account for thermodynamic irreversibility. He sought to show that the statistical form of the second law had to be a mathematical consequence of an elementary randomness, i.e., of an a priori initial independence of the individual elements when considered statistically. In this way the second law retained its absolute status as a necessary implication of a persistent disorder. Not until 1914, after the introduction of the idea of a quantum of energy, after the apparent successes of the Bohr atom, and after the majority of his theoretical colleagues had already done so, did Planck accept that the law of entropy increase is not an absolute law. Still, even in his early work and despite his antipathy to atomism, Planck praised Boltzmann for emancipating the concept of entropy from anthropomorphic ideas of machines and the art of human experimentation.

Planck’s radiation law

Recall that a blackbody absorbs all incident radiation and emits a spectrum of radiation characteristic of its temperature, while at thermal equilibrium it will emit and absorb radiation at the same rate. In the 1890s, researchers at the PTR had accumulated reliable data on the blackbody spectrum, the distribution ρ(ν, T) of electromagnetic energy as a function of frequency and temperature, within the range of all temperatures then experimentally accessible. The first task was to find a model of the interaction between radiation and oscillating charges that fit the experimental data. Planck employed a simple fictional construct of matter emitting and absorbing radiation, used by Hertz some six years before in his own studies of the interaction of matter and radiation. The model treated the inner surfaces of the blackbody cavity as composed of charged masses that, in response to the electromagnetic radiation incident upon them, oscillate or vibrate at a certain definite frequency, each interacting with one, and only one, color of light. Since an oscillating dipole emits electromagnetic radiation, these simple harmonic oscillators or “resonators”, as Planck (following Hertz) called them, were idealized constructs whose use could be justified by Kirchhoff’s law, according to which the radiation distribution at equilibrium is independent of the particular type of system interacting with the radiation. So Planck could regard his resonators as merely the simplest conceivable material systems that can be in equilibrium with electromagnetic radiation, lacking any specific physical meaning and in particular independent of any controversial molecular or atomic hypotheses. As the equilibrium spectrum did not depend on the specific thermalizing system, the model could be considered sufficient to determine the frequency distribution of blackbody radiation.

Planck attacked the problem of deriving the spectral energy density of electromagnetic radiation (i.e., determining the function ρ(ν, T)) by relating the energy of the emitted electromagnetic radiation to the (average) energy Ēν at a given temperature of emitting and absorbing resonators of frequency ν. By 1899 Planck had derived from Maxwell’s theory and from a random-phase assumption for the resonators the following relationship between the distribution function ρ(ν, T) and Ēν,

ρ(ν, T) = (8πν²/c³) Ēν(T).

Planck was almost done: in order to have the explicit form of the distribution law, he needed only to determine Ēν(T), the average energy of a resonator of frequency ν at temperature T. He assumed the resonator energy Ēν to have an exponential form needed to retrieve Wien’s distribution law for the density ρ(ν, T). At this point, Planck was confident that he essentially had realized the chief aims of his grand program: a proof of the irreversible evolution of thermal radiation in a cavity, and a determination of the equilibrium spectrum of this radiation.

Alas, in the autumn of 1900 new experiments at longer wavelengths by Heinrich Rubens and Ferdinand Kurlbaum showed that Wien’s law matched observations in part but broke down at the low-frequency (infrared) end of the radiation spectrum. Planck then recognized a certain arbitrariness in his previous entropy considerations and sought the simplest generalization of Wien’s law. On October 19, 1900 he proposed what turned out to be the empirically correct radiation distribution law for all temperatures then technologically realizable in the laboratory. Using notation introduced several months later, Planck’s law of the distribution of blackbody radiation by frequency is

ρ(ν, T) = (8πν²/c³) hν/(exp(hν/kT) − 1), (Planck)

where h is a new, extremely small physical constant subsequently named after Planck and k is the constant Planck named after Boltzmann.18 Planck effectively obtained this expression by introducing h as a parameter in an interpolation formula designed to fit the experimental results of Rubens and Kurlbaum. Between October 19 and December 1900, Planck undertook the important next step of determining the formula’s “true physical meaning” by deriving it from existing theory. He did this by analogically following Boltzmann’s 1877 paper, in which the second law was first given a statistical interpretation. As noted above, however, Planck himself did not accept the statistical interpretation of the second law until 1914, so his use (or misuse) of Boltzmann was both partial and selective. The rather twisted path by which Planck derived his radiation law can be presented only briefly here.
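A brief numerical sketch may help make the relation among the three laws vivid. It assumes, for illustration only, the identifications α = 8πh/c³ and β = h/k for Wien’s constants (the values they acquire once Planck’s constants are in hand) and uses modern values of h, k, and c; at low frequency Planck’s formula tracks the Rayleigh-Jeans law while Wien’s fails, and at high frequency Wien’s law is recovered while Rayleigh-Jeans grossly overshoots.

```python
import math

# Comparison sketch of the three distribution laws discussed in this chapter.
# Wien's constants are taken as alpha = 8*pi*h/c**3 and beta = h/k, the values
# they acquire once Planck's h and k are identified (assumed here only for
# illustration).  Spectral energy density is in J m^-3 Hz^-1.

k = 1.380649e-23
h = 6.62607015e-34
c = 2.99792458e8

def planck(nu, T):
    return (8 * math.pi * nu**2 / c**3) * h * nu / (math.exp(h * nu / (k * T)) - 1)

def wien(nu, T):
    return (8 * math.pi * h / c**3) * nu**3 * math.exp(-h * nu / (k * T))

def rayleigh_jeans(nu, T):
    return (8 * math.pi * nu**2 / c**3) * k * T

T = 1500.0  # a furnace-like temperature of the kind studied at the PTR
for nu in (1e12, 1e13, 1e14, 1e15):   # from far infrared toward the ultraviolet
    print(f"nu = {nu:.0e} Hz: Planck {planck(nu, T):.3e}, "
          f"Wien {wien(nu, T):.3e}, Rayleigh-Jeans {rayleigh_jeans(nu, T):.3e}")
```

Running the loop shows the pattern described in the text: Planck’s formula interpolates between the other two, agreeing with Rayleigh-Jeans where hν/kT is small and with Wien where it is large.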

At thermal equilibrium, it was possible to consider the average energy Ēν of an individual resonator as constant and characteristic of the resonator, acting only on radiation of frequencies very close to its own resonance frequency and related to the spectral density ρ(ν, T) at this frequency through the above-mentioned relation,

Ēν = c³ρ/(8πν²).

Planck could have calculated Ēν from the equipartition theorem of statistical mechanics. Planck, however, was even less inclined than Rayleigh to apply equipartition to this problem. Now, rather than considering the relationship between energy and temperature, Planck focused on the relation between the entropy and the energy of a resonator, a relation that had played a central role in his earlier treatment of irreversible radiation processes. At this point, he appealed to the relation between entropy and probability and turned to the formula that Boltzmann never wrote down even though it adorns his grave in Vienna,

S = k log W.

Whereas for Boltzmann the probability W of a thermodynamic state is just the number of complexions (atomic configurations and speeds) corresponding to that macrostate, Planck sought to determine W analogously by considering the number of “complexions” of his set of resonators. Deeming it sufficient to consider a system of N resonators, all of frequency ν and energy Ēν, the total energy EN of the N resonators is EN = NĒν. Additivity of entropy meant that the entropy of the resonator system SN is just N times the entropy S of each resonator, SN = NS. SN is then related to the probability W through the constant of proportionality k,

SN = k log W.

Crucially, Planck assumed, as Boltzmann’s combinatorial method required, that any complexion was as likely to occur as any other. In order to obtain finite values for W, Boltzmann’s combinatorial formula required that the total energy to be shared among the N resonators be treated as consisting of an integral number of equal finite parts. So without using the term “quantum”, Planck introduced a discrete energy element ε, dividing the total energy EN by ε to get a whole number P of energy elements (he allowed rounding off to the nearest integer). P then had to be distributed over the finite number N of resonators,

EN = Pε.

Thus Planck’s key step was to restrict the energy of each resonator so that it had to be an integral multiple of the energy unit ε, a tiny discrete amount. He insisted that his energy elements were strictly mathematical fictions, yet he determined a constant of proportionality h linking the energy element ε to the frequency ν of radiation,

ε = hν

that would become a fundamental equation of the quantum theory. Then the resulting expression for W, Boltzmann’s relation between S and W, and the relation between the energy of a resonator and the spectral density of radiation together implied Planck’s law.
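A modern shortcut, not Planck’s own combinatorial route, reaches the same average resonator energy and may clarify what the discrete element ε = hν accomplishes: if a resonator may hold only the energies 0, hν, 2hν, …, then weighting these levels with Boltzmann factors gives Ē = hν/(exp(hν/kT) − 1), which, inserted into the earlier relation ρ = (8πν²/c³)Ē, yields Planck’s law. The sketch below checks the truncated discrete sum against the closed form; the constant values are modern, and the chosen frequency and temperature are arbitrary illustrations.

```python
import math

# Modern shortcut (not Planck's own combinatorial route): a resonator of
# frequency nu restricted to energies 0, h*nu, 2*h*nu, ..., each weighted by the
# Boltzmann factor exp(-n*h*nu/(k*T)), has average energy
#     E_bar = h*nu / (exp(h*nu/(k*T)) - 1),
# which inserted in rho = (8*pi*nu**2/c**3) * E_bar reproduces Planck's law.

k = 1.380649e-23
h = 6.62607015e-34

def average_energy_sum(nu, T, n_max=2000):
    """Boltzmann-weighted average over the discrete levels n*h*nu (truncated)."""
    weights = [math.exp(-n * h * nu / (k * T)) for n in range(n_max)]
    energies = [n * h * nu for n in range(n_max)]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

def average_energy_closed(nu, T):
    return h * nu / (math.exp(h * nu / (k * T)) - 1)

nu, T = 3.0e13, 1500.0   # an infrared frequency and a furnace-like temperature
print(f"discrete sum : {average_energy_sum(nu, T):.6e} J")
print(f"closed form  : {average_energy_closed(nu, T):.6e} J")
print(f"classical equipartition value kT: {k * T:.6e} J")
```

The two quantum expressions agree, and both fall below the classical equipartition value kT, the suppression of high-frequency modes that keeps the total radiated energy finite.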

In Planck’s original view, these energy elements were important in defining the entropy of the resonators. But they did not imply an intrinsic discontinuity of the energy of the resonators, and they did not affect the interaction between radiation and resonators, which Planck still treated by means of Maxwell’s equations. Ever the reluctant revolutionary, and rather than surrender the classical laws of electrodynamics as the basis for his theory of heat radiation, beginning in 1911 Planck sought to treat emission and absorption of radiation asymmetrically, in what became known as his “second theory”. Absorption occurred continuously, in accordance with classical laws. Discontinuity entered with emission of radiation, which occurred in discrete leaps. Planck supposed that his resonators could take on any energy value, and so did not intrinsically possess discontinuous energies, but could emit energy only upon reaching a threshold value nhν. In this way, Planck was able to re-derive his radiation law of 1900. The explanation of discontinuity in the emission process was left to hypothetical details, differing in the different versions of the second theory, of the interacting microstructure of the matter comprising his resonators. When Bohr’s atomic theory of 1913 required discontinuous emission and absorption of energy, Planck’s second theory was rejected.19 However, one of its implications remains today: the existence of zero-point energy. Quantum objects and processes possess a non-zero residual energy even at absolute zero temperature.

In any case, despite the apparent triumph of deriving the radiation law that bears his name, Planck regarded the constants k and h appearing in the law as of far greater significance. First of all, the values he gave them are remarkably accurate, within a few percent of their current values. But more importantly, Planck showed that they could be combined, together with Newton’s gravitational constant G and c, the speed of light in vacuo, to form a system of units of length, time, mass, and temperature that were completely “natural” in that they contain no reference to human measures. By multiplying and dividing different combinations of G, c, k, and h, one obtains what is now called the Planck time (5.39106 × 10⁻⁴⁴ seconds) and the Planck length (1.616199 × 10⁻³⁵ m), the tiny scale that current theory regards as the domain of quantum gravity. The same absolute measures could be found by any careful investigators anywhere in the universe, and in this sense they are completely non-anthropomorphic. Planck’s striving for the absolute, though it did not succeed with the second law, paid off here in spades, giving rise to the ideal of a completely de-anthropomorphized physics of law and of fundamental natural units. This would be a unifying theme of Planck’s philosophy of science for the rest of his life.
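The quoted figures can be reproduced directly from the constants; the sketch below does so, making explicit one assumption the text leaves implicit: the modern values cited use the reduced constant ħ = h/2π.

```python
import math

# Recomputing the "natural" units mentioned in the text from G, c and h.
# The quoted modern values use the reduced constant hbar = h/(2*pi); that
# convention is assumed here.

G = 6.67430e-11       # Newton's gravitational constant, m^3 kg^-1 s^-2
c = 2.99792458e8      # speed of light, m/s
h = 6.62607015e-34    # Planck's constant, J*s
hbar = h / (2 * math.pi)

planck_length = math.sqrt(hbar * G / c**3)   # ~1.616e-35 m
planck_time = planck_length / c              # ~5.39e-44 s
planck_mass = math.sqrt(hbar * c / G)        # ~2.18e-8 kg

print(f"Planck length: {planck_length:.4e} m")
print(f"Planck time  : {planck_time:.4e} s")
print(f"Planck mass  : {planck_mass:.4e} kg")
```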

In 1918, the last year of the Great War, the Nobel Prize in physics was awarded to Planck for “the advancement of Physics by his discovery of energy quanta”. In his Nobel address in June 1920, Planck modestly recalled his reluctance to accept the physical reality of the energy quantum. After all, it could be interpreted in two ways: first, as just a “fictional quantity”, in which case “the whole deduction of the radiation law was in the main illusory and represented nothing more than an empty non-significant play on formulae”. Or, secondly, one could regard “the derivation of the radiation law as based on a sound physical conception”. Planck continued:

“Experiment has decided for the second alternative. That the decision could be made so soon and so definitely was due not to the proving of the energy distribution law of heat radiation, still less to the special derivation of that law devised by me, but rather it should be attributed to the restless forward-thrusting work of those research workers who used the quantum of action to help them in their own investigations and experiments. The first impact in this field was made by A. Einstein”.20

Indeed, Einstein’s introduction of the quantum of action into his own theoretical investigations, beginning in 1905, proved instrumental in establishing the quantum. These investigations in early quantum theory are intimately related to the use of a guiding principle that Einstein termed “Boltzmann’s principle”, taken up in the next chapter.

Notes

1“On the Theory of the Energy Distribution Law of the Normal Spectrum”, as translated from Verhandlungen der Deutschen Physikalischen Gesellschaft Berlin, Bd. 2 (1900), in D. ter Haar (ed.), The Old Quantum Theory. Oxford, UK and London: Pergamon Press, 1967, pp. 82–90; p. 84.

2On the vis viva controversy, see Hankins, Thomas L., “Eighteenth Century Attempts to Resolve the Vis viva Controversy”, Isis v. 56, no. 3 (Autumn, 1965), pp. 281–97; and Smith, George E., “The Vis Viva Dispute: A Controversy at the Dawn of Dynamics”, Physics Today v. 59, no. 10 (2006), pp. 31–9.

3Kuhn, Thomas, “Energy Conservation as an Example of Simultaneous Discovery”, as reprinted in Thomas Kuhn, The Essential Tension: Selected Studies in Scientific Tradition and Change. Chicago: University of Chicago Press, 1977, pp. 66–104.

4Mach, Ernst, History and Root of the Principle of the Conservation of Energy. Translated from 1872 German edition by Philip E.B. Jourdain. Chicago: The Open Court Publishing Co., 1911.

5In fact, the conservation of energy cannot be derived from the impossibility of perpetual motion alone, as Mach acknowledged under criticism from Planck. The impossibility of perpetual motion excludes only the production of work from nothing in a cycle of operations. But in order to get energy conservation or the first law of thermodynamics, one must also assume the impossibility of annihilating work in such a cycle.

6The idea that matter consists of invisibly small indivisible particles originated with the Greek atomists Leucippus and Democritus. During the Middle Ages atomism fell into disrepute on grounds of its close relation to atheism. In the mid-17th century, French priest and scientist Pierre Gassendi (1592–1655) revived ancient atomism while arguing its compatibility with Christian doctrine. Gassendi’s influence extended to the founders of mechanics in the 17th century, Galileo and Newton, who both assumed an atomistic theory of matter.

7Thomson, William, later Lord Kelvin, first proposed an absolute scale in 1847–1848 yet still assumed the conservation of caloric. The later 1854 definition drops this assumption. The guiding idea is that a temperature scale must be independent of any particular substance and it must relate intervals of temperature to specific quantities of heat and the equivalent mechanical effect produced. Its null point is absolute zero (0 K, i.e., −273.15° C or −459.67° F), theoretically the temperature at which all classical thermal motion ceases. Absolute temperature is generally denoted with a capital T; unless otherwise indicated, in this book, “temperature” refers to this absolute scale.

8Often written dS/dt ≥ 0, although entropy is classically ill-defined in intermediate stages.

9“Thus molecular science teaches us that our experiments can never give us anything more than statistical information, and that no law deduced from them can pretend to absolute precision”. “Molecules”, Nature v. 8 (25 September 1873), pp. 437–41; reprinted in William D. Niven (ed.), The Collected Scientific Papers of James Clerk Maxwell, vol. 2, New York: Dover Publications Inc., 1965, pp. 361–77; p. 374.

10The constant k already implicitly appears in the kinetic-theoretical interpretation of the “ideal gas” law of Boyle & Gay-Lussac, pressure = (R × temperature)/volume for one mole of gas. Where N denotes the number of molecules in a mole, the average kinetic energy E of translational motion per molecule (summed over the three directions of space) is 3kT/2, where k = R/N.
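As a numerical illustration with modern values (R ≈ 8.31 joules per mole per kelvin and N ≈ 6.02 × 10²³ molecules per mole, values far more precise than any available in the nineteenth century):

\[
k = \frac{R}{N} \approx \frac{8.31}{6.02\times10^{23}} \approx 1.38\times10^{-23}\ \mathrm{J/K}.
\]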

11We must linger a moment upon a necessary detail that will prove essential to Einstein’s conception of probability. To extend this toy model to a real gas of molecules, Boltzmann had to consider the case in which the molecular-kinetic energies are continuous, not discrete. He did this by first representing the entire gas in a fictional “energy space”. The energy space is portioned into cells, so that the kinetic energy of each molecule is stipulated to lie within a cell ranging from 0 to ε, ε to 2ε, 2ε to 3ε, and so on. All N molecules are distributed among these energy cells so that the same combinatorial computation as above can be used. As Boltzmann further showed, if each complexion is to be equally probable, one must use what today is called “phase space” rather than the energy space. In phase space, every monoatomic molecule moving according to Newton’s laws of motion (in Hamilton’s formulation) is completely described by six generalized coordinates, three spatial components qi, and three momentum components pi. One can then represent the state of a gas of N particles in phase space in two distinct but related ways: at the macro- or thermodynamic level, the state of the entire gas is represented at a given instant by a single system point in a 6N dimensional space Γ; at the micro- or complexion level, the state of the entire gas at that instant is represented by N points distributed in disjoint “cells” in a six-dimensional space μ (the distinction is due to the Ehrenfests). (Ehrenfest, Paul, and Tatiana Afanasyeva, The Conceptual Foundations of the Statistical Approach in Mechanics; original German edition (1912), English translation by Michael Moravcsik. New York: Dover Publications, Inc., 1990.)
The connection between the two spaces is established as follows. Exhaustively partition the μ space into r + 1 discrete cells ω0, ω1, ω2,..., ωr, each rectangular in the position and momentum coordinates and of equal small volume dV = dpdq. These volumes also can be characterized by molecular energy (proportional to the square of the velocity). Then to each distribution of particles among the cells in μ, i.e., specifying the number of particles whose molecular state lies within each cell ωi, there corresponds a portion of Γ. Boltzmann now introduced probability through the a priori stipulation that cells of equal volume in μ space are equally probable. Accordingly, the probability of any macrostate is proportional to the volume of the corresponding portion of Γ. The notion of probability thereby acquires a more abstract character: a transition from probability as a relative number of complexions to probability as occupied volume in phase space. In his use of Boltzmann’s principle, Einstein in 1904 would exploit an analogous relation between the two spaces to define the probability of a thermodynamic state in an in-principle empirical way: if one observed a system for a very long time, the probability of each state specified by a portion of Γ is defined in terms of the fraction of time spent in this portion in the long run. Thus Einstein shunned defining the probability of a state in terms of counting complexions and the a priori stipulation of equal probabilities that this involves, and in this way avoided Boltzmann’s implicit assumption of identical but distinguishable particles. So too did Boltzmann in a memoir of 1881 that Einstein presumably did not read.
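Put compactly (in notation used by neither Boltzmann nor Einstein), the two conceptions of the probability of a macrostate D just contrasted are

\[
P_{\mathrm{comb}}(D) \propto \mathrm{vol}_{\Gamma}(D), \qquad
P_{\mathrm{time}}(D) = \lim_{\mathcal{T}\to\infty}\frac{\tau_{D}(\mathcal{T})}{\mathcal{T}},
\]

where volΓ(D) is the volume of the region of Γ corresponding to the distribution D over the cells of μ, and τD is the time the system’s trajectory spends in that region during an observation interval of length 𝒯.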

12The logarithmic function log W increases with W more slowly than any positive power of W, however small, including a fractional power like W½.
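Formally, for any exponent a > 0,

\[
\lim_{W\to\infty}\frac{\log W}{W^{a}} = 0 .
\]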

13Olivier Darrigol has observed that, from a more critical point of view, these conclusions are problematic, since in 1877 Boltzmann merely assumed, but did not prove, that his combinatorial probabilities had physical meaning. The proof only came later, in 1881. But by then, the method of 1877 had been subordinated to the more powerful ensemble-based approach to statistical mechanics that Boltzmann also pioneered (building on Maxwell). For this reason, the 1877 paper plays little role in Boltzmann’s subsequent writings (see below). Ironically, it was Planck, originally one of the strongest critics of the statistical interpretation of the second law, who by adopting the 1877 method in 1900 made the 1877 paper so famous.

14ρ(ν, T) can be defined in terms of the universal function above by J(ν, T) = (c/8π) ρ(ν, T), the constant of proportionality involving c, the velocity of light, and the conversion factor 8π.

15In 1871 Rayleigh argued that the intensity of scattered sunlight varies inversely with the fourth power of the wavelength; hence light at short wavelengths (blue end of the visible spectrum) is scattered much more strongly than light at longer wavelengths.
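For a rough numerical illustration (the particular wavelengths are chosen here only for the sake of the example), blue light of wavelength 450 nm is scattered more strongly than red light of wavelength 650 nm by a factor of roughly

\[
\left(\frac{650}{450}\right)^{4} \approx 4.4 .
\]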

16More exactly, the relevant number is the number of quadratic terms of the Hamiltonian of the system, which is equal to the number of degrees of freedom if the energy is entirely kinetic in nature. If the gas molecule of mass m has (only) translational motions, there are three degrees of freedom, each contributing a quadratic kinetic-energy term ½mvᵢ²; in this case the average total energy is (3/2)kT. In general, for f quadratic terms in the energy of a molecule, the average total energy is f × ½ kT. For N particles, it is N × f × ½ kT.
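For instance (a standard textbook case not treated in the text), a rigid diatomic molecule has f = 5 quadratic terms, three translational and two rotational, so its average energy per molecule is

\[
\bar{E} = 5 \times \tfrac{1}{2}kT = \tfrac{5}{2}kT .
\]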

17Darrigol, Olivier, “Statistics and Combinatorics in Early Quantum Theory”, Historical Studies in the Physical and Biological Sciences v. 19, no. 1 (1988), pp. 17–80; p. 41.

18Planck’s constant h has the modern value 6.63 × 10−34 joule-seconds (units of energy multiplied by units of time, hence “action”); the so-called “reduced” Planck constant ħ = h/2π was introduced later by Dirac.
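Numerically, with the value just quoted,

\[
\hbar = \frac{h}{2\pi} \approx \frac{6.63\times10^{-34}}{2\pi} \approx 1.055\times10^{-34}\ \text{joule-seconds}.
\]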

19See Kuhn, Thomas, Black-Body Theory and the Quantum Discontinuity, 1894–1912. Chicago: University of Chicago Press, 1978; “Afterword: Revisiting Planck”, pp. 349–70.

20Planck, Max, “The Genesis and Present State of Development of the Quantum Theory”, Nobel Lecture, 2 June 1920, available at www.nobelprize.org/nobel_prizes/physics/laureates/1918/planck-lecture.html

Further reading

Brush, Stephen G. with Ariel Segal, Making 20th Century Science: How Theories Became Knowledge. New York: Oxford University Press, 2015.

Klein, Martin, “Thermodynamics and Quanta in Planck’s Work”, Physics Today, November, 1966, pp. 294–302.

Kuhn, Thomas S., Black-Body Theory and the Quantum Discontinuity, 1894–1912. Chicago: University of Chicago Press, 1978.

Lindley, David, Boltzmann’s Atom: The Great Debate that Launched a Revolution in Physics. New York: The Free Press, 2001.