Open Questions: The Big Bang
Prerequisites: The standard model
See also: Cosmic inflation – The cosmic microwave background – Dark energy – The cosmological constant
So, what exactly is the big bang theory? One way to answer that is to consider the primary observational fact which suggested the theory in the first place. The fact is that the universe appears to be expanding in a uniform way.
This fact was established by Edwin Hubble in the late 1920s. He did this by comparing two sorts of observations – the "redshifts" in light from distant galaxies as determined by examining the spectra of light from those galaxies, and the distances of those galaxies from us as determined in a way that Hubble pioneered.
Let's consider redshift first. Starlight, whether from our Sun or stars in distant galaxies, contains distinct features in its spectrum. In addition to the familiar continuous progression of colors from red to violet, the spectrum contains dark lines at specific wavelengths, called "absorption lines" because they are the result of specific wavelengths of light absorbed by cooler gas in the star's atmosphere. These lines correspond to energy level transitions in orbital electrons of specific elements found in stellar atmospheres.
We know, from laboratory work, at exactly which wavelengths such lines should occur. The lines, moreover, occur in a pattern of recognizable groupings. What we observe in light from distant galaxies is that the pattern of all lines we can identify is uniformly shifted to longer wavelengths, i.e. toward the red end of the spectrum. By "uniformly", we mean that every single line is observed at a wavelength which is the same constant multiple of what it should be. For example, sodium has an absorption line at a wavelength of 588.9 nm (nm = nanometers = 10^{-9} meters), in the yellow part of the spectrum. In light from a distant galaxy we might instead observe this line at twice the expected wavelength (1177.8 nm, which is in the infrared part of the spectrum, so no longer visible to the eye). The redshift is defined as a number z which is 1 less than the factor by which wavelength is expanded. In other words,
(observed wavelength) / (emitted wavelength) = z + 1
A simple algebraic rearrangement gives the equivalent formula:
z = [(observed wavelength) - (emitted wavelength)]/(emitted wavelength)
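As a quick check of the two formulas above, here is a minimal Python sketch applied to the sodium line example. The function name `redshift` is only an illustrative choice.

```python
def redshift(observed_nm: float, emitted_nm: float) -> float:
    """Return z = (observed - emitted) / emitted for one spectral line."""
    return (observed_nm - emitted_nm) / emitted_nm


SODIUM_LINE_NM = 588.9  # laboratory wavelength quoted in the text

# The line is observed at twice its laboratory wavelength.
z = redshift(1177.8, SODIUM_LINE_NM)
stretch_factor = 1177.8 / SODIUM_LINE_NM  # this is z + 1

print(z, stretch_factor)
```

A wavelength stretched by a factor of 2 corresponds to z = 1, in agreement with the definition (observed) / (emitted) = z + 1.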
z has another nice property for small enough values of z (z ≤ 100, say, and certainly for z ≤ 8, which is about the largest redshift of any object that has been observed). Specifically, there is a relationship of redshift to the age of the universe when the light was emitted, given by 1/(z+1)^{3/2} = t_{e}/t_{now}, where t_{e} is the period of time after the big bang when the light was emitted and t_{now} is the present time, about 13.7 billion years. Hence for z = 8, the light was emitted when the universe was about one 27th of its present age, or about 500 million years after the big bang.
On the other hand, although the distance of an object increases with its redshift, the precise relationship between them is simple only when z is not much greater than 1.
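The age relation quoted above is easy to evaluate numerically. The following sketch simply assumes the 1/(z+1)^{3/2} scaling and the 13.7 billion year present age given in the text; the function name is ours, and the result should be read as a rough estimate rather than a precise cosmological calculation.

```python
T_NOW_GYR = 13.7  # present age of the universe, in billions of years (from the text)


def emission_age_gyr(z: float, t_now_gyr: float = T_NOW_GYR) -> float:
    """Age of the universe when the light was emitted: t_e = t_now / (z + 1)^(3/2)."""
    return t_now_gyr / (1.0 + z) ** 1.5


for z in (0.0, 1.0, 8.0):
    print(z, round(emission_age_gyr(z), 3))
```

For z = 8 the factor (z+1)^{3/2} is 27, which gives roughly 0.5 billion years, matching the figure quoted above.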
Redshifts in the spectra of stars and galaxies had been studied since Vesto Slipher began pioneering this area about 1912. By 1925 he had measured the spectral redshifts of about 40 galaxies. The natural assumption as to the cause of these redshifts was that they represented the Doppler effect due to motion away from us of stars and galaxies. For nearby objects, this explanation is entirely correct. In fact, we can deduce that nearby galaxies are rotating by observing that the spectrum of light from one side of the galaxy is redshifted, while the spectrum from the other side is blueshifted (since it is moving towards us). As it turns out, the redshift of distant galaxies is actually due to the expansion of space itself between us and the distant galaxy, rather than relative motion in the usual sense, so the redshift is not quite the same as a Doppler shift.
Now let's consider measuring distances to galaxies. This is the area Hubble was pioneering in the 1920s. Think first of determining how far away a single star is. If you know how bright a particular star is intrinsically and you know its apparent brightness as seen through a telescope, then the distance can be computed because the ratio of intrinsic to apparent brightness is proportional to the square of the distance.
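The inverse square reasoning can be sketched in a few lines of Python. The check below uses the Sun, with a luminosity of about 3.828×10^{26} watts and a measured flux at Earth of about 1361 watts per square meter. Neither value comes from this article; they are assumed here purely for illustration, as are the function and constant names.

```python
import math

SUN_LUMINOSITY_W = 3.828e26      # assumed intrinsic brightness of the Sun
SUN_FLUX_AT_EARTH_W_M2 = 1361.0  # assumed apparent brightness seen from Earth


def distance_from_brightness(luminosity_w: float, flux_w_per_m2: float) -> float:
    """Invert flux = luminosity / (4 * pi * d^2) to get the distance d in meters."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_per_m2))


d_sun = distance_from_brightness(SUN_LUMINOSITY_W, SUN_FLUX_AT_EARTH_W_M2)
print(f"{d_sun:.3e} m")  # close to the Earth-Sun distance, about 1.5e11 m

# A star of the same intrinsic brightness that appears 4 times fainter
# must be twice as far away.
d_fainter = distance_from_brightness(SUN_LUMINOSITY_W, SUN_FLUX_AT_EARTH_W_M2 / 4.0)
print(d_fainter / d_sun)
```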
The problem is being able to determine the intrinsic brightness of a given star. There's no good way to do this in general. But there is a certain type of star, called a Cepheid variable, whose brightness fluctuates periodically. The intrinsic brightness is closely related to the length of the star's period: the longer the period, the brighter the star. The distance to nearby Cepheids can be measured by the method of parallax (applying trigonometry to the difference in the star's apparent position when the Earth is at opposite ends of its orbit). This makes it possible to compute the intrinsic brightness of nearby Cepheids, and hence to calibrate the relation between period and brightness so that it can be applied to Cepheids too distant for parallax.
Hubble was the first to observe a Cepheid in a galaxy other than the Milky Way – the Andromeda Galaxy, M31 – which enabled him to deduce that the galaxy was located well outside our own. This settled a very important open question of the time: spiral galaxies like M31 are huge aggregates of stars much like our own galaxy, not cloudlike "nebulae" inside our galaxy. Hence our galaxy isn't unique, but is only one of many. Hubble went on to find Cepheids in more distant galaxies.
Armed with distance data provided by the Cepheids, Hubble was then able to consider a new question: whether there is any relation between the amount of redshift associated with a galaxy and the distance of the galaxy. Hubble found that there was indeed a simple relationship – which is now known as Hubble's law.
From these observations, Hubble stated the law that is named after him: the velocity of recession of a distant object is proportional to its distance, with constant of proportionality denoted by H. That is:
recession velocity = H × distance
H is often referred to as the "Hubble constant". Actually, H varies with time, as we will find later. And because the speed of light is finite, the farther away from us something is, the earlier is the time in its life that we are seeing it. Therefore, H itself depends (in a complicated way) on the distance of the object from which it is inferred. So it is somewhat more correct to refer to it as the Hubble parameter. But for relatively nearby galaxies (except the very closest ones), and especially all galaxies for which we could obtain accurate distance estimates until very recently, the Hubble parameter is nearly constant, so that there is a linear relationship between recession velocity (inferred from redshift) and distance.
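To make the linear relationship concrete, here is a small sketch of Hubble's law. The value H = 70 km/s per megaparsec is an assumed round number of about the right size, not a figure taken from this article, and the function name is hypothetical.

```python
H_KM_S_PER_MPC = 70.0  # assumed illustrative value of the Hubble parameter


def recession_velocity_km_s(distance_mpc: float, hubble: float = H_KM_S_PER_MPC) -> float:
    """Hubble's law: recession velocity = H x distance."""
    return hubble * distance_mpc


for distance_mpc in (10.0, 100.0, 1000.0):
    print(distance_mpc, recession_velocity_km_s(distance_mpc))
```

Doubling the distance doubles the recession velocity, which is all the law asserts. As noted above, this linear form should only be trusted for relatively nearby galaxies.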
Although other explanations for this redshift are conceivable, the one that eventually seemed compelling, in light of all the evidence, is that it is because the universe is expanding. If the universe is expanding homogeneously (i. e., at the same rate as measured from any point in the universe), the distant objects will appear to be receding from us at a rate that is proportional to the distance. To repeat, the redshift is not due to a traditional Doppler effect, but rather because the expansion of the universe causes the wavelength of each photon of light to be stretched. As noted, the end result is similar to the Doppler effect, though not quite the same. (And incidentally, objects held together by chemical or gravitational forces, such as molecules, planets, and galaxies, are not subject to this expansion, even though photons are.)
Of course, we are not at any special place in the universe. This is what is known as the cosmological principle – the universe looks the same no matter where it is observed from. Everything is receding from everything else according to the same Hubble law.
From this, we can draw a very interesting and rather disquieting conclusion, which leads directly to the "big bang" idea.
The conclusion is that, if we run the evolution of the universe backwards, there should be a time in the distant past when everything we can now see was a heck of a lot closer to everything else. It would have been very crowded at some point in time. While this does not prove that everything was bunched together in a state of very high density at some time long ago, it certainly suggests the possibility. Physicists like George Gamow who thought about such things made this and other inferences which, eventually, led to... the theory of the big bang – the notion that at some point in time about 14 billion years ago (by current reckoning), all matter in the observable universe was in a state of extremely high density and temperature, and subsequently "exploded", resulting in the expansion we still see today. ("Explosion" is used metaphorically. Exactly what did happen is still a major open question.)
How long ago, one can ask, might this event have occurred? Well, notice that if H is the Hubble parameter, then 1/H has the dimensions of time. We can rewrite Hubble's law as
distance = (1/H) × (recession velocity)
Though this is just a rough approximation, if the law holds all the way back and if the rate of expansion has remained constant since the beginning, 1/H should be about the amount of time required for the most distant objects we can see in the universe to now be at the distance they appear to be.
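Converting 1/H into years is mostly a matter of unit bookkeeping. The sketch below assumes H = 70 km/s per megaparsec as an illustrative value (it is not given in the text), together with standard conversion factors for the megaparsec and the year.

```python
KM_PER_MPC = 3.0857e19      # kilometers in one megaparsec
SECONDS_PER_YEAR = 3.15576e7  # seconds in one (Julian) year


def hubble_time_gyr(hubble_km_s_per_mpc: float) -> float:
    """Return 1/H in billions of years, for H given in km/s per megaparsec."""
    seconds = KM_PER_MPC / hubble_km_s_per_mpc
    return seconds / SECONDS_PER_YEAR / 1e9


print(round(hubble_time_gyr(70.0), 2))   # roughly 14 billion years
print(round(hubble_time_gyr(700.0), 2))  # a 10x larger H gives a 10x smaller "age"
```

Because the result scales as 1/H, any error in the measured distances (and hence in H) translates directly into an error of the same factor in the estimated age.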
For many years there was a serious problem with that interpretation. It turns out that, because of the great difficulty of measuring cosmic distances, Hubble initially underestimated distances by nearly an order of magnitude. Since there was little uncertainty about the redshift, and hence the recession velocity, this made the "age" of the universe about a factor of 10 too small. Like, about a mere 2 or 3 billion years. Other estimates of the age of the solar system and the Earth pointed to an age of 4 or 5 billion years for our home planet. Oops.
Since Hubble's distance estimates were way off, this paradox was only apparent. In all fairness, measuring large cosmic distances, until recently, was pretty difficult. The process is, essentially, to take the actual luminosity of some object and compare it to the observed luminosity. Knowing that luminosity should fall off by the inverse square law, we can compute what the distance must be. However, stars and galaxies come in a wide range of brightnesses. Astronomers must rely on indirect means to estimate the actual luminosity of a particular object. In Hubble's case this involved measuring the periods of Cepheid variables in order to infer their luminosity.
The problem was that there are actually two different types of Cepheids. Hubble assumed a period-luminosity relationship appropriate for one type, but most of the Cepheids observed in distant galaxies were of the other type. Eventually, as these stars became better understood, and telescopes became more powerful, the estimated distances of remote galaxies were increased significantly. But even up until the 1990s, when Type Ia supernovae could be used to judge distances, there was a lot of uncertainty in the value of the Hubble parameter.
It might seem that a theory which held that the universe was in a process of rapid expansion from a dense lump of matter for no obvious reason might be a little hard for astronomers and others to swallow... except for one additional fact. Which is, that Einstein's fairly new, and even more newly verified, theory of general relativity predicted exactly this state of affairs. Even so, it was quite a stretch, which most cosmologists were reluctant to make. The idea of being able to extrapolate features of the whole universe from simple observations and principles seemed almost too good to be true, so for a long time astronomers and cosmologists didn't take their own predictions too seriously. This was true of the prediction of general relativity that the universe was not static, and there have been other important cases of the same thing. (In the case of "cosmic microwave background" radiation, for example.)
Einstein himself realized that his theory predicted the universe should not simply be static. It might be collapsing under the gravitational attraction of everything for everything else, or (for reasons less apparent) expanding. But it shouldn't just sit there, going nowhere, or levitating calmly in midair. Yet in 1915, when the theory was published, and a few years before Hubble came along, just sitting still quietly was what everyone assumed the universe was doing. To make this mathematically plausible, Einstein even added a new term to his equation of general relativity, the famous "cosmological constant". Then, around 1929 when Hubble showed that expansion was a fact, Einstein decided his cosmological constant was just a big mistake, and he scrapped it. (That may have been premature, as will be seen later.)
After Sir Arthur Eddington in 1919 confirmed a key prediction of the general theory of relativity, namely the bending of light by gravity, the overall theory began to be taken pretty seriously, even as unconventional as it was. Various people began coming up with solutions of Einstein's equation. Among them was an obscure Russian meteorologist named Alexander Friedmann. In the early 1920s he published solutions of the equation which implied several forms of an expanding universe, exactly consistent with Hubble's observations. Unfortunately, Friedmann's work was at first ignored, and later disputed, even by Einstein himself. Although Einstein quickly withdrew his objections, theorists remained reluctant to take such seemingly outrageous models entirely seriously.
What interested Gamow was the problem of nucleosynthesis. That is, the question of how nuclei of all the chemical elements came to be formed. This accompanied the question of how energy was generated inside of stars, because it became clear that the process of nuclear fusion – in which heavier elements are built up, ultimately, by protons (hydrogen nuclei) fusing together somehow – could explain the source of stellar energy. Gamow and his students made substantial contributions to both issues.
In particular, they made calculations in the late 1940s of what might come out of a very hot, dense "gas" of neutrons and protons – exactly the form that matter would take at a certain point in a big bang scenario. They succeeded in coming up with reasonable estimates, that only hydrogen and helium would be produced in substantial amounts, and more of the former than the latter. Although these calculations were largely ignored or forgotten for some time by the astrophysical community, they have more recently been much refined, to the point that they produce an excellent agreement with observation – and provide one of the key lines of evidence in support of the big bang model.
Gamow and his students also calculated that the ultrahot, ultradense state of matter in the early universe should have produced radiation which might still be observable today, at an apparent temperature of perhaps several tens of degrees above absolute zero. (These estimates, as well as the ones for nucleosynthesis, were somewhat off, but still remarkable.) Yet these calculations predicting a microwave background radiation were taken even less seriously. Even Gamow himself didn't pursue the idea very vigorously, when it seemed it would be very difficult to observe such background radiation.
Nevertheless, Gamow pushed on to fill out the general scenario of the evolution of the universe according to the big bang model. Throughout the 1950s and early 1960s it was locked in close competition with the "steady state" theory of Fred Hoyle and others as a model for the universe's evolution over time. The steady state theory was equally radical in some respects. For instance, it postulated the ongoing continuous creation of matter out of nothing. But there didn't seem to be enough observational evidence to tip the scales decisively towards either theory.
And then in 1964 Arno Penzias and Robert Wilson discovered, quite by accident and without looking for it, the cosmic microwave background radiation, largely as predicted by Gamow. (Penzias and Wilson got a Nobel Prize for this; Gamow never received the prize.)
The rest is history (which is, basically, what the "steady state" theory became).
Following the table, we will explain in more detail what was going on as the universe expanded and cooled.
Cosmological timeline

Name | Time | Temperature | Notes |
---|---|---|---|
Planck era | 10^{-43} sec. | 10^{32} K | Equality of all forces |
Inflation begins | 10^{-35} sec. | 10^{28} K | GUT symmetry breaks |
Inflation ends | 10^{-32} sec. | 10^{27} K | Strong & electroweak forces are separate |
Electroweak symmetry breaks | 10^{-12} sec. | 10^{15} K | Separation of weak & electromagnetic force |
Baryogenesis | 10^{-6} sec. | 10^{13} K | Quarks confined in hadrons, ending era of quark-gluon plasma |
Quark confinement complete | 10^{-5} sec. | 3×10^{12} K | Only lightest hadrons remain (protons, neutrons); antimatter annihilated |
Nuclear binding | 1 sec. | 10^{10} K | Nuclear binding energy exceeds photon energy; neutrinos decouple |
Nucleosynthesis begins | 300 sec. | 9×10^{8} K | ^{2}H, ^{3}H, ^{3}He, ^{4}He form |
Nucleosynthesis ends | 35 min. | 3×10^{8} K | Mostly H and ^{4}He left |
Matter dominance | 47,000 yrs. | 8000 K | Matter density exceeds radiation density |
Recombination begins | 240,000 yrs. | 3700 K | Electrons begin to bind into atoms of H and ^{4}He |
Last photon scattering | 350,000 yrs. | 3000 K | Universe becomes fully transparent, no matter/photon interactions |
Reionization | ∼10^{8} yrs. | 30 K | First stars |
First galaxies | ∼5×10^{8} yrs. | 10 K | Corresponds to redshift z = 8 |
Present | 13.7×10^{9} yrs. | 2.725 K | Internet, Chinese food, etc. |
One of the basic ideas of the standard model of particle physics is that there are four fundamental forces: gravity, the strong force, the weak force, and the electromagnetic force. However, at the very earliest time in the existence of the universe, all four forces were "unified" and indistinguishable. This is similar to the way we regard electromagnetism as a single "unified" force, even though at one time (before the work of James Clerk Maxwell) physicists regarded electric and magnetic force as distinct.
Contemporary physics is able to say almost nothing about the very first instants of the universe. This is because the strength of the gravitational force at the earliest times was the same as the strengths of the other physical forces. Therefore, both quantum mechanics and general relativity must be applied to describe the physics of this period. But unfortunately, this is not possible, since we do not currently have a workable theory of quantum gravity that would merge these two fundamental theories.
About all physicists can do is make certain educated guesses. As an example from more than 100 years ago, one very inspired insight was due to the German physicist Max Planck. His insight provided the foundations for quantum mechanics, and it also leads to a natural scale for the time and energy level of the universe during the brief instant when all four forces were unified.
Basic to all physical theories are certain fundamental quantities: mass, length, and time. Almost everything in physics can be described in terms of certain combinations of these units. For instance, speed is length divided by time (e. g. meters per second). If we denote these basic quantities with the letters M, L, T, then speed can be represented as L/T. Other physical quantities can be described as products of powers (including fractional and negative powers) of these basic ones.
In 1899 Planck was working on the theory of blackbody radiation, that is, the theory of how very hot objects such as molten rock or metal or the filament of an incandescent light give off visible (as well as invisible) light – or more generally, electromagnetic radiation. The theory was in big trouble, since as it existed, it predicted that the total amount of energy given off by a hot object could be infinite. Planck's insight, for which he won a Nobel Prize in 1918, was that the theory could be salvaged if energy did not occur in continuous increments, but necessarily occurred only in discrete units or quanta. This inspired guess is also the basis on which Planck is considered the inventor of quantum mechanics.
As part of his new theory, Planck introduced a new fundamental constant, which of course is called Planck's constant, denoted by the symbol ℎ. In symbols, Planck's result is that the energy of some amount of monochromatic light can be expressed as
E = n × ℎ × ν
where E is energy, ν is frequency in cycles per second of the monochromatic light, and n is some non-negative integer, which represents the number of energy quanta. Since ν has units of 1/T, and energy has units of M(L/T)^{2}, it follows that ℎ has units of ML^{2}/T.
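For a sense of scale, the formula can be evaluated for the yellow sodium line at 588.9 nm mentioned earlier. This is only a sketch: the numerical values of ℎ, the speed of light, and the electron-volt are standard SI values that do not appear in this article, and the function name is ours.

```python
H_PLANCK_J_S = 6.62607015e-34  # Planck's constant, joule-seconds
C_M_PER_S = 299792458.0        # speed of light, meters per second
JOULES_PER_EV = 1.602176634e-19


def light_energy_joules(n_quanta: int, frequency_hz: float) -> float:
    """E = n x h x nu for n quanta of monochromatic light of frequency nu."""
    return n_quanta * H_PLANCK_J_S * frequency_hz


sodium_frequency_hz = C_M_PER_S / 588.9e-9  # frequency = c / wavelength
one_quantum_j = light_energy_joules(1, sodium_frequency_hz)

print(f"{sodium_frequency_hz:.3e} Hz")
print(f"{one_quantum_j / JOULES_PER_EV:.2f} eV per quantum")
```

A single quantum of this light carries only about 2 electron-volts, which is why the discreteness of energy goes unnoticed in everyday life.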
Note that the actual value of the energy E depends on the units of measure in which the various quantities are expressed. Another brilliant idea Planck had was to use certain well known fundamental constants to derive what he considered to be "natural" units for quantities such as mass, length, time, energy, etc. – all relative to his constant ℎ. Two obvious fundamental constants were the speed of light, c (which has units L/T), and Newton's gravitational constant G (which has units L^{3}/(T^{2}M)). Planck also threw in a factor of 2π (so that frequency could be expressed in radians per second), to make the modified constant ℏ = ℎ/2π, and noted that the expression ℏG/c^{5} has units (ML^{2}/T)(L^{3}/(T^{2}M))(T^{5}/L^{5}) = T^{2}. Therefore the quantity (ℏG/c^{5})^{1/2} has units of time. This unit is called the Planck time unit, and it has a value of about 5.39×10^{-44} seconds.
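The value of the Planck time can be checked directly from the expression (ℏG/c^{5})^{1/2}. The sketch below plugs in standard SI values for the three constants; those numbers are not given in the text and are supplied here as assumptions.

```python
import math

HBAR_J_S = 1.054571817e-34  # reduced Planck constant, joule-seconds
G_SI = 6.67430e-11          # Newton's gravitational constant, m^3 / (kg s^2)
C_M_PER_S = 299792458.0     # speed of light, meters per second

# Planck time = sqrt(hbar * G / c^5)
planck_time_s = math.sqrt(HBAR_J_S * G_SI / C_M_PER_S**5)

print(f"{planck_time_s:.3e} s")  # about 5.39e-44 s
```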
When he did this, Planck had no idea of what quantum mechanics would eventually become and the important role that his constant ℏ would ultimately play in the theory. Nevertheless, it turned out that the Planck time unit was the smallest quantity of time that was meaningful in the theory. If you multiply that unit by the speed of light (expressed in whatever units were originally used in defining ℏ), the result is the distance light can travel in one Planck time unit. This distance is known as the Planck length. It is about 1.616×10^{-35} meters, and is the smallest unit of length that is meaningful to use – which at about 20 orders of magnitude smaller than the size of a proton is pretty darn small.
There is also a Planck energy, defined as (ℏc^{5}/G)^{1/2}. Expressed in terms of temperature it is about 1.417×10^{32} K. (K stands for degrees Kelvin. This is just the usual Celsius scale, having 100 degrees between the freezing and boiling points of water, except that it starts at 0 K, unlike Celsius, whose zero point, at which water freezes, is 273.15 K. Thus "absolute zero", or 0 K, is -273.15 C.)
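The Planck length and the Planck temperature follow from the same handful of constants. As a rough check of the figures quoted above, the following sketch uses standard SI values (assumed, since the text does not list them), including the Boltzmann constant in joules per kelvin to convert energy into temperature.

```python
import math

HBAR_J_S = 1.054571817e-34  # reduced Planck constant, joule-seconds
G_SI = 6.67430e-11          # Newton's gravitational constant, m^3 / (kg s^2)
C_M_PER_S = 299792458.0     # speed of light, meters per second
K_B_J_PER_K = 1.380649e-23  # Boltzmann constant, joules per kelvin

planck_time_s = math.sqrt(HBAR_J_S * G_SI / C_M_PER_S**5)

# Distance light travels in one Planck time.
planck_length_m = C_M_PER_S * planck_time_s

# Planck energy = sqrt(hbar * c^5 / G), then expressed as a temperature.
planck_energy_j = math.sqrt(HBAR_J_S * C_M_PER_S**5 / G_SI)
planck_temperature_k = planck_energy_j / K_B_J_PER_K

print(f"Planck length:      {planck_length_m:.3e} m")
print(f"Planck energy:      {planck_energy_j:.3e} J")
print(f"Planck temperature: {planck_temperature_k:.3e} K")
```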
The Planck time is the smallest amount of time that can meaningfully be discussed in quantum mechanics, so it is used to define the Planck era as the first instant after the big bang that we can hope to apply any laws of physics to the universe. That is the era in which all the fundamental forces – gravity, electromagnetism, the weak nuclear force, and the strong nuclear force – were essentially the same. As indicated above, the general theory of relativity cannot actually describe gravity at that time, and we don't know what laws governed the other forces either. All this is part of the unknown, hypothetical "theory of everything".
What is known is that at a certain distance scale an order of magnitude or two longer than the Planck length and hence (by Heisenberg's uncertainty principle) at a similar proportion less than the Planck energy, the gravitational force must have become distinct from, and much weaker than, the other three forces, which remained unified. Once gravitation becomes distinct from the other (still unified) forces, it can be described by laws separate from those of quantum mechanics. At that point, the theory of relativity and quantum mechanics are able to coexist, so physicists have at least the possibility of describing the universe with well-tested theories.
This possibility has not yet been achieved, however. There is still no consistent, unified theory of the strong force, the weak force, and electromagnetism. Physicists have tried very hard to construct such a theory, which is generally called a Grand Unified Theory, or GUT, even though it leaves out gravity. Nevertheless, by extrapolating the way the strong, weak, and electromagnetic forces grow with increasing energy levels, it is possible to predict that all three forces should have equal strength at a certain energy level three or four orders of magnitude less than the Planck scale.
There are various equivalent ways to specify energy levels. One is in terms of temperature. Although "temperature" is a concept from thermodynamics, it can be related to energy levels, because a "black body" at a given temperature radiates photons over a spectrum of energies, but the maximum is at one specific energy level. This energy is taken to be representative of the "typical" photon energy emitted by the black body. One common unit of energy is the electron-volt (eV), the amount of energy required to move an electron across a one volt difference of electric potential. (A million electron volts is "MeV", and a billion is "GeV".) A conversion factor called the Boltzmann constant, denoted by k_{B}, relates temperature to the associated energy level. This constant is 8.617×10^{-5} eV per degree K. 10^{-4} eV per degree K is a good approximation, hence 10^{-13} GeV per degree K. For example, the Planck temperature of 1.417×10^{32} K corresponds to the Planck energy of 1.221×10^{28} eV = 1.221×10^{19} GeV.
Anyhow, three or four orders of magnitude below the Planck scale – where extrapolation indicates the strong, weak, and electromagnetic forces have equal strength – is 10^{15} GeV to 10^{16} GeV. That's about as close as physicists can estimate, because without an actual consistent GUT, there's no way to compute the value theoretically. We refer to this energy level as the GUT energy scale.
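The conversion between temperature and energy is a single multiplication by the Boltzmann constant. Here is a minimal sketch, assuming the value of about 8.617×10^{-5} eV per kelvin; the helper names are hypothetical.

```python
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant, eV per kelvin
EV_PER_GEV = 1e9


def temperature_to_gev(temperature_k: float) -> float:
    """Typical energy (in GeV) associated with a temperature in kelvins."""
    return temperature_k * K_B_EV_PER_K / EV_PER_GEV


def gev_to_temperature(energy_gev: float) -> float:
    """Temperature (in kelvins) associated with an energy in GeV."""
    return energy_gev * EV_PER_GEV / K_B_EV_PER_K


planck_energy_gev = temperature_to_gev(1.417e32)
gut_temperature_k = gev_to_temperature(1e15)

print(f"Planck temperature -> {planck_energy_gev:.3e} GeV")
print(f"GUT scale 1e15 GeV -> {gut_temperature_k:.2e} K")
```

The GUT energy scale of 10^{15} GeV comes out at roughly 10^{28} K, consistent with the temperature listed in the timeline for the GUT symmetry breaking.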
One might expect that "something" significant should happen as the temperature of the universe falls below this point, as it must, given that the universe is expanding rapidly. For one thing, the strong force separates off from the remaining weak and electromagnetic forces. (The latter two forces remain unified for a relatively much longer time, all the way down to an energy level of about 100 GeV, in fact. This unified force is called the electroweak force.)
Another name for this process is the "collapse of the false vacuum", because although the universe had been in a temporarily stable state, it was not truly a state of minimal energy. What had appeared to be empty vacuum actually contained an enormous amount of latent energy which was released when the temperature dropped below a critical point, much as water releases heat energy when it freezes. (In the opposite direction, to melt ice, it is necessary to add heat.) Further, in both cases a state of relatively high symmetry undergoes an abrupt transition to a state of lower symmetry. In one case, the symmetry between the strong and electroweak forces was broken. In the other case, when water freezes, it goes from being an isotropic substance (in which all directions are equivalent) to one in which there are particular preferred directions: the axes of the ice crystals.
There are some very good reasons why this dramatic phase change followed by exponential inflation is thought to have occurred, even though the physical principles that explain it are still unknown. The idea of inflation was originated by Alan Guth in December 1979. (Anticipating the importance of the idea at the time, he described it as a "spectacular realization".) In 1979, although the big bang theory was widely favored over alternatives because of its success in predicting the existence of a uniform presence of microwave radiation at a blackbody temperature of about 2.725 K, cosmologists recognized that it had several troublesome problems:
How much expansion? It must have been (literally) at an exponential rate, that is, as a function of time, being proportional to 10^{Ht}, where H is a constant – essentially the Hubble parameter at that instant. The process can be thought of as a certain number N of cycles in each of which the universe expands by a factor of 10, so the total expansion would be a factor of 10^{N}. If the whole period of inflation lasted Δt seconds, then N = HΔt. If N = 100, N cycles would have inflated the universe by a factor of 10^{100}. Suppose inflation began at 10^{-35} seconds after the big bang, and each cycle was that length of time. Then Δt = 10^{-33} seconds, so that is about when the inflationary expansion would have ended. The Hubble parameter during this period would have been H = N/Δt = 10^{35} (sec)^{-1}.
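The arithmetic in this example can be written out explicitly. The sketch below uses only the illustrative numbers from the paragraph above (N = 100 cycles, each lasting 10^{-35} seconds); like those numbers, it is a guess at orders of magnitude, not a model of inflation.

```python
N_CYCLES = 100         # number of tenfold expansions (illustrative)
CYCLE_SECONDS = 1e-35  # assumed duration of each cycle
START_SECONDS = 1e-35  # assumed time at which inflation begins

expansion_factor = 10 ** N_CYCLES              # total growth, 10^N
duration_s = N_CYCLES * CYCLE_SECONDS          # delta t
hubble_rate_per_s = N_CYCLES / duration_s      # H = N / delta t
end_time_s = START_SECONDS + duration_s

print(f"expansion factor: 10^{N_CYCLES}")
print(f"duration:         {duration_s:.1e} s")
print(f"H:                {hubble_rate_per_s:.1e} per second")
print(f"inflation ends:   {end_time_s:.2e} s after the big bang")
```

Since the starting time is a hundred times smaller than the duration, the end of inflation falls at very nearly Δt itself, about 10^{-33} seconds.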
Although we can only guess at what the actual numbers might have been, an expansion factor of 10^{100} would certainly have solved all of the problems listed above:
We will discuss inflation in much more detail elsewhere, and consider what sort of processes could have been responsible for such an immense expansion of space. For present purposes, we just want to note an important byproduct of this drastic event. At the GUT scale, just before the symmetry was broken, there was no essential distinction between one class of particles known as quarks, and another class, known as leptons. Quarks are the particles which combine in threesomes to make nucleons (protons and neutrons). Leptons are particles such as electrons and neutrinos. Each of these particle types comes in three distinct subtypes known whimsically as "flavors". The flavors are somewhat mutable, under the action of the weak force, but in our universe as it is presently, there is no way that a quark can turn into a lepton, or vice versa.
But this wasn't the case at the GUT scale. Since all the forces (except gravity) were equivalent, there were particle reactions which turned leptons into quarks and quarks into leptons. Fundamental forces are said to be "mediated" by a special kind of particle known as a boson. The effect of any force on a particle like a quark or a lepton is caused by an interaction between the particles in which the boson that mediates the force is exchanged. For example, the boson which mediates (or "carries") the electromagnetic force is simply the quantum of the electromagnetic field, namely the photon. The force of attraction (or repulsion) between charged particles of opposite (or the same) electric charge can be understood as an interaction between the particles and photons.
Similarly, at the GUT scale, before the GUT symmetry was broken, it is hypothesized that there were very massive bosons, which are usually known generically as X particles. When a quark interacts with an X it becomes a lepton, and when a lepton interacts with an X it becomes a quark. X particles must have been very massive, having mass-energy close to the GUT energy level of perhaps 10^{15} GeV. (Technically, since E = mc^{2}, the proper units for expressing mass are energy units divided by the square of the speed of light. So to be correct, masses should be expressed in terms of eV/c^{2}, GeV/c^{2}, and so forth. For simplicity, we will generally omit the c^{2}.)
Another way to think of the breakdown of GUT symmetry is to realize that when the universe cooled below a temperature equivalent to 10^{15} GeV, there would quickly cease to be any more X bosons around. This is for the simple reason that they would be unstable, and they could no longer be created by the pair production process out of photons (or any other particle), simply because there were no photons or other particles with a sufficiently large amount of energy. Since all X bosons would quickly disappear, there was eventually no way for quarks and leptons to change into each other. And thus the GUT symmetry was lost.
There are in fact a number of theoretical ideas about how inflation actually worked, but all produce roughly the same result. We will discuss some of them elsewhere. It is worth noting, however, that in many cases there will be subtle differences in the results of inflation that can potentially be used to discriminate between different possible inflationary mechanisms. One of the differences between different inflationary theories is in the mechanism that brings inflation to an end and in how the universe is reheated to almost the temperature it had before inflation began, in spite of the vast amount of expansion.
Although theories differ as to what ended inflation, it clearly did end. The result at the end of inflation is a universe which is potentially much easier to understand from current knowledge of particle physics. This is simply because the strong force is no longer unified with the electroweak force, so that physicists can work with two different forces (strong and electroweak). And this, in turn, is actually an advantage, because these two forces are somewhat understood. Since each of these forces is within the range of existing accelerators, actual experiments can be conducted that guide the development of the theory and allow physicists to separate fact from conjecture.
As exotic as it may be, the post-inflation era of the universe is a much more familiar environment than it was before inflation, though some very notable mysteries remain.
For reasons we will explain shortly, there are no ways to produce new particles that have more mass-energy than is available given the prevailing temperature of the universe at a particular time. So, given that most particles ultimately decay, the diversity of particle species continually decreases. Although a vast menagerie of particles may have remained after inflation ended, we have little way to conjecture what most might be. For most purposes, at least as far as cosmology is concerned, such particles may as well have never existed.
However, there are a few apparent facts about the universe as we know it today that we have no good ideas how to explain. These facts may be effects left by particles that once existed but are no more, or that, at least, do not any longer interact significantly with the "ordinary" matter that makes up galaxies and stars and planets. The list of such mysterious facts includes:
As already mentioned, this model consists of fundamental forces, a variety of different types of elementary particles, and a mathematical description of the behavior of those forces and particles. The forces, again, are gravity, the strong force, the weak force, and the electromagnetic force. At the very highest energy levels we can conceive of – at the Planck scale – all of the forces are "unified", which means they all have the same strength and obey the same equations. As the temperature of the universe decreases along with its expansion, this high state of symmetry is broken as the characteristics of the forces become more dissimilar. Gravity separates from the other three forces even before inflation begins. The subsequent separation of the strong force is thought to be associated with the phenomenon of inflation.
So after inflation is done, there are three distinct forces: gravity, the strong force, and the electroweak force – which comprises the still undifferentiated weak force and electromagnetic force. There are also a wide variety of elementary particles present. We'll say more about them in a moment, but they include photons, electrons, quarks, neutrinos, and far more exotic and massive particles as well.
We can associate an energy level with the universe at this time. There are several ways to express this energy level, as explained above. The temperature of the universe at the end of inflation is estimated to be about 10^{27} K. Using Boltzmann's constant, this can also be expressed as 10^{14} GeV.
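The conversion between temperature and energy quoted here can be checked with a few lines of Python. This is only a minimal sketch: the function name is made up for illustration, and the value of Boltzmann's constant (about 8.617 × 10^{-5} eV per kelvin) is the standard figure, not something given in the text.

```python
# Boltzmann's constant in electron volts per kelvin (standard value, rounded).
K_B_EV_PER_K = 8.617333e-5


def temperature_to_gev(temperature_k: float) -> float:
    """Return the characteristic thermal energy k_B * T, expressed in GeV."""
    energy_ev = K_B_EV_PER_K * temperature_k
    return energy_ev / 1e9  # 1 GeV = 1e9 eV


if __name__ == "__main__":
    t_end_of_inflation_k = 1e27  # estimate quoted in the text
    energy_gev = temperature_to_gev(t_end_of_inflation_k)
    print(f"k_B * T at the end of inflation: {energy_gev:.2e} GeV")
```

The result comes out a little under 10^{14} GeV, which is why the two figures in the paragraph above are treated as equivalent to the nearest order of magnitude.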
However expressed, that is still a huge amount. For comparison, the rest mass of a proton is 0.938 GeV – 14 orders of magnitude smaller. Even the heaviest particle that can be produced in present accelerators, the top quark, has a mass of only about 173 GeV. The importance of these energy levels is that we don't have the experimental tools to even come close to being able to study directly what physics is like at the time just after inflation ended. The gap is enormous. Not too surprisingly, physicists can't say very well what might actually be going on at such energy levels.
Lacking experimental evidence, it's tempting to suppose there isn't a lot going on that really matters. For example, the electroweak force does not split into a separate weak force and electromagnetic force until the level of about 100 GeV, which is well within the energy range of present accelerators. And according to the still untested idea of supersymmetry, the lightest "supersymmetric" particles should weigh in under 1000 GeV (possibly a lot less). So there is an energy range of about 11 orders of magnitude above 1000 GeV which we know very little about, except that it seems to have relatively little effect on the physics we can actually study experimentally. Particle physicists sometimes call this energy range the "desert", since it doesn't seem to be all that interesting.
However, it's probably rather cavalier to suppose little important is going on at energy ranges above 1000 GeV, even though we can't presently study that range directly. And yet it's quite true that in the range of energies below 1000 GeV (i. e., 1 TeV) where experimentation is currently possible, the standard model has an extremely good fit with experimental results. It has proven extremely difficult to find physical effects which require "new physics" at high energies to explain.
Although we cannot presently perform experiments at very high energies – and probably won't ever be able to at the GUT scale – cosmology itself should be able to give us clues about what goes on, if we could only recognize and understand them. For example, while we can postulate many possible mechanisms by which inflation may have occurred, without understanding physics at the GUT scale, we can't determine what actual mechanism may have been involved. If inflation is in fact a valid hypothesis and if we can ever learn more about its effects, we may be able to place limits on what GUT-scale physics must be like.
Similarly, the other "mysteries" listed above certainly have their explanations rooted in physics that occurs above 1 TeV (trillion electron volts). For example:
A little later, when we come to discuss nucleosynthesis – the process in which nuclei of elements such as helium and heavy hydrogen (deuterium) are formed – we will see a concrete example of how applying well-understood physical laws to conditions in the universe within a few minutes after the big bang makes very testable predictions. The fact that these predictions have been confirmed is solid evidence for the big bang model, since the model tells us what the conditions of temperature and matter density must have been.
But that kind of reasoning can be turned around. If physicists come up with sufficiently detailed theories for things like supersymmetry and CP symmetry breaking that make predictions about particle interactions, then we can use the big bang model at higher energy levels to predict observable effects – such as the ratios of baryons to photons and baryonic to nonbaryonic matter, which have both already been measured.
There is, thus, an interplay between cosmology and very high energy physics. The early universe is the ultimate particle accelerator. Observational facts from cosmology put useful constraints on the physics. Conversely, anything we are able to learn about the physics of CP symmetry breaking or supersymmetry (for example) might suggest new phenomena to look for in cosmology.
One key connection between cosmology and particle physics has to do with simply counting numbers of particles of different types. Suppose p denotes any particle (not just a proton, as it usually does), and p′ denotes its antiparticle. Everyone knows that when a particle and its antiparticle interact, they annihilate each other and produce massless particles of energy – photons. This reaction can be written:
p + p′ ⇄ γ + γ
Here γ is a photon. (Two photons are always produced, in order to conserve both energy and momentum.) The reaction can also proceed in the opposite direction, as indicated by the bidirectional arrows. This reverse reaction is called pair production, and it means that any type of particle, together with its antiparticle, can be created out of the interaction of two photons, provided only that the photons have enough energy.
And therein lies the rub. As long as most photons have at least as much energy as the rest mass energy of a particular type of particle, the above reaction will go on constantly in both directions. Pairs of any type of particle will ceaselessly be produced out of photons, and will just as readily annihilate each other to give back photons. There's nothing to limit the reaction in the left to right direction, since photons can exist with any amount of energy however large or small.
However, the same is not true in the opposite direction. There is no such thing as a proton with an energy less than the proton rest mass of 938.3 MeV, though a proton can have an arbitrarily larger amount of energy in the form of kinetic energy. Therefore, as soon as the universe has cooled to the point where there are almost no photons left having that much energy, essentially no new protons will be created. Ever. (Well, hardly ever. Virtual particles of any kind can be created for very brief periods of time by virtue of the Heisenberg uncertainty principle. Virtual particles can even affect the "real" world, if they happen to interact with something else during their brief existence. A virtual X particle, for example, could possibly cause a proton to decay. But because the X is so massive its lifetime must be extremely short. Therefore, the probability of causing a proton to decay is extremely small, and the expected lifetime of a proton is extremely long.) Likewise, essentially no additional particles of any kind will be created once the universe has cooled beyond the point that essentially all photons have less energy than the rest mass of the particle. This includes neutrons (which have a rest mass of about 939.6 MeV), and that is crucial for the theory of nucleosynthesis, to be discussed later.
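To get a feel for when pair production shuts off for a given particle, one can ask at what temperature a typical photon energy (taken here, as a rough assumption, to be k_B × T) equals the particle's rest mass energy. The sketch below does that for the proton and neutron masses quoted above. The electron mass of 0.511 MeV is a standard value added for comparison, and the function name is hypothetical. Real freeze-out is more gradual, since a thermal distribution always has a high-energy tail, so these are order-of-magnitude figures only.

```python
# Boltzmann's constant in MeV per kelvin (standard value, rounded).
K_B_MEV_PER_K = 8.617333e-11

# Rest mass energies in MeV. Proton and neutron values are from the text;
# the electron value is a standard figure added for comparison.
REST_MASS_MEV = {
    "electron": 0.511,
    "proton": 938.3,
    "neutron": 939.6,
}


def threshold_temperature_k(rest_mass_mev: float) -> float:
    """Temperature at which k_B * T equals the given rest mass energy."""
    return rest_mass_mev / K_B_MEV_PER_K


if __name__ == "__main__":
    for name, mass in REST_MASS_MEV.items():
        print(f"{name:9s} {mass:8.3f} MeV -> {threshold_temperature_k(mass):.2e} K")
```

On this estimate, protons and neutrons stop being created at around 10^{13} K, while electron-positron pairs can still be produced until the temperature falls to several billion kelvin.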
Of course, the same thing applies to heavier particles as well. Quarks are somewhat of a more complicated case, since below a certain energy level quarks apparently cannot exist except in bound states of a quark and an antiquark (mesons) or three quarks (baryons). But all hadrons (either mesons or baryons) other than protons are unstable, and eventually decay spontaneously. Therefore, after they can no longer be created in the pair production process because photons of sufficient energy are not available, they will disappear from the universe. (Unless created in very unusual circumstances such as supernova events or in the vicinity of a black hole.)
This is relevant to one curious fact – the existence of any ordinary (baryonic) matter at all. The term baryogenesis refers to the sequence of events that leads to the residual baryonic matter. As previously noted, all different types of particles should exist in approximately equal numbers, at some sufficiently high temperature. This is simply because, at high enough energies, all particles, including photons, should be able to convert into each other, at least indirectly. For instance, a quark and an antiquark annihilate with each other, producing two photons. So quark plus antiquark → two photons. The two photons, in turn, can give rise (for example) to a muon plus antimuon. So pair production suggests that two photons should exist for each particle-antiparticle pair. (Photons are their own antiparticles, but you can arbitrarily count half of them as antiparticles.)
When the universe becomes sufficiently cool (at a different temperature for each particle type), it is no longer possible to produce any more new particles of each type by pair production. (As mentioned, particle-antiparticle pairs can still be created as virtual particles, but that's a different process, and the particles exist for only very brief instants.) After that point, all particles and antiparticles which can annihilate with suitable partners of the same kind will do so. If the number of particles and antiparticles is exactly the same, then either equal numbers of particles and corresponding antiparticles remain, or else neither remains at all. But many observations indicate that antimatter does not exist at all in significant amounts. Yet ordinary matter clearly does.
This is a problem, so the assumption of exactly equal numbers of particles and corresponding antiparticles must be wrong. Instead, there must have been slightly more quarks and leptons than antiquarks and antileptons. But why? Pair production and annihilation can't produce an imbalance between particles and antiparticles. So the excess must be due to unknown processes other than pair production. And such other processes must exhibit some symmetry breaking, however slight.
Quarks and leptons must have existed from the earliest times when different particle types became distinguishable, after the breaking of GUT symmetry. They would remain in equilibrium with photons until the temperature level of the universe dropped to a certain point. For leptons, what happens is simple. As soon as the universe is cool enough that no new lepton-antilepton pairs of a given type are created by pair production, the reverse reaction proceeds until all antileptons are gone.
With quarks, however, the situation is a little more complicated. The two lightest quarks are called "up" and "down". They have rest masses of about 3 MeV and 6 MeV, respectively. A proton consists of two up quarks and one down quark, while a neutron consists of two down quarks and one up quark. It turns out that the strong force does not allow free quarks to exist below a certain energy level, somewhere around 1000 MeV. (Which happens to be about the rest mass of protons and neutrons.) So, for example, all up and down quarks become incorporated into protons and neutrons at that level. (The substantial excess rest mass of protons and neutrons over that of the constituent quarks is due to "binding energy".) This phenomenon is known as quark confinement.
Thus around the 1 GeV energy level all quarks become bound into either quark-antiquark pairs (mesons) or groups of three quarks (baryons). And therefore, long before the lightest quarks cease to be created by pair production, they become confined. But the same principles apply to the resulting hadrons, and there are simply too few photons with sufficient energy to create many baryons such as protons and neutrons by pair production after the baryons appear as a result of quark confinement.
In any case, all particles and antiparticles of the same type that can pair up annihilate with each other, yielding photons. Only a small unpaired excess of matter particles remains in the form of baryons. From calculations of the amount of deuterium that should be produced by nucleosynthesis, there must have been about one baryon per 1.8 billion photons in order to account for the amount of deuterium that is actually observed.
We are going to discuss the process of nucleosynthesis in much more detail later, because the big bang model successfully predicts certain key observational facts about the relative abundances of several very light nuclei. And this is therefore one of the strongest pieces of evidence in favor of the model. So we'll provide just a brief outline at this point.
As mentioned, stable atomic nuclei (other than hydrogen, a single proton) cannot exist at energies substantially higher than 1 MeV. But even when the temperature falls into this range, the probability that a nucleus will actually form depends on how fast a few critical reactions proceed. And these reaction rates depend on factors such as the temperature, the rate of expansion of the universe, the densities of baryons and photons, and the ratio of existing protons to neutrons.
The situation is further complicated by the fact that neutrons which are not bound into nuclei are unstable, with a half-life of 614 seconds. Once substantially all uncombined neutrons have decayed (each into a proton, an electron, and an antineutrino), nucleosynthesis stops. (In principle, two helium-4 nuclei, for example, could fuse to form beryllium-8, but such reactions are unlikely, mainly due to the electrostatic repulsion between positively charged nuclei.) Consequently, there is a race against the clock. All nuclei heavier than hydrogen that are going to form at this stage of the universe must form within the first half hour or so. Additional nuclei (from beryllium through iron in atomic number) form much later in stellar interiors. And still heavier nuclei can form only in supernova events.
Yet another impediment to nucleosynthesis is that most reactions add only one neutron at a time. And the first step is the hardest, because one proton plus one neutron yields deuterium. Unfortunately, deuterium is only barely stable and has a very small binding energy, about 1 MeV per nucleon. Consequently, deuterium cannot exist in significant amounts until the temperature drops by a factor of 10 from 10^{10} K (1 MeV). At this era, the energy of radiation (photons) governs the expansion of the universe and hence the rate at which temperature decreases. Specifically, temperature is inversely proportional to the square root of time (we'll show later why that is), so a tenfold drop in temperature takes a hundredfold increase in time, and deuterium production can't begin until the universe is at least 100 seconds old. General nucleosynthesis doesn't really get going until more like 300 seconds, when there are no longer enough energetic photons to destroy deuterium, and enough deuterium nuclei have formed to combine with additional neutrons to make helium-3, helium-4, and lithium-7.
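The timing argument can be made concrete with a small calculation. In a radiation-dominated universe the temperature falls as the inverse square root of time, so the time scales as the inverse square of the temperature. The sketch below assumes, as a round reference point not stated explicitly in the text, that the temperature was about 10^{10} K when the universe was about 1 second old. The function name is hypothetical.

```python
# Reference point for the radiation era. The temperature is the 1 MeV
# figure from the text; the age of 1 second is an assumed round number.
T_REF_K = 1e10
AGE_REF_S = 1.0


def age_at_temperature_s(temperature_k: float) -> float:
    """Age of the universe (seconds) at a given temperature, using T ~ t^(-1/2)."""
    return AGE_REF_S * (T_REF_K / temperature_k) ** 2


if __name__ == "__main__":
    # Deuterium survives only after a tenfold drop from 1e10 K.
    print(f"Age at 1e9 K: about {age_at_temperature_s(1e9):.0f} s")
```

A tenfold drop in temperature therefore corresponds to a hundredfold increase in age, which is where the figure of roughly 100 seconds comes from.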
That, in a nutshell, is nucleosynthesis. Since the nuclear physics required to compute reaction rates is well understood, computing the relative abundances of the different nuclei that result from this process is simple in principle. In practice, the computation is rather more complex, since there are a number of interrelated factors involved. It's fairly easy to compute that most of the resulting nuclei are either hydrogen (protons) or helium-4. And further, given the ratio of protons to neutrons (which is about 7 to 1 when the process is well underway), the final result is that helium-4 makes up about 25% of all matter, by weight, and hydrogen makes up most of the rest. Everything else is only a trace. This prediction is verified by observation, which is important evidence in favor of the big bang model.
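The helium figure follows from simple counting, which the sketch below spells out. It assumes that essentially every surviving neutron ends up in a helium-4 nucleus and that protons and neutrons have equal mass, both of which are approximations. The proton-to-neutron ratio of 7 to 1 used in the example is an input assumption, and the function name is made up for illustration.

```python
def helium4_mass_fraction(protons_per_neutron: float) -> float:
    """Mass fraction of helium-4 if every neutron is bound into helium-4.

    Each helium-4 nucleus holds 2 neutrons and 2 protons, so n neutrons
    lock up 2n nucleons out of n + p nucleons in total.
    """
    neutrons = 1.0
    protons = protons_per_neutron
    return 2.0 * neutrons / (neutrons + protons)


if __name__ == "__main__":
    fraction = helium4_mass_fraction(7.0)
    print(f"Helium-4 mass fraction: {fraction:.2f}")
    print(f"Hydrogen mass fraction: {1.0 - fraction:.2f}")
```

With 7 protons per neutron, 2 of every 8 nucleons end up in helium-4, giving a mass fraction of one quarter. The result is quite sensitive to the ratio, which is part of why the measured helium abundance is such a useful check on the model.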
By taking into account how other factors affect other relative abundances, it is possible to infer certain other important facts, such as:
We will explain the differing effects in much more detail in the next major section. But basically what happens is that as the universe expands, the density of energy in the form of radiation (photons) decreases faster than the density of energy in the form of matter. The reason for this is that the density of particles of matter decreases as the cube of the length scale of the universe, since the number of particles remains about the same (after the annihilation of antimatter), while the volume increases as the cube of the linear dimension. The number density of photons also decreases as the cube of the length scale, but in addition the wavelength of every photon is also stretched by the expansion of the universe, and this decreases the energy of each photon by an additional factor of length, so that altogether the energy density of radiation decreases as the fourth power of length.
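The two scaling rules can be put side by side in a few lines of Python. This is a sketch in arbitrary units: the scale factor a is the length scale of the universe relative to some reference epoch, and both densities are set to 1 at a = 1 purely for illustration. The function names are hypothetical.

```python
def matter_density(a: float, rho_matter_ref: float = 1.0) -> float:
    """Matter energy density: fixed particle number in a volume growing as a^3."""
    return rho_matter_ref / a**3


def radiation_density(a: float, rho_radiation_ref: float = 1.0) -> float:
    """Radiation energy density: a^3 dilution plus one more factor of a
    because each photon's wavelength is stretched."""
    return rho_radiation_ref / a**4


if __name__ == "__main__":
    for a in (1.0, 2.0, 10.0):
        ratio = radiation_density(a) / matter_density(a)
        print(f"a = {a:4.1f}  matter {matter_density(a):.4g}  "
              f"radiation {radiation_density(a):.4g}  radiation/matter {ratio:.4g}")
```

Doubling the length scale cuts the matter density by a factor of 8 but the radiation density by a factor of 16, so the ratio of radiation to matter falls in inverse proportion to the length scale. However large the initial advantage of radiation, matter must eventually overtake it.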
There was much more energy density in the form of radiation to start out with. But at approximately 47,000 years after the big bang, at a temperature of about 8000 K, the energy densities of matter and radiation became equal. After that point, the energy density of matter was higher. At that point we say that the universe went from a state of radiation dominance to one of matter dominance. This change did not have any abrupt consequences. All that happened was that the universe began to expand more rapidly. More precisely, the length scale of the universe transitioned smoothly from increasing in proportion to the square root (one-half power) of the time to the two-thirds power of time. The temperature of the universe began to decrease more rapidly at this point, in a similar way.
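The difference between the two growth laws is easy to see numerically. The sketch below compares how much the length scale grows, and how much the temperature falls, when the age of the universe increases by some factor under each power law. It assumes the temperature of the radiation is inversely proportional to the length scale, and the function names are made up for illustration.

```python
RADIATION_EXPONENT = 1.0 / 2.0  # length scale ~ t^(1/2) under radiation dominance
MATTER_EXPONENT = 2.0 / 3.0     # length scale ~ t^(2/3) under matter dominance


def scale_growth(time_ratio: float, exponent: float) -> float:
    """Factor by which the length scale grows when the age grows by time_ratio."""
    return time_ratio ** exponent


def temperature_drop(time_ratio: float, exponent: float) -> float:
    """Factor by which the temperature is multiplied over the same interval,
    assuming temperature is inversely proportional to the length scale."""
    return 1.0 / scale_growth(time_ratio, exponent)


if __name__ == "__main__":
    time_ratio = 8.0
    for label, exponent in (("radiation", RADIATION_EXPONENT),
                            ("matter", MATTER_EXPONENT)):
        print(f"{label:9s}: length x{scale_growth(time_ratio, exponent):.2f}, "
              f"temperature x{temperature_drop(time_ratio, exponent):.3f}")
```

For an eightfold increase in age, the length scale grows by a factor of about 2.8 under the one-half power law but by a factor of 4 under the two-thirds power law.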
This increase in the rate of expansion and temperature decline is important to keep track of for purposes of calculating how length scale and temperature vary. Apart from that, the universe wasn't greatly affected by the change from radiation dominance to matter dominance. Photons and matter continued to interact strongly with each other.
The next important transition, however, has much more important observational consequences. It is what gives rise to the cosmic microwave background which, if our eyes were sensitive enough to microwaves, would give the night sky a very distinctive pattern. This effect is the direct result of a change that occurred in how photons interact with matter when the temperature of the universe was in a specific range.
Although nuclei of a few lightweight elements were formed in the nucleosynthesis process, they were not yet atoms as we know them, since they were fully ionized – electrons still moved about freely instead of being bound to a nucleus. Since both nuclei and electrons carry electric charge, photons (which are carriers of electromagnetic force) interacted strongly with them. As a result, the still extremely hot plasma was opaque to light. No telescope, however powerful, could see anything in the universe that occurred from the beginning to more than 300,000 years after the big bang.
Indeed, the next major event occurred when electrons finally began to bind with nuclei. This is called the recombination era, even though it was in fact the first time that nuclei and electrons combined. The process of recombination began only when the temperature dropped far enough that a "typical" photon no longer had enough energy to break the electrostatic attraction (also known as Coulomb force) between an electron and a nucleus. This process happened gradually, because not all photons had the same energy. They were distributed in a way characteristic of blackbody radiation, where the energy range of photons is spread out on either side of a peak value. The range is especially large on the high end, so there are always a few very energetic photons around even after the "typical" photon has an energy close to the peak of the curve.
The recombination process began around 240,000 years after the big bang, when the temperature had dropped to very low levels, relatively speaking – about 3700 K. By comparison, the surface of the Sun has a temperature of about 6000 K. The peak of the spectrum of a blackbody radiator at 6000 K is at roughly 500 nm (nanometers), near the middle of the visible spectrum. At 3700 K the blackbody spectrum is flatter, with a peak around 780 nm, just past the red end of the visible range in the near infrared. Remember, though, that there would still be many photons with much higher energies, so the probability was still high that an electron bound to a nucleus would interact with a photon.
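The peak wavelengths of a blackbody spectrum can be estimated from Wien's displacement law, which says the peak wavelength is inversely proportional to the temperature. The sketch below uses the standard value of Wien's constant, which is not given in the text, and a hypothetical function name.

```python
# Wien's displacement constant in metre-kelvins (standard value).
WIEN_B_M_K = 2.897771955e-3


def peak_wavelength_nm(temperature_k: float) -> float:
    """Wavelength (in nanometers) at which a blackbody spectrum peaks."""
    return WIEN_B_M_K / temperature_k * 1e9  # metres -> nanometers


if __name__ == "__main__":
    for label, temperature in (("Sun's surface", 6000.0),
                               ("start of recombination", 3700.0)):
        print(f"{label:23s} {temperature:6.0f} K -> "
              f"{peak_wavelength_nm(temperature):.0f} nm")
```

This gives a peak of a little under 500 nm at 6000 K and a little under 800 nm at 3700 K. The peak is only a rough guide to a "typical" photon, since the blackbody curve is broad and has a long tail toward high energies.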
3700 K is the temperature at which about half of the nuclei would have had a full complement of electrons, while the other half were still fully or partially ionized. A photon could still interact with an electron bound to a nucleus without stripping the electron from the nucleus; it might simply raise the energy level of the electron rather than ejecting it from the atom.
The universe continued to expand, so the density of photons and matter kept dropping. The simultaneous decrease of both density and temperature made it increasingly less likely that a photon would interact with matter. The probability of interaction can be expressed in terms of the mean free path of a photon, which is the expected distance that a photon can travel before interacting with matter. When this mean free path became longer than the Hubble length (defined as the speed of light divided by the Hubble parameter at the time – which is the distance light could travel in the approximate age of the universe at the time), the probability of interaction became essentially nil. The time that this occurred is called the time of photon decoupling. That was about 350,000 years after the big bang, when the temperature was about 3000 K.
At the time of photon decoupling, the universe first became transparent, for all practical purposes – light was essentially no longer scattered by matter. This is the origin of cosmic background radiation, also known as the cosmic microwave background (CMB). (The "microwave" here is misleading – that is the part of the spectrum in which the radiation is observed today, over 13 billion years later. At the time of photon decoupling, the radiation was mostly in the near infrared.)
Of course, the probability of a photon interacting with matter is never quite zero. Photons from the CMB are still interacting with matter today, such as the matter in one of our microwave antennas. That is how we are able to "see" the CMB. The fact that we can actually "see" the CMB is one of two important features about it. It is important, because the existence of the CMB, at an equivalent temperature of about 2.725 K, is a very strong piece of evidence supporting the big bang model. The model predicts its existence, so we must see it (as we do) to confirm the prediction. It is also a feature that other proposed cosmological theories, such as the "steady state" theory, have great difficulty accounting for.
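The two temperatures mentioned for the CMB, about 3000 K at decoupling and about 2.725 K today, are enough to estimate how much the universe has stretched since then, because the temperature of the radiation falls in inverse proportion to the length scale. The sketch below also uses Wien's displacement law (with the standard value of its constant, which is not given in the text) to locate the peak of the spectrum at both epochs. Function names are hypothetical.

```python
# Wien's displacement constant in metre-kelvins (standard value).
WIEN_B_M_K = 2.897771955e-3

T_DECOUPLING_K = 3000.0  # temperature at photon decoupling, from the text
T_CMB_TODAY_K = 2.725    # temperature of the CMB today, from the text


def stretch_factor(t_then_k: float, t_now_k: float) -> float:
    """Factor by which lengths (and photon wavelengths) have grown,
    assuming temperature is inversely proportional to the length scale."""
    return t_then_k / t_now_k


def peak_wavelength_m(temperature_k: float) -> float:
    """Wavelength (in metres) at which a blackbody spectrum peaks."""
    return WIEN_B_M_K / temperature_k


if __name__ == "__main__":
    factor = stretch_factor(T_DECOUPLING_K, T_CMB_TODAY_K)
    print(f"Stretch factor since decoupling: about {factor:.0f}")
    print(f"Peak then:  {peak_wavelength_m(T_DECOUPLING_K) * 1e9:.0f} nm")
    print(f"Peak today: {peak_wavelength_m(T_CMB_TODAY_K) * 1e3:.2f} mm")
```

The stretch factor comes out at about 1100, which corresponds to a redshift of roughly that size. The peak moves from just under a micrometer, in the near infrared, to about a millimeter, in the microwave range.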
The other reason that the CMB is so important is that it carries a wealth of information about the universe at the time of decoupling, and much earlier as well. Some of what we have learned from very careful measurements of the CMB includes:
We have an extensive discussion of the CMB elsewhere, so that should be enough about it for now.
The period after photon decoupling is called, simply, the period of the "early universe". The universe was a very boring place then. Light could move freely, but (at first) there was nothing to see – no stars, no galaxies. The period is sometimes also called the "dark age" – partly because the only light source was the CMB, which was actually infrared at the time, hence not all that bright, and partly because we know very little about the period.
It can be calculated that the existing hydrogen and helium gas should have been able to condense into the first generation of stars within 100 million years after the big bang, perhaps even earlier. It can also be calculated that the very first stars would have been rather different from typical stars today. They would have been much more massive. But like very massive stars today, they would have converted matter to energy in thermonuclear reactions quite rapidly, and so they would have been very bright, with very high surface temperatures. The stars would have been so hot that the photons they emitted would be energetic enough to reionize lots of interstellar gas, producing glowing gas clouds such as still exist around very hot stars. Accordingly, this period is known as the period of reionization.
Still later, maybe around 500 million years after the big bang, stars began to cluster together in galaxies. This could not have occurred so early without the presence of vast amounts of nonbaryonic dark matter. It is supposed that there were regions of relatively higher and lower densities of such dark matter. The regions of higher density were massive enough to cause existing (baryonic matter) stars and gas to fall into them – making galaxies. Some of these very first galaxies may actually have been imaged by the Hubble space telescope, at a redshift of 8 or more.
We will discuss elsewhere in more detail the processes that went on in the early universe to produce stars and galaxies, so this is an appropriate time to bring our overview of the first few hundred million years of the universe to a close.
In order to give a more detailed explanation of how the universe evolves under the big bang scenario, we need to look at the general features of any cosmological model. And to do that, we need to say a few words about the use of "models" in science.
A model in science is really quite simple. It typically consists of a set of concepts that either implicitly or explicitly refer to some aspect of the "real world". In addition to the concepts, a model explicitly contains some assumptions about properties of the real world things to which the concepts refer. Generally, these assumptions make some sort of intuitive sense, or may even seem fairly self-evidently true – but not necessarily. A good case in point would be Einstein's special theory of relativity. Two of the key assumptions there are that the speed of light (in a vacuum) is always the same and that nothing, neither physical objects nor information, can travel faster than the speed of light. The first assumption seems reasonable enough, though it certainly requires experimental confirmation. The second assumption is not so intuitively obvious, and it certainly isn't easy to imagine how it could be definitely proven by experiment. Nevertheless, the theory actually works very well: it makes testable predictions, and its predictions have survived all experimental tests – so far.
What's the difference between a "model" and a "theory", then? Nothing, really, other than semantics. The term "theory" carries with it various differing connotations. Sometimes the term is used as in "It's only a theory", implying that the theory has not been rigorously confirmed, and may in fact be somewhat doubtful. However, nothing in science has in fact been absolutely confirmed beyond any possibility of doubt. Science doesn't claim to be able to do that. It claims only, at best, to be able to confirm some theories very, very well, beyond any "reasonable" doubt. Quite a few "theories" are actually in this category, such as Newton's "theory" of gravity (at least as an approximation – some of its details are known to be not 100% accurate), and the special theory of relativity.
With a model, however, one doesn't worry excessively whether all of its assumptions seem intuitively true or can even be confirmed directly. The only thing that really matters is that the model has testable predictions, and that its predictions, when tested, range from "pretty accurate" to "correct up to the limits of measurement". The "theory of quantum electrodynamics", for example, which describes the behavior of light and electric charge, is often cited as one of the most exact theories in science, and its predictions have been confirmed to more than 10 decimal places.
But many times, a scientific model may be far less accurate and definitive. Physicists, for instance, often make various simplifying assumptions about a situation in order to make numerical predictions feasible. From this arise such known impossibilities as a "perfect gas" or a "frictionless plane". Part of the folk humor of the physics community is the story of the physicist who, when asked by a dairy farmer for advice, began by saying, "Consider a spherical cow..."
Often a model will contain many somewhat arbitrary adjustable parameters and arbitrary mathematical equations that relate parts of the model. Typically, the equations can't be solved exactly, and the whole model is programmed into computer code. You run the code, and if the results are a good match with details that can be measured empirically, the model is considered a success. The better the match, and the wider and more general the set of circumstances covered, the more successful the model is considered to be. There are a vast number of such models used in science today, from the "standard model" of particle physics to weather and climate models.
But this doesn't really have anything to do with computers and certainly isn't new in science. It's actually just an instance of what used to be called "inductive reasoning". If you make a series of observations and consequently are able to formulate a set of rules which accurately describe the results, then you have a "theory" or a "model" or whatever you want to call it. For example, between 1609 and 1618 Johannes Kepler enunciated three "laws of planetary motion". These laws accurately described mathematically the motion of planets in the solar system. Kepler was an astronomer (actually, an astrologer) who began as an assistant to Tycho Brahe. He devised his laws inductively, based on the extensive observations made by Tycho and himself. The result was quite a good model of planetary motion. Kepler could not explain or justify why his laws worked. It was good enough that they did. But of course, about 50 years later, Isaac Newton came along and demonstrated mathematically why Kepler's laws worked – provided you accepted Newton's theory of gravity and his more general "laws of motion". Newton's theory is considered to be "deductive", even though his own laws of motion themselves are essentially just plausible assumptions which are verifiable and which, when analyzed mathematically, imply Kepler's empirical laws among their predictions.
Newton's theory of gravity itself was eventually superseded by Einstein's general theory of relativity. And this theory, like Newton's before it, starts by making various sweeping but plausible assumptions. One of these was the special theory of relativity. Others concerned the mathematical form that a theory of gravity should have. But what eventually resulted was a theory which not only generalized Newton's theory, but was also able to make correct predictions where Newton's theory failed (such as the orbit of Mercury), and to make entirely new and surprising predictions, such as the bending of light by matter.
The main assumption that is made in the model is what is known as the cosmological principle. This says that the universe looks, approximately, the same from any vantage point. More specifically, the assumption is, first, that the universe is homogeneous – essentially the same everywhere. And second, the universe is isotropic – it looks essentially the same in any direction. We hedge a little, with words like "approximately" and "essentially", because clearly on a small scale the assumptions are not true. Our part of the universe, near a medium size star in a rather average spiral galaxy, is certainly not like the interior of a black hole or the empty space between galaxies. However, the assumption is that on very large scales the universe has no preferred location and no preferred direction.
The two parts of the assumption are not redundant. The universe could be isotropic when viewed from one location without being homogeneous. For instance, it could be much denser (on a large scale) at some distances than others. And it could be homogeneous without being isotropic if, for example, all galaxies were lined up parallel to each other. But it is true that if the universe is isotropic everywhere, it must in fact be homogeneous. These two conditions amount to the universe having certain types of symmetry: everything about the universe is symmetric (at large scales) under the symmetry operations of spatial translation and rotation.
One consequence of the homogeneity condition is that the big bang event (if such existed, which isn't in fact absolutely required in all versions of the model) didn't happen at one particular point in space. It must have happened "everywhere". And although the universe certainly appears to be expanding away from us in all directions, the same must be true from all other vantage points as well. This is the familiar analogy to spots on an inflating balloon.
Note that the universe is not assumed to have such symmetry with respect to time, even though spacetime is still assumed to be 4-dimensional just as described in special relativity. The symmetries pertain strictly to the 3 spatial dimensions. It is in fact definitely assumed that the universe must have been very different at times sufficiently far back in the past, as well as (probably) far in the future. The so-called "steady state" cosmological model which assumes time symmetry – that the universe has always looked about the same and always will – seems to conflict in a number of ways with what we actually see around us. For instance, there are several compelling reasons to believe that the universe must have been much hotter and denser in the past than it is now.
This brings us to another assumption: that the universe is not spatially static but is in fact expanding, in a certain sense that we will make more precise. As we noted above, this assumption is founded on the observations of galactic redshifts first made by Edwin Hubble, which have since been confirmed in many ways. Yet in some sense, the assumption of an expanding universe remains "just" an assumption. There are still astronomers and cosmologists today who seek alternative explanations for the observed redshifts. If one of these alternatives were correct, it might not be necessary to assume the universe is expanding, as it appears to be. However, if the universe is not in fact expanding, then a variety of other things we can observe, such as the cosmic microwave background (CMB), have no obvious explanation. So we make the assumption of expansion not only because that's the simplest explanation of Hubble's observations, but also because it yields a consistent model that explains many other things too – such as the apparent age of the oldest stars, the chemical composition of the universe, and the CMB.
It's also worth noting that Einstein himself, shortly after developing the general theory of relativity, assumed the universe was not expanding, since before Hubble almost everyone assumed the universe was static. Einstein was in fact much annoyed because his theory strongly implied that the universe shouldn't be static, and ought to be either expanding or contracting. After all, an object thrown into the air from the surface of the Earth doesn't just hang in midair. At first it rises, and then it falls back under the force of gravity. It would be very surprising if the universe, under the force of gravity, didn't behave similarly. Einstein even went so far as to modify his theory to allow for the universe to be static by adding the cosmological constant to it. As it turned out, after Hubble proved the universe wasn't static, it seemed that adding the cosmological constant was a mistake. (Though it has returned, with a vengeance, as a result of fairly recent observations, as we will explain at some length.)
The mathematical way to describe space and curvature is a subject known as Riemannian geometry, after Bernhard Riemann (1826-66), who developed the necessary mathematics about 50 years before Einstein availed himself of it. In modern terminology, the theory is all about objects known as Riemannian manifolds. The key feature of such an object is that it possesses something known as a metric, which provides a way of measuring distances within the manifold, independently of any external geometrical constructs, such as a global Cartesian coordinate system. For example, it is possible to describe distances and all other geometric aspects of the 2-dimensional surface of a sphere (such as the surface of the Earth), without referring to how the surface is embedded in 3-dimensional space. Likewise, the 4-dimensional spacetime that figures in the special and general theories of relativity can be described completely in terms of a metric, and things like curvature can be defined rigorously, even though it is not easy to visualize just how a 4-dimensional space is able to curve.
From our assumption that the universe is homogeneous, we can deduce that the same sort of local coordinate system can be used everywhere. Further, the metric used to define distances must have the same form everywhere. This isn't to say that one single coordinate system applies everywhere, only the same sort of coordinate system.
We can make this more concrete. Imagine any point within the universe, such as the current location of the Earth. You can make this the origin of a spherical coordinate system, involving radial distance and two angular coordinates – essentially radial distance from the center of the Earth, plus latitude and longitude. (This is only a way to describe 3-dimensional space. For the moment, don't worry about the time dimension of spacetime.)
By virtue of the assumed homogeneity of the universe, it simply does not matter where we place the center of the coordinate system. It could be here on Earth, or it could be somewhere a few billion light-years away. Any description of the universe should be essentially the same no matter where we place the center.
There is one special type of coordinate system which is particularly useful. It is called a comoving coordinate system. Still assuming a spherical type of coordinate system as well, given some arbitrary point as a center, we can identify any other point in the universe by a 3-tuple of numbers: (x, θ, φ). The quantity x is still the distance from the center to the other given point, and θ, φ are the two angular coordinates. The thing about a comoving coordinate system is that, with both the center and the other point fixed, the 3-tuple of coordinates never changes as a result of deformations (such as expansion and contraction) of space itself. You could imprint the coordinate system onto space, and the numbers would never change. Of course, an actual particle momentarily located at a particular point can move around within space, but the points through which the particle moves keep the same coordinates no matter what space does. That is, x, θ, and φ are constants.
However, we are still assuming that space actually is expanding (or maybe contracting). Therefore, the "real" distance r (sometimes called proper distance) from the origin to the selected point will change as space expands or contracts. r = r(t) is actually a function of time alone. By virtue of our assumption that the universe is isotropic (on a large scale), the "real" angles θ and φ of the selected point never change. This means that we assume the universe isn't expanding at different rates in different directions that we might look. (r(t), θ, φ) is the time-dependent vector that defines the real location of the selected point as time passes. It follows from our assumptions of homogeneity and isotropy that r(t) = a(t)x for some function a(t) that doesn't depend on the origin of the coordinate system, x, θ, or φ – only on time.
Note that a(t) is a ratio of distances, and so it is a dimensionless quantity. However, we are free to pick the time at which the proper distance is measured. It's only natural to pick this to be the present time, which means that right now real distance = comoving distance, and so a(t_{now}) = 1. This fixes a(t) completely (up to uncertainty of measurement).
a(t) is the most important mathematical quantity in the big bang model. It is called the scale factor. a(t) describes how space is expanding or contracting everywhere (by virtue of homogeneity) solely as a function of time, t, measured since the instant of the big bang. For some values of t, space may be expanding (a(t) is increasing) and at other values space may be contracting (a(t) is decreasing). If we knew a(t), then we'd know, in principle, exactly what space is doing at any point in time. Other interesting quantities one might care about, such as the rate of expansion (i. e., the Hubble "constant"), the density of matter and energy, etc., can be related to a(t).
A simple way to think of the scale factor a(t) is that it is a kind of ruler one may use to measure distances in the universe. This ruler has the useful property that, regardless of when you measure the distance between two particular distantly-separated objects, like galaxies, the distance never changes. (The objects should be far enough apart that actual relative motions are small enough to be negligible in comparison to the expansion of space.) So this ruler measures comoving distance. But it is a peculiar sort of ruler, which itself expands (or contracts) just as space does.
Since the scale factor measures the expansion of space, it will affect the wavelength of all photons. Therefore it must be related to redshift z, which was defined so that z + 1 is the ratio of a photon's observed wavelength to its wavelength at the time of emission. This ratio is the same as the ratio between the present scale factor, which is 1, and the scale factor when the photon was emitted. Therefore z + 1 = 1/a(t). So a is about 1/z for large z (i. e., early in the history of the universe).
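As a small worked illustration, the relation between redshift and scale factor can be sketched in a few lines of Python. The function names are our own, chosen only for this example, and the wavelengths are those of the sodium line used as an example earlier.

```python
def redshift(observed_nm, emitted_nm):
    """Return z, defined so that z + 1 = observed / emitted wavelength."""
    return observed_nm / emitted_nm - 1.0


def scale_factor_at_emission(z):
    """Return a(t) at emission, using z + 1 = 1/a(t) with a(now) = 1."""
    return 1.0 / (1.0 + z)


# Sodium line emitted at 588.9 nm and observed at 1177.8 nm.
z = redshift(1177.8, 588.9)
a_then = scale_factor_at_emission(z)
print(f"z = {z:.3f}, a at emission = {a_then:.3f}")
```

A redshift of 1 therefore corresponds to light emitted when the scale factor was one half of its present value, i. e. when distances between far-apart galaxies were half what they are now.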
There is another, equivalent, way to conceptualize a(t). It is based on the mathematical formulation of general relativity. General relativity can be summarized in a single equation:
R^{μ}_{ν} - g^{μ}_{ν}R/2 = (8πG/c^{4})T^{μ}_{ν}
This is an equation involving things called tensors, which are just fancy higher-dimensional vectors. R^{μ}_{ν} is called the Ricci tensor. It describes the curvature of spacetime. (μ and ν are indices taking integer values from 0 to 3.) g^{μ}_{ν} is the metric tensor; it consists of numbers that specify the metric used to measure distance. R is the Ricci scalar, an averaged value of curvature. G is Newton's gravitational constant, c is the speed of light, and T^{μ}_{ν} is the energy-momentum tensor that describes some given distribution of matter and energy, in terms of things like momentum, density, and pressure. The tensor notation simply makes the equation more compact. In reality, this tensor equation is the same as a system of up to 10 separate but related equations.
From this, assuming a particular metric tensor, if a distribution of matter and energy is specified, the curvature of space can (in principle) be calculated. Conversely, given complete information about curvature, the distribution of matter and energy can be calculated. One way to use the Einstein equation is to assume a metric tensor of a specific form and some distribution of matter and energy, and from these derive simpler differential equations that can be solved, at least approximately. So, as promised, the equation relates the curvature of space to the distribution of matter and energy. (Note that energy as well as matter affects how space curves.)
The first step in applying the equation is to put some constraints on the form of the metric tensor. For purposes of cosmology, we want to have a metric which corresponds to a spacetime that is spatially homogeneous and isotropic. It can be shown that the metric must have the form:
ds^{2} = -c^{2}dt^{2} + a(t)^{2} [dr^{2}/(1-kr^{2}) + r^{2} (dθ^{2} + sin^{2}θ dφ^{2})]
Here, r, θ and φ are comoving coordinates of the kind already mentioned (in this formula r plays the role of the comoving radial coordinate, which we called x above). dr, dθ, and dφ are infinitesimal changes in coordinate values, in the sense of calculus. ds is the resulting infinitesimal change in distance. c is the speed of light, and k is a constant that reflects curvature. k may be positive, negative, or zero, but it must be constant if the universe is to be homogeneous. That is, the curvature must be the same everywhere (on a large scale). The only other free parameter in this metric is the function a(t), which is none other than the scale factor. This metric is known as the Robertson-Walker metric, after Howard Robertson and Arthur Walker, who derived it in the 1930s.
Usually this metric isn't worked with directly in cosmology. Instead, one uses the metric and the Einstein equation to derive simpler differential equations, most importantly one called the Friedmann equation (after Alexander Friedmann), which involves quantities of more direct cosmological relevance. The Friedmann equation is rigorously derivable from Einstein's equation and the Robertson-Walker metric, but in fact it can be derived, with a little hand-waving, from Newton's original theory of gravity. (Unfortunately, we have to assume you've been exposed to the mathematical form of some of Newton's laws involving energy, as well as a tiny bit of calculus in order to fully understand this part.)
Our plan for deriving the equation is to consider a particle of matter, a "test particle", having mass m and coordinates (r, θ, φ) in our chosen coordinate system. The test particle may be at an arbitrarily large distance r from the origin of the coordinate system, even billions of light years. Therefore, a sphere centered at the origin will contain a certain amount of matter. Because we are assuming that space is homogeneous, we don't need to specify where each individual lump of matter is, only that all matter is distributed uniformly with average density of ρ (mass per unit volume, expressed in units compatible with those used for r and m). It is perfectly true that this idealized situation is not exactly correct in the real universe, but – like such things as "frictionless planes" – it is close enough that we actually get a useful result that is a good approximation for our purposes when we make the simplifying assumptions.
Now, a basic theorem of Newton's theory of gravitation is that the motion of the test particle depends only on the matter located inside a sphere about the origin whose radius r is the same as the particle's distance from the origin, so that the particle lies on the surface of that sphere. In other words, all matter farther from the origin than the test particle is irrelevant to how the particle moves (as long as all that matter is distributed uniformly). Moreover, the particle's motion is the same as what it would be if the entire mass inside the sphere were located at the origin. This reduces the whole problem to a situation involving only two particles.
Let M be the mass of the matter inside the sphere, so that M = 4πρr^{3}/3, since the volume of the sphere is 4πr^{3}/3. Newton's law then says that the force acting on the test particle is
F = GMm/r^{2} = 4πGρrm/3
where G is Newton's constant.
Most of the time when one wants to derive equations for the motion of a particle, it is done by appealing to the principle of energy conservation – the total energy of the system is constant in time. The total energy is made up of two parts: kinetic energy and potential energy – energy due to motion and energy due to gravitational force. The mass M at the origin is assumed not to move (in the chosen coordinate system), so it has no kinetic energy. Since the test particle moves as though only one other particle is involved, all its motion is in the radial direction, and its velocity is dr/dt = r′, so its kinetic energy is mr′^{2}/2.
The potential energy is defined in terms of the two masses together, and is given by
V = -GMm/r = -4πGρr^{2}m/3
Conservation of energy means that the total energy
E = mr′^{2}/2 - 4πGρr^{2}m/3
is constant as a function of time.
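To see this conservation law at work, here is a minimal numerical sketch of the two-particle problem. It assumes units in which GM = 1, radial motion only, and a simple velocity-Verlet integrator; the function names, starting values, and step count are our own illustrative choices and are not part of the derivation.

```python
def acceleration(r, gm=1.0):
    """Radial acceleration of the test particle: r'' = -GM / r^2."""
    return -gm / r**2


def energy_per_unit_mass(r, v, gm=1.0):
    """E/m = r'^2 / 2 - GM / r (kinetic plus potential energy)."""
    return 0.5 * v**2 - gm / r


def evolve(r, v, duration, steps=5000, gm=1.0):
    """Advance (r, v) over the given duration with velocity-Verlet steps."""
    dt = duration / steps
    acc = acceleration(r, gm)
    for _ in range(steps):
        v_half = v + 0.5 * dt * acc
        r = r + dt * v_half
        acc = acceleration(r, gm)
        v = v_half + 0.5 * dt * acc
    return r, v


r0, v0 = 1.0, 1.0                      # assumed starting distance and speed
r1, v1 = evolve(r0, v0, duration=0.5)
print(energy_per_unit_mass(r0, v0), energy_per_unit_mass(r1, v1))
```

The particle moves outward and slows down, but the two printed energies agree to within the small error of the integrator, which is what "E is constant as a function of time" means in practice.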
We can now switch to a comoving coordinate system where x (which is related to r(t) by r(t) = a(t)x) is the unvarying distance of the test particle from the origin. Taking derivatives, r′ = a′x (since x is constant in time). Substituting this into the above gives
E = m(a′x)^{2}/2 - 4πGρ(ax)^{2}m/3
Dividing both sides of the equation by a^{2} and rearranging gives
(a′/a)^{2}mx^{2}/2 = 4πGρx^{2}m/3 + E/a^{2}
and so
(a′/a)^{2} = 8πGρ/3 + (2E/(mx^{2}))/a^{2}
We have written the equation in this form to isolate the quantity (a′/a)^{2} for reasons that will be clear momentarily. But first, note that a = a(t) is a function of t that doesn't depend on the value of x (or m), so the same is true of (a′/a)^{2}. The first term on the right side of the equation is obviously independent of m and x as well. Therefore, 2E/(mx^{2}) is also independent of m and x. Hence E (the total energy) is proportional to mx^{2}. Hence there must be a constant k such that -2E/c^{2} = k(mx^{2}), so k = -2E/(mc^{2}x^{2}). We threw in c^{2}, the square of the speed of light, so that the mass-energy mc^{2} of the test particle appears in the equation for k, showing that k has units of (length)^{-2}. Although E depends on m and x, it is independent of time – since energy is conserved. Hence k is independent of time as well. k turns out to be a constant related to the overall curvature of space. Using k we can also rewrite the equation for (a′/a)^{2} as
(a′/a)^{2} = 8πGρ/3 - kc^{2}/a^{2}
This is the Friedmann equation, which is the most important equation in cosmology. It is named after Alexander Friedmann, who first derived it in 1922. Friedmann was actually a meteorologist who had the audacity to teach himself general relativity. Einstein was, at first, not especially impressed with Friedmann's work because, as we will see, it implies the universe can't be static, and must instead either expand or contract. This was before Hubble proved the universe presently is (apparently) expanding.
All the terms in the equation have dimensions of (time)^{-2}. In particular, a′/a has dimensions (time)^{-1}. But what is it? Well, if you go back to the equation r(t) = a(t)x, then r′ = a′x = (a′/a)r. That should ring a bell. It says that the rate that two points in space are becoming separated is proportional to the separation between the points. Hubble's law! And the "constant" of proportionality (which is actually a time-dependent quantity) is simply Hubble's parameter, i. e. H(t) = a′(t)/a(t).
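The claim that r′/r is the same for every point can be checked numerically. The sketch below assumes, purely for illustration, a scale factor a(t) = t^{2/3} (any smooth positive function would serve), and estimates r′ with a finite difference; the function names are our own.

```python
def scale_factor(t):
    """Illustrative assumption only: a(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)


def proper_distance(x, t):
    """r(t) = a(t) x for a point with comoving coordinate x."""
    return scale_factor(t) * x


def recession_rate(x, t, dt=1e-6):
    """Central finite-difference estimate of r'(t)."""
    return (proper_distance(x, t + dt) - proper_distance(x, t - dt)) / (2 * dt)


def hubble_ratio(x, t):
    """r'/r, which should equal a'/a = H(t) whatever x is."""
    return recession_rate(x, t) / proper_distance(x, t)


for x in (1.0, 10.0, 50.0):
    print(x, hubble_ratio(x, t=2.0))
```

Each comoving distance x gives the same ratio at a given time, so recession speed is proportional to distance with a "constant" that depends only on t.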
If we were to derive the Friedmann equation in full generality from Einstein's equation of general relativity, we would have to take into account energy as well as matter. In particular, it is clear that the universe at present is filled with a great deal of energy as well as matter, in the form of the cosmic microwave background photons. Indeed, from the relation E = mc^{2} we know that matter itself is just a form of energy, though there's no way to account for this in Newton's physics. However, just as we can replace m with E/c^{2}, we can also replace matter density, ρ, with a more general quantity involving "energy density", denoted by ε. Specifically, we should replace ρ with ε/c^{2}, to get the equation
(a′/a)^{2} = (8πG/3)ε/c^{2} - kc^{2}/a^{2}
In this way, we will be able to deal with matter and energy – and possibly other things – on an equal footing, as just different forms of energy. (Actually, we will mostly use ρ instead of ε, to denote mass-energy density in general.)
To see how the energy density changes as space expands, we can treat the contents of the universe as a gas filling an expanding volume and apply the first law of thermodynamics:
dE + PdV = TdS
The symbols here are used differently from above. E is the internal energy of the gas, P is pressure, V is volume, T is temperature, and S is a quantity called entropy, which is a measure of the "disorder" present in the gas. TdS is also equal to the flow of heat into or out of the given volume. If there is no net heat flow, TdS = 0, and that is the case in the process here. (Such a process is said to be adiabatic.) Therefore we have the simpler case dE + PdV = 0, or equivalently in terms of time derivatives, E′ + PV′ = 0.
These derivatives are easy to calculate. Consider a sphere with radius 1 in comoving coordinates. Its volume is V = 4πa^{3}/3, and so V′ = 4πa^{2}a′. The mass-energy inside the sphere is E = εV = 4πεa^{3}/3. The density ε is (like a) a function of time, so the derivative E′ = (4π/3)(a^{3}ε′ + 3a^{2}a′ε). Plugging into E′ + PV′ = 0 yields 4π(a^{3}ε′/3 + a^{2}a′ε + Pa^{2}a′) = 0. Dividing through by 4πa^{3} gives ε′/3 + (a′/a)ε + P(a′/a) = 0, hence
ε′ + 3(ε + P)(a′/a) = 0
This is known as the fluid equation. In terms of mass density ρ, since ε = ρc^{2} and ε′ = ρ′c^{2}, we get
ρ′ + 3(ρ + P/c^{2})(a′/a) = 0
This equation is completely independent of the Friedmann equation. But it comes in handy when we take derivatives in the Friedmann equation in order to find the second derivative of a, i. e. the acceleration of expansion, a″.
Differentiating the Friedmann equation (in the form written with ρ) with respect to time gives
2(a′/a)[(aa″-a′^{2})/a^{2}] = (8πG/3)ρ′ + 2kc^{2}a′/a^{3}
We can eliminate ρ′ by using the fluid equation. Upon simplifying, we have
a″/a - (a′/a)^{2} = -4πG(ρ+P/c^{2}) + kc^{2}/a^{2}
Finally, using (a′/a)^{2} from the Friedmann equation gives
a″/a = -(4πG/3)(ρ + 3P/c^{2})
Or, in terms of ε,
a″/a = -(4πG/(3c^{2}))(ε + 3P)
This is called the acceleration equation.
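The last step of this derivation is easy to get wrong by a sign, so it may help to check it with numbers. The sketch below plugs arbitrary sample values into the intermediate result and into the final acceleration equation and compares them. The numerical values of ρ, P, k, and a are assumptions chosen only to exercise the algebra, and the function names are ours.

```python
import math

G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C_LIGHT = 2.998e8    # speed of light, m/s


def hubble_squared(rho, k, a):
    """Right side of the Friedmann equation, i.e. (a'/a)^2."""
    return 8 * math.pi * G * rho / 3 - k * C_LIGHT**2 / a**2


def acceleration_via_steps(rho, pressure, k, a):
    """a''/a found by adding (a'/a)^2 to the intermediate expression."""
    intermediate = (-4 * math.pi * G * (rho + pressure / C_LIGHT**2)
                    + k * C_LIGHT**2 / a**2)
    return intermediate + hubble_squared(rho, k, a)


def acceleration_equation(rho, pressure):
    """a''/a from the final acceleration equation."""
    return -(4 * math.pi * G / 3) * (rho + 3 * pressure / C_LIGHT**2)


# Arbitrary sample values (assumptions for illustration only).
rho, pressure, k, a = 1e-26, 1e-10, 1e-53, 0.7
print(acceleration_via_steps(rho, pressure, k, a))
print(acceleration_equation(rho, pressure))
```

The two printed numbers agree, and the curvature constant k drops out entirely, which is why k does not appear in the acceleration equation.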
In order to talk about the evolution of the universe, we have to consider various scenarios about what the universe is made of. Generically, it is "energy" that the universe is made of. However, different forms of energy behave differently in an expanding universe.
Matter is certainly one form of energy. But some care is required in how matter is defined. It turns out that we need to define "matter" as largely what we normally think of it as, namely a collection of particles such as baryons and leptons, provided they are not moving at or near the speed of light. In contrast, photons are never "matter", as they always move at the speed of light. Likewise, neutrinos, which are leptons, are not considered to be matter, even though they may have a small rest mass, since their velocity is always near the speed of light. We define massless particles and particles such as neutrinos which move at relativistic speeds (i. e. near the speed of light) to be radiation instead of matter. Baryons and leptons other than neutrinos are usually considered to be matter. However, in the very early universe when all particles had extremely high energy and so moved near the speed of light, baryons and leptons must be considered to be radiation also, since their kinetic energy was far in excess of their rest-mass energy.
One key distinction between matter and radiation is that matter does not contribute in a significant way to pressure, because its kinetic energy is relatively low. However, all forms of radiation, including photons, neutrinos, and relativistic particles do contribute to pressure, since their energy is mostly kinetic.
Another form of energy we will consider a little later is "dark energy", also known as the "cosmological constant". But to begin with, we will only consider models that involve matter, radiation, or both. Whenever we make some assumptions about the presence or absence of matter and radiation in the universe, or about their relative importance, we are specifying part of one particular model of the universe. This will affect the pressure parameter, P, in our equations. Another type of assumption we might make is about the parameter k, at least as to the three possibilities of whether it is positive, negative, or zero.
We haven't talked much about curvature before. If k is zero, we say that space is flat. Geometrically, that means the axioms of Euclidean geometry apply. For example, the three interior angles of a plane triangle on a flat 2-dimensional surface add up to exactly 180 degrees. If k is positive, space is said to have positive curvature. In the 2-dimensional case, the interior angles of a triangle add up to more than 180 degrees. Space with a positive curvature (in two dimensions) is like the surface of a sphere. On the other hand, if k is negative, space has a negative curvature, and the angles of a triangle add up to less than 180 degrees. The standard example is a saddle shape. The spacetime we live in is 4-dimensional, and curvature is nearly impossible to visualize or characterise in terms of triangles, but the distinctions between flat, positively curved, and negatively curved are still valid and important. A choice of positive, negative, or zero curvature is another key feature that applies to a particular model.
Let's first consider a universe that consists (mainly) of matter. In that case, we assume P = 0. The pressure is (essentially) zero, because constituent particles have low average velocity and seldom interact with each other. In this case, the fluid equation says ρ′ + 3ρ(a′/a) = 0. We want to know how the functions a(t) and ρ(t) behave, since that tells us how the universe is expanding (or contracting) and how density evolves with time.
Note that d(ρa^{3})/dt = ρ′a^{3} + 3ρa^{2}a′. But this is just the fluid equation multiplied by a^{3}, so the indicated derivative is 0, and ρa^{3} must be a constant, i. e. ρ = K/a^{3}. In this case, the density of matter declines like 1/a^{3} with increasing scale factor, which is exactly what we would expect, since the amount of matter is constant, but volume grows like a^{3}. Another way to symbolize this without mentioning a specific constant is ρ ∝ 1/a^{3}.
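The same conclusion can be reached by brute force. Since ρ′ = (dρ/da)a′, the pressureless fluid equation can be rewritten as dρ/da = -3ρ/a, which no longer refers to time at all. The sketch below integrates this with a standard fourth-order Runge-Kutta step; the starting values, step count, and function names are our own illustrative assumptions.

```python
def drho_da(a, rho):
    """Fluid equation with P = 0, rewritten using rho' = (d rho/da) a'."""
    return -3.0 * rho / a


def density_after_expansion(rho_start, a_start, a_end, steps=1000):
    """Integrate d rho/da from a_start to a_end with classical RK4."""
    h = (a_end - a_start) / steps
    a, rho = a_start, rho_start
    for _ in range(steps):
        k1 = drho_da(a, rho)
        k2 = drho_da(a + h / 2, rho + h * k1 / 2)
        k3 = drho_da(a + h / 2, rho + h * k2 / 2)
        k4 = drho_da(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho


# Doubling the scale factor should divide the density by 2^3 = 8.
print(density_after_expansion(1.0, 1.0, 2.0))
```

Doubling the scale factor leaves one eighth of the original density, to within the accuracy of the integrator, in line with ρ ∝ 1/a^{3}.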
The same thing is true of ε – it really doesn't matter what form of density we consider. What's important is how pressure P behaves for energy in various forms. This is described by an equation called the equation of state. When we have only matter with no radiation or dark energy, the equation of state is simply P = 0.
Notice that it doesn't matter here what k is, since k doesn't appear in the fluid equation. So ρ ∝ 1/a^{3} for any value of k in the "matter only" case. We can similarly ignore k when considering the acceleration a″ in the acceleration equation. When P = 0, this becomes a″/a = -4πGρ/3, so a″ = -4πGρa/3. It follows that a″ is always negative, so the universe is always decelerating when it consists only of (non-relativistic) matter, no matter what k is. However, since ρ ∝ a^{-3}, we have a″ ∝ 1/a^{2}. Thus the deceleration becomes very small in magnitude when a is large. But so far, we don't know how a(t) itself behaves – it might be decreasing for large values of t, even though it is increasing at the present time.
In order to investigate how a(t) varies with t we have to use the Friedmann equation, and so (for simplicity) we first assume the universe is flat, k = 0. Then the Friedmann equation says (a′/a)^{2} = Cρ for some constant C. (C has nothing to do with the speed of light c, and we will use C as a "generic" constant, which might be different each time it is mentioned, yet does not vary with time.) We know ρ ∝ a^{-3} for any k, so a′^{2} = C/a, hence a′ = Ca^{-1/2}. A solution to this differential equation is given by a = Ct^{2/3}, as is easily verified since (d/dt)(t^{2/3}) = Ct^{-1/3} = Ca^{-1/2}. In other words, we have a ∝ t^{2/3}, provided k = 0. Thus a(t) is a monotonically increasing function of time. So the universe in this model is always expanding, but from the discussion in the previous paragraph, the rate of expansion is decreasing to zero as t → ∞.
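For readers who prefer to see the solution emerge numerically, here is a sketch that integrates a′ = a^{-1/2} directly (taking the generic constant to be 1, an assumption made only for convenience) and compares the growth with the t^{2/3} law. With that choice the exact solution through a(0) = 0 is a = (3t/2)^{2/3}. The function names and step count are ours.

```python
def da_dt(a):
    """Flat, matter-only Friedmann equation with the constant set to 1."""
    return a ** -0.5


def evolve_scale_factor(a_start, duration, steps=20000):
    """Advance a(t) over the given duration with classical RK4 steps."""
    h = duration / steps
    a = a_start
    for _ in range(steps):
        k1 = da_dt(a)
        k2 = da_dt(a + h * k1 / 2)
        k3 = da_dt(a + h * k2 / 2)
        k4 = da_dt(a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return a


def exact_scale_factor(t):
    """Closed-form solution a = (3t/2)^(2/3)."""
    return (1.5 * t) ** (2.0 / 3.0)


a_at_1 = exact_scale_factor(1.0)                # start away from the singular point t = 0
a_at_8 = evolve_scale_factor(a_at_1, duration=7.0)
print(a_at_8 / a_at_1, 8.0 ** (2.0 / 3.0))      # both should be close to 4
```

Increasing t by a factor of 8 increases a by a factor of 8^{2/3} = 4, as the power law requires.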
We can also find the Hubble parameter H(t) = a′/a in this model with k = 0. We have H = a′/a = (t^{2/3})′/t^{2/3} = (2/3)t^{-1/3}/t^{2/3} = 2/(3t). Thus H is always positive (the universe is expanding), but decreases steadily to 0, as expected. It is somewhat remarkable how much we can say about the evolution of the universe from our equations, even though the model is rather simplified. For reasons which will eventually come out, the model actually isn't all that drastically simplified from the way we now believe the universe really is. Experimental observations show that the geometry of the universe really is nearly flat. And it is reasonable to expect that matter eventually becomes dominant over radiation as a(t) increases. Therefore, provided only that the universe is always expanding, our flat matter-dominated model is not a bad fit for the real universe. (However, this conclusion needs to be altered when we take into account the "cosmological constant".)
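Turning H = 2/(3t) around gives t = 2/(3H), so in this simplified model a measured Hubble parameter translates directly into an age. The sketch below does the unit conversion. The value of 70 km/s per megaparsec is an assumed round figure used only for illustration, it does not come from the discussion above, and the function names are our own.

```python
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16   # seconds in 10^9 Julian years


def hubble_per_second(h_km_s_mpc):
    """Convert H from km/s/Mpc to 1/s."""
    return h_km_s_mpc / KM_PER_MPC


def age_flat_matter_gyr(h_km_s_mpc):
    """Age t = 2/(3H) in the flat, matter-only model, in billions of years."""
    return 2.0 / (3.0 * hubble_per_second(h_km_s_mpc)) / SECONDS_PER_GYR


H0_ASSUMED = 70.0  # km/s/Mpc, an illustrative assumption
print(f"{age_flat_matter_gyr(H0_ASSUMED):.1f} billion years")
```

With the assumed value this comes to roughly 9.3 billion years. The figure applies only to the flat, matter-only model, and, as noted, it changes once the cosmological constant is included.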
In view of the simplicity of, and observational evidence for, a flat universe, it is worthwhile to investigate the conditions when this might be expected. The Friedmann equation can be written as H^{2} = 8πGρ/3 - kc^{2}/a^{2}. We can solve this for k to get k = (a/c)^{2}(8πGρ/3 - H^{2}). Hence k = 0 ⇔ H^{2} = 8πGρ/3 ⇔ ρ = 3H^{2}/(8πG).
So there is a magic value of mass-energy density at which k = 0 and the universe is flat, and for this it is immaterial whether the universe consists of matter or radiation or both (since the Friedmann equation doesn't involve pressure). We call this density the critical density ρ_{c} = 3H^{2}/8πG. Note that since H = H(t) varies with time, so does ρ_{c}. We further define another quantity, called the density parameter Ω(t) as the ratio ρ(t)/ρ_{c}(t). As indicated, Ω also varies with time – in general.
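To get a feeling for the size of the critical density, here is a short calculation. The Hubble parameter of 70 km/s per megaparsec is an assumed round value used only for illustration, and the function names are our own.

```python
import math

G = 6.674e-11            # Newton's constant, m^3 kg^-1 s^-2
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec


def critical_density(h_km_s_mpc):
    """rho_c = 3 H^2 / (8 pi G), in kg per cubic metre."""
    hubble = h_km_s_mpc / KM_PER_MPC   # convert to 1/s
    return 3.0 * hubble**2 / (8.0 * math.pi * G)


def density_parameter(rho, h_km_s_mpc):
    """Omega = rho / rho_c for a given mass-energy density rho."""
    return rho / critical_density(h_km_s_mpc)


H0_ASSUMED = 70.0  # km/s/Mpc, an illustrative assumption
rho_c = critical_density(H0_ASSUMED)
print(f"critical density: {rho_c:.2e} kg/m^3")
print(f"Omega at half that density: {density_parameter(rho_c / 2, H0_ASSUMED):.2f}")
```

For the assumed H this is roughly 9 × 10^{-27} kg per cubic meter, the equivalent of only a few hydrogen atoms per cubic meter. Because ρ_{c} scales as H^{2}, it was larger in the past, when H was larger.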
With this notation, the Friedmann equation becomes H^{2} = (8πG/3)(ρ_{c}Ω) - k(c/a)^{2} = H^{2}Ω - k(c/a)^{2}, which implies
Ω - 1 = k(c/(aH))^{2}
This is an equivalent form of the Friedmann equation, unless Ω = 1. But obviously, the universe is flat ⇔ k = 0 ⇔ Ω = 1 ⇔ ρ = ρ_{c}. This is the desired condition for flatness. Remember this definition of Ω, as we will have frequent occasion to refer to it. Although in general Ω varies with time, if it ever equals 1, then k = 0, and since k is constant, Ω also is always 1. Furthermore, if Ω < 1 ever, then the equation above shows k is a negative constant, so Ω < 1 at all times. Likewise, if Ω > 1 ever, Ω(t) > 1 for all t.
We can now discuss the evolution of the universe for k ≠ 0 in the two cases of Ω < 1 and Ω > 1. First suppose Ω < 1. This is equivalent to ρ(t) < ρ_{c}(t) for any time t, and also to a value of the constant k that is negative. That is, Ω < 1 ⇔ ρ(t) < ρ_{c}(t) ⇔ k < 0. All these conditions mean space has negative curvature. Now look at the original Friedmann equation,
H^{2} = 8πGρ/3 - kc^{2}/a^{2}
If k is negative, the right hand side is strictly positive, so H cannot be 0, and the universe must always be expanding: a′ > 0, and a increases without limit. We know that ρ ∝ 1/a^{3}, so for large a, the first term on the right hand side of the equation is negligible compared to the second term, which we can symbolize by
(a′/a)^{2} = H^{2} ∼ -kc^{2}/a^{2}
Therefore, approximately, a′ = const., which implies a ∝ t. That is, if k < 0, the density is less than the critical density, and the universe is negatively curved, then the scale factor a(t) grows linearly with t for large t. Further, H ∝ 1/t.
On the other hand, if k is positive, then Ω > 1, and ρ(t) > ρ_{c}(t) for all t. Since H^{2} is non-negative we must have 8πGρ/3 ≥ kc^{2}/a^{2}. But as before, the term 8πGρ/3 is getting smaller faster than kc^{2}/a^{2}, so eventually H must reach 0, and expansion stops. Of course, the universe doesn't stop dead in its tracks like a ball suspended in mid-air. Instead of expanding, it starts to collapse, H and a′ become negative, and eventually the scale factor a reaches 0 – the so-called "big crunch".
At one time, it was thought to be quite possible that the curvature of the universe could be positive so that this collapse scenario would eventually occur. In some ways, this might be aesthetically appealing, since it might further be possible that the universe would "bounce" when a = 0 and expansion would begin again. As we will explain, many observations seem to rule out this scenario. In particular, all observations suggest that ρ is less than or equal to the critical density, and many observations indicate it is a lot less than the critical value. However, some observations, as well as theory, indicate that the universe should be essentially flat, with k = 0, or extremely close to it.
So evidence is strongly against a positively curved universe. But if k = 0 as it seems, the universe shouldn't be negatively curved either. That would seem to contradict the fact that the density of matter in the universe is much less than the critical density. Clearly, something is amiss. It turns out that there is a very elegant solution to this problem, and we'll explain what it is when we discuss the "cosmological constant". But first, let's summarize what we've deduced so far. Keep in mind that we have been assuming that the universe is "matter dominated", i. e. the effects of radiation pressure are negligible, so we could assume P = 0.
Curvature | k < 0 | k = 0 | k > 0 |
Omega | Ω < 1 | Ω = 1 | Ω > 1 |
Density | ρ < ρ_{c} | ρ = ρ_{c} | ρ > ρ_{c} |
Scale factor as t → ∞ | a(t) ∝ t | a(t) ∝ t^{2/3} | a(t) has maximum |
Expansion rate as t → ∞ | H ∝ 1/t | H = 2/(3t) | H becomes negative |
In theory, matter and radiation blend smoothly into each other, but in practice, in the real universe, most matter moves at very slow speeds compared to the speed of light. Therefore, most of the energy content of matter is its mass-energy: E = mc^{2}, in which kinetic energy, the energy of motion, is negligible. On the other hand, for particles moving at or near the speed of light, kinetic energy dominates. Particles, such as photons, that move at the speed of light have zero mass (for otherwise they would have infinite kinetic energy), but they have non-zero momentum, denoted by p. The energy of such a particle is E = pc. In general, the total energy of a particle with nonzero mass m satisfies E^{2} = (mc^{2})^{2} + (pc)^{2}. For slowly moving particles this reduces to E ≈ mc^{2} + p^{2}/(2m), the rest energy plus the familiar kinetic energy. When the particle is moving so fast (large momentum) that the (pc)^{2} term dominates, so that E ≈ pc, the particle is said to be relativistic, and cosmologists talk of it as being radiation, much like a photon.
Diffuse non-relativistic matter is responsible for only a negligible amount of pressure, since statistically the probability of interaction between such particles is essentially zero. When we experience matter as a gas here on Earth, we are really dealing with an extremely high density compared to the overall density of matter in the universe. In particular, assuming that the current density of matter in the universe is close to the critical density ρ_{c}, it is only about 10^{-26} kilograms per cubic meter. This is why terrestrial gases have pressure – due to their ultra-high density – while matter in the universe has essentially no pressure. But with matter moving at relativistic speeds, i. e. "radiation", there is a pressure P ≠ 0 which must be taken into account.
Recall that the complete fluid equation was
ρ′ + 3(ρ + P/c^{2})(a′/a) = 0
If we can't assume P = 0, then we have to know what the relation is between pressure P and density. It can be shown that the equation of state for radiation is P = ρc^{2}/3, though we won't attempt to justify that here. Consequently, for radiation the fluid equation is
ρ′ + 4ρ(a′/a) = 0
Remarkably, that's the same as the equation for (non-relativistic) matter, only with a 4 instead of a 3. And so, repeating the reasoning used before, we find that ρ ∝ 1/a^{4}. Intuitively, this is what we should have expected. In terms of number density of particles, we expect the density to fall in proportion to volume, which is proportional to a^{3}. But when the particles are photons (or any relativistic particles), the energy density is decreased by an additional factor of a, to account for stretching the wavelength of the particle by the scale factor, because space itself is expanding. (Remember particle energy is proportional to the frequency, which is inversely proportional to the wavelength.)
Assume for simplicity that curvature is zero, so k = 0. We want to understand how space behaves assuming that radiation is dominant rather than matter, i. e. the radiation density is much greater than the matter density. So the overall energy density behaves like ρ ∝ a^{-4} instead of ρ ∝ a^{-3}. Using "generic constants" notation as before, the Friedmann equation says (a′/a)^{2} = Cρ. Since ρ ∝ a^{-4}, we have a′^{2} = C/a^{2}, and so a′ = C/a. Integrating a·a′ = C gives a^{2} = Ct, so the solution is a = Ct^{1/2}, i. e. a ∝ t^{1/2}. Thus the universe still expands monotonically as a function of time, if it is dominated by radiation, just not quite as fast as if it is dominated by matter (matter density much greater than radiation density) – the exponent of t is 1/2 instead of 2/3. We can likewise compute H for the radiation-dominated case. We have H = a′/a = (t^{1/2})′/t^{1/2} = (1/2)t^{-1/2}/t^{1/2} = 1/(2t).
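The matter and radiation cases follow one pattern: in a flat universe whose density falls as 1/a^{n}, the Friedmann equation gives a′ ∝ a^{1-n/2}, hence a ∝ t^{2/n} and H = (2/n)/t. The sketch below encodes just that bookkeeping with exact fractions; the function names are hypothetical, and the general-n formula is an extrapolation of the two cases worked in the text.

```python
from fractions import Fraction


def expansion_exponent(n):
    """For a flat universe with rho proportional to a^(-n), return p such that a is proportional to t^p."""
    return Fraction(2, n)


def hubble_times_t(n):
    """Return the constant H*t; since a is proportional to t^p, H = a'/a = p/t."""
    return expansion_exponent(n)


if __name__ == "__main__":
    for label, n in (("matter", 3), ("radiation", 4)):
        p = expansion_exponent(n)
        print(f"{label:9s}: rho ~ a^-{n},  a ~ t^({p}),  H = ({hubble_times_t(n)})/t")
```

With n = 3 this reproduces the exponent 2/3 for matter, and with n = 4 the exponent 1/2 for radiation.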
Now for the interesting case, where the universe is a mixture of matter and radiation. We will need to know how densities of matter and radiation evolve as functions of time. To keep them straight, let ρ_{m} be the (energy) density of matter, and ρ_{r} be the density of radiation. In the matter-dominated case, i. e. ρ_{m} ≫ ρ_{r}, we have ρ_{m} ∝ 1/a^{3}, and a ∝ t^{2/3} hence ρ_{m} ∝ 1/t^{2}. In the radiation-dominated case, ρ_{r} ≫ ρ_{m}, the exponents are different: ρ_{r} ∝ 1/a^{4}, and a ∝ t^{1/2}, yet still ρ_{r} ∝ 1/t^{2}.
So suppose first that the universe starts off dominated by radiation. This is a reasonable assumption, since the universe is initially extremely hot, and most particles should be relativistic. Thus we have a ∝ t^{1/2} and ρ_{m} ∝ 1/a^{3}, so ρ_{m} ∝ 1/t^{3/2}. But ρ_{r} ∝ 1/t^{2}. So the density of matter is decreasing less rapidly as a function of time than is the density of radiation. Therefore, energy density due to matter will eventually overtake energy density due to radiation, and the universe will become matter dominated.
Once the universe is matter dominated, then a ∝ t^{2/3} and ρ_{r} ∝ 1/a^{4}, so ρ_{r} ∝ 1/t^{8/3}. The matter density always stays ahead of radiation density, because ρ_{m} ∝ 1/t^{2}. Actually, in both cases, we could predict that matter density would win the race over radiation density simply because ρ_{m} ∝ 1/a^{3}, while ρ_{r} ∝ 1/a^{4}, and in both cases a(t) increases monotonically with t.
The picture that emerges from all this detailed consideration of the equations is as follows. The universe has been expanding since the time of the big bang, as measured by growth of the scale factor a(t). However, due to the gravitational effect of both matter and radiation, the rate of expansion (in terms of either a′ or H = a′/a) has been decreasing. This is reflected in the fact that the acceleration equation says a″/a is negative. Further, all this is true regardless of whether the curvature of space is negative, zero, or positive.
Initially, the energy density of radiation was much larger than that of matter. However, at some point, since radiation density decreases faster than matter density, the latter overtook the former, the universe became "matter dominated", and it has remained so and will remain so. We might ask at what point the transition from radiation dominance to matter dominance occurred. The answer is that it depends on both Ω (or curvature) and on the present value of the Hubble parameter H. Given the best current observational evidence that Ω = 1 and H is about 70 km/s/Mpc, the time works out to 47,000 years (1.5×10^{12} seconds), and the scale factor was 2.8×10^{-4}. Since redshift z and scale factor are related by z = 1/a - 1, this corresponds to a redshift of about 3570.
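The arithmetic in this paragraph is easy to reproduce. The sketch below converts the quoted scale factor into a redshift and also uses ρ_{m}/ρ_{r} ∝ a to show how far matter has pulled ahead since equality. It assumes the normalization a = 1 today, and the function names are illustrative.

```python
A_EQUALITY = 2.8e-4  # scale factor at matter-radiation equality, as quoted in the text


def redshift(a):
    """Redshift corresponding to scale factor a (with a = 1 today): z = 1/a - 1."""
    return 1.0 / a - 1.0


def matter_to_radiation_ratio(a, a_eq=A_EQUALITY):
    """rho_m / rho_r at scale factor a; the ratio grows in proportion to a and equals 1 at a_eq."""
    return a / a_eq


if __name__ == "__main__":
    print(f"redshift at equality: {redshift(A_EQUALITY):.0f}")
    print(f"rho_m / rho_r today:  {matter_to_radiation_ratio(1.0):.0f}")
```

Under these assumptions, the matter density today exceeds the radiation density by a factor of a few thousand.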
We can also work out the approximate temperature at the time matter density exceeded radiation density. Up to the time that this occurred, expansion of the universe was dominated by radiation. The energy of a photon is inversely proportional to its wavelength, which is proportional to a, so if energy is expressed in terms of temperature, we have T ∝ 1/a. But during the time of radiation dominance, a ∝ t^{1/2}, so T ∝ t^{-1/2}, i. e. T = Ct^{-1/2} for some constant C. It turns out C is about 10^{10} if t is measured in seconds and T in kelvins:
T ≈ 10^{10} × t^{-1/2}
Therefore, with t ≈ 1.5×10^{12} seconds, the temperature was about 8000 K (to the nearest thousand) when matter became dominant over radiation.
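For readers who want to plug in numbers, here is a minimal sketch of the radiation-era relation T ≈ 10^{10} × t^{-1/2}. The coefficient is the rough value given above, so the results are order-of-magnitude estimates only; the function names are assumptions made for illustration.

```python
import math

COEFFICIENT_K_SQRT_S = 1.0e10  # rough constant C from the text (kelvin times sqrt(second))


def temperature_kelvin(t_seconds):
    """Approximate temperature during radiation dominance: T = C / sqrt(t)."""
    return COEFFICIENT_K_SQRT_S / math.sqrt(t_seconds)


def time_seconds(temperature_k):
    """Inverse relation: time at which the temperature has fallen to the given value."""
    return (COEFFICIENT_K_SQRT_S / temperature_k) ** 2


if __name__ == "__main__":
    print(f"T at t = 1 s:        {temperature_kelvin(1.0):.2e} K")
    print(f"T at t = 1.5e12 s:   {temperature_kelvin(1.5e12):.0f} K")
```

The second line lands a little above 8000 K, consistent with the rounded figure in the text.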
Nothing terribly dramatic happened in the universe at the time when matter became dominant. The main difference was that expansion sped up slightly. In the simplest case of a flat universe, a became proportional to t^{2/3} instead of t^{1/2}. The transition was gradual, since around that time, when the densities of matter and radiation were about the same, the actual fluid equation in effect was not as simple as assumed above. Apart from the expansion rate, matter and photons did not start behaving any differently when the transition occurred.
For the sake of completeness, here's a summary of some differences between a radiation-dominated and a matter-dominated universe, assuming (as seems realistic) that Ω = 1.
Quantity | Radiation-dominated | Matter-dominated |
Scale factor | a ∝ t^{1/2} | a ∝ t^{2/3} |
Matter density | ρ_{m} ∝ 1/a^{3}, ρ_{m} ∝ 1/t^{3/2} | ρ_{m} ∝ 1/a^{3}, ρ_{m} ∝ 1/t^{2} |
Temperature vs. scale factor | T ∝ 1/a | T ∝ 1/a |
Temperature vs. time | T ∝ 1/t^{1/2} | T ∝ 1/t^{2/3} |
Hubble parameter | H = 1/(2t) | H = 2/(3t) |
Note that during the entire time of radiation dominance, and beyond, matter and radiation were in thermal equilibrium. That is, photons and (charged) matter particles (electrons and nuclei of hydrogen and helium) interacted readily with each other, so their average energy per particle was the same. It was only somewhat later that photons and matter decoupled – the period of "recombination", when free electrons bound to nuclei to make neutral atoms. This occurred around 300,000 years after the big bang. In particular, the era of nucleosynthesis, which we will discuss later, took place during the time of radiation dominance.
After photons and matter decoupled, the temperature of photons continued to decline as 1/a, while the scale factor ceased to have a direct effect on the temperature of matter.
The problem is that the evidence strongly indicates that the universe is flat: Ω = 1. Since Ω is directly, or indirectly, an important parameter in the Friedmann equation, many calculations depend on it. When these calculations are carried out in order to come up with testable predictions, it turns out that the predictions (sometimes separately, or sometimes in combination) are consistent with measurements only if Ω = 1. There is also a good theoretical reason to expect Ω = 1.
Let's define the density ρ_{m} as the present energy density of all matter and radiation whose existence we can either measure directly or infer indirectly by various means. If ρ_{c} is the critical density as described above, let Ω_{m} = ρ_{m}/ρ_{c}, and set Ω_{Λ} = Ω - Ω_{m}.
ρ_{m} takes into account matter that is visible (as stars, galaxies, and gas clouds) as well as other matter that is definitely not visible, but whose existence is necessary to account for phenomena such as the rotation rates of galaxies, the velocities of galaxies in large galaxy clusters, the large-scale distribution of galaxies, and "gravitational lensing". This will be explained in much more detail under the heading of dark matter. While such observations may not account for all matter, in general they never result in an estimate of Ω_{m} > .3.
A completely different type of observational data comes from measurements of the cosmic microwave background (CMB) radiation.
The observational and theoretical evidence includes:
(a′/a)^{2} = 8πGρ/3 - kc^{2}/a^{2} + Λ/3
This will make it possible to define a "density" ρ_{Λ} such that 1 - Ω_{m} = Ω_{Λ} = ρ_{Λ}/ρ_{c}.
For the moment, just assume this is a legitimate thing to consider. The fluid equation
ρ′ + 3(ρ + P/c^{2})(a′/a) = 0
won't change if Λ is constant, since any energy associated with Λ is constant and doesn't change with time. Taking time derivatives in the new Friedmann equation yields, when combined with the fluid equation, a new acceleration equation:
a″/a = -(4πG/3)(ρ + 3P/c^{2}) + Λ/3
We notice right away that if Λ > 0 it will tend to add some acceleration to the expansion, while it will add some deceleration if Λ < 0.
Now we can define ρ_{Λ} = Λ/(8πG). Plugging that into the "new" Friedmann equation gives
(a′/a)^{2} = (8πG/3)(ρ + ρ_{Λ}) - kc^{2}/a^{2}
Since Λ can be understood as a form of energy, it is associated with a pressure P_{Λ}. It will satisfy a fluid equation by itself:
ρ_{Λ}′ + 3(ρ_{Λ} + P_{Λ}/c^{2})(a′/a) = 0
But since Λ is constant, ρ_{Λ}′ = 0, so we must have this "equation of state" for P_{Λ}:
P_{Λ} = -c^{2}ρ_{Λ} = -c^{2}Λ/(8πG)
If Λ < 0, then ρ_{Λ} < 0, which makes it a rather odd form of energy density – negative energy. But if Λ > 0, P_{Λ} < 0 – negative pressure. That isn't actually so weird. It just means that positive Λ is something that makes the universe expand, as opposed to ordinary (positive) energy pressure, which makes the universe contract under the force of gravity. So Λ is in a sense a form of "anti-gravity". ρ_{Λ} can be thought of as the energy density of empty space, vacuum energy, or (as it is often called) dark energy.
We'll save a more detailed discussion of the nature of Λ for pages on the dark energy and the cosmological constant. The point here is that Λ can be introduced into the Friedmann equation as we've described. The benefit of that is it gives us an additional parameter for a model universe. We can regard observational data, such as the distribution of galaxies, galaxy redshifts, and CMB measurements as data which help us determine empirically the value of the quantities Λ, ρ_{Λ}, and Ω_{Λ}. Since Ω_{Λ} = 1 - Ω_{m}, we see Ω_{Λ} ≈ 0.7 at the present time. But Λ = (8πG)ρ_{Λ} = (8πG)(ρ_{c}Ω_{Λ}) = (8πG)(3H^{2}/(8πG))Ω_{Λ} = 3H^{2}Ω_{Λ}. This allows us to compute Λ from observations of H and Ω_{Λ}. (And we could equivalently have used this as the definition of Ω_{Λ}, i. e. Ω_{Λ} = Λ/(3H^{2}), instead of starting from our definition of ρ_{Λ}.)
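The relation Λ = 3H^{2}Ω_{Λ} can be turned into numbers directly. The following sketch uses H = 70 km/s/Mpc and Ω_{Λ} ≈ 0.7 from the text, in the units implied by the Friedmann equation as written here (Λ in 1/s^{2}). The constants G and the megaparsec length are standard values supplied as assumptions, and the function names are illustrative.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2 (assumed standard value)
MPC_IN_M = 3.0857e22   # metres per megaparsec (assumed standard value)


def hubble_si(h_km_s_mpc):
    """Convert a Hubble parameter from km/s/Mpc to 1/s."""
    return h_km_s_mpc * 1000.0 / MPC_IN_M


def cosmological_constant(h_km_s_mpc, omega_lambda):
    """Lambda = 3 H^2 Omega_Lambda, in 1/s^2."""
    return 3.0 * hubble_si(h_km_s_mpc) ** 2 * omega_lambda


def lambda_density(lam):
    """rho_Lambda = Lambda / (8 pi G), in kg per cubic metre."""
    return lam / (8.0 * math.pi * G)


if __name__ == "__main__":
    lam = cosmological_constant(70.0, 0.7)
    print(f"Lambda     = {lam:.3e} 1/s^2")
    print(f"rho_Lambda = {lambda_density(lam):.3e} kg/m^3")
```

By construction, ρ_{Λ} comes out as 0.7 times the critical density for the same H.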
Is all this algebraic legerdemain some sort of cheating to make the model come out right? No, not really. What it amounts to is adding a bookkeeping tool that allows us to keep track of an additional parameter in the model, which can be determined empirically. However, it does mean that the model isn't the last word. We would like to have some sort of theoretical way to determine Λ, at least approximately. Unfortunately, such a derivation is nowhere in evidence at the present time. It is a major open problem. Necessarily it must be bound up with a physical theory which is able to connect general relativity and quantum mechanics. This is because, on one hand, Λ can also be incorporated into Einstein's equation of general relativity (as part of the energy-momentum tensor), and the Friedmann equation with Λ can be derived rigorously from that. But on the other hand, if Λ represents vacuum energy, it (presumably) arises from processes (involving creation and annihilation of "virtual particles") governed by quantum mechanics.
So we'll leave aside for now questions about the "real" meaning of Λ and instead look at its effect on the evolution of the universe.
(a′/a)^{2} = (8πG/3)(ρ + ρ_{Λ}) - kc^{2}/a^{2}
Now, ρ_{Λ} = Λ/(8πG) is a constant. But ρ is the combined density of matter and radiation. As the universe expands, this is dominated by the density of matter, so ρ ∝ 1/t^{2}, hence ρ → 0 as t → ∞.
Observationally, and because of inflation, the evidence is now very good that the universe is flat, so k = 0. Therefore the Friedmann equation, for large values of the time t, is much simpler:
(a′/a)^{2} = 8πGρ_{Λ}/3
The right side of this equation is a constant, which is the square of the Hubble parameter for large values of t. So the constant is positive, which implies ρ_{Λ} and also Λ itself are positive. It is natural to denote the square root of this constant by H_{0}, so that
H_{0} = (8πGρ_{Λ}/3)^{1/2} = (Λ/3)^{1/2}
The Friedmann equation is thus extremely simple:
a′ = H_{0}a
The solution of this is
a(t) = e^{H_{0}(t-t_{0})}
(This includes a "constant of integration" having the value e^{-H_{0}t_{0}}.)
So the final result is that in a flat universe with a (positive) cosmological constant, the scale factor a(t) eventually increases exponentially for large t, when the effect of Λ comes to dominate over both matter and radiation. Exponential expansion, of course, eventually becomes extremely rapid, certainly much more rapid than in the matter dominated case where a(t) ∝ t^{2/3}. That's a direct result of the Hubble parameter becoming constant instead of going to 0 as t → ∞.
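To put a timescale on the exponential phase, the sketch below computes the limiting Hubble parameter (Λ/3)^{1/2} = H·Ω_{Λ}^{1/2} and the time the scale factor then takes to double, ln 2 divided by that rate. It assumes H = 70 km/s/Mpc and Ω_{Λ} = 0.7 as quoted elsewhere in the text; the unit conversions and function names are assumptions made for illustration.

```python
import math

MPC_IN_M = 3.0857e22       # metres per megaparsec (assumed standard value)
SECONDS_PER_GYR = 3.156e16  # seconds in a billion years (approximate)


def limiting_hubble(h_km_s_mpc, omega_lambda):
    """Late-time Hubble parameter sqrt(Lambda/3) = H * sqrt(Omega_Lambda), in 1/s."""
    return h_km_s_mpc * 1000.0 / MPC_IN_M * math.sqrt(omega_lambda)


def doubling_time_gyr(h_km_s_mpc, omega_lambda):
    """Time for a(t) = exp(H t) to double, in billions of years."""
    return math.log(2.0) / limiting_hubble(h_km_s_mpc, omega_lambda) / SECONDS_PER_GYR


if __name__ == "__main__":
    print(f"doubling time: {doubling_time_gyr(70.0, 0.7):.1f} billion years")
```

Under these assumptions the doubling time is somewhat over ten billion years, comparable to the present age of the universe.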
What may be the really surprising thing is that this exponential expansion seems already to have begun. If we use the best observational values of Ω = 1 and Ω_{m} ≈ .3, then solving the appropriate form of the Friedmann equation (which doesn't assume mass density is 0) implies that ρ_{Λ} exceeded the mass density about 10 billion years after the big bang. Given the present age of about 13.7 billion years, that's "only" 3.7 billion years ago, when the Earth already existed.
Up to that point, the expansion of the universe had been slowing down. But for the last 3.7 billion years the expansion seems to have been accelerating. In the far future, assuming Λ doesn't change, distant galaxies will recede from one another ever faster, although gravitationally bound systems are not pulled apart by a constant Λ. If instead the dark energy density were to grow with time, the expansion would eventually become so rapid that the solar system, and eventually even atoms, would be torn apart. This melodramatic scenario has been called the "big rip".
However, since we don't understand where the cosmological constant comes from in the first place, there's no especially good reason to suppose it won't change drastically at some point. We just don't know.
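The "about 10 billion years" figure can be checked with a short numerical integration. For a flat universe containing only matter and Λ, the Friedmann equation gives H(a) = H·(Ω_{m}/a^{3} + Ω_{Λ})^{1/2} with a = 1 today, and the elapsed time is the integral of da/(a·H(a)). The sketch below is a rough check under those assumptions: radiation is neglected, H = 70 km/s/Mpc, Ω_{m} = 0.3, Ω_{Λ} = 0.7, and all names are illustrative.

```python
import math

MPC_IN_M = 3.0857e22        # metres per megaparsec (assumed standard value)
SECONDS_PER_GYR = 3.156e16  # seconds in a billion years (approximate)


def cosmic_time_gyr(a_end, h_km_s_mpc=70.0, omega_m=0.3, omega_lambda=0.7, steps=20000):
    """Time since a = 0 until scale factor a_end, by midpoint-rule integration of da/(a H)."""
    h_today = h_km_s_mpc * 1000.0 / MPC_IN_M
    da = a_end / steps
    total_seconds = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        hubble = h_today * math.sqrt(omega_m / a ** 3 + omega_lambda)
        total_seconds += da / (a * hubble)
    return total_seconds / SECONDS_PER_GYR


def equality_scale_factor(omega_m=0.3, omega_lambda=0.7):
    """Scale factor at which the matter density has fallen to equal rho_Lambda."""
    return (omega_m / omega_lambda) ** (1.0 / 3.0)


if __name__ == "__main__":
    a_eq = equality_scale_factor()
    print(f"Lambda overtakes matter at a = {a_eq:.3f}, t = {cosmic_time_gyr(a_eq):.1f} Gyr")
    print(f"age today (a = 1): {cosmic_time_gyr(1.0):.1f} Gyr")
```

With these inputs the crossover falls a little before 10 billion years and the present age a little under 14 billion years; the exact values shift with the assumed H and Ω_{m}.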
One of the most puzzling questions in astrophysics up to the end of the fourth decade of the 20^{th} century was the mystery of how the Sun and other stars were able to produce their enormous output of energy. After Einstein developed the special theory of relativity, which implied the convertibility of matter into energy according to the equation E = mc^{2}, it became apparent that the Sun's energy must be produced by some sort of matter conversion process, but the specific details remained unknown.
In 1920 Arthur Eddington first suggested that the conversion of hydrogen into helium by a process of nuclear fusion might be the sought-after mechanism of energy production. But several advances in physics were required before this could be explained in any detail. The first major advance was the theory of quantum mechanics, which made it possible to do calculations of physical processes at the very small size scale of atomic nuclei. Quantum mechanics was rapidly fleshed out in the later 1920s. The second important advance was the discovery of neutrons, by James Chadwick in 1932. It was only then that physicists finally began to understand both of the fundamental particles – protons and neutrons, known together as nucleons – which form atomic nuclei.
Around 1928 George Gamow was using quantum mechanics to make theoretical studies of the nuclear force that held atomic nuclei together (even before neutrons were known). During the next 10 years, Gamow and others developed the means of computing the amounts of energy involved in nuclear reactions (between protons and neutrons) and studied sequences of reactions that could liberate energy at the temperatures and densities existing within stars like the Sun. Finally, in 1939, Hans Bethe gave the first convincing description of a series of reactions that could build nuclei of the most common form of helium (which is abbreviated as ^{4}He, where the superscript denotes the total number of nucleons).
When the Manhattan Project was begun several years later, Bethe was appointed to a high-ranking position (Director of the Theoretical Division). It was the job of the many physicists working in that division to make detailed calculations of the dynamics of nuclear reactions, which had to be understood in order to build a workable atomic bomb. As a result of such work, physicists became familiar with the complexities of computing the details of many types of nuclear reactions.
As noted above, Gamow was also interested in cosmology. He was among the originators of the big bang theory. The main line of reasoning was that if the universe had been expanding for a sufficiently long time at anything like the rate it currently is, then at a certain time in the past (several billion years ago), the whole universe must have been enormously hotter and denser than it is now. In fact, the conditions must have been right at some point, in terms of temperature and density, for fusion reactions to occur among protons and neutrons that would build up helium and a few other light nuclei. And therefore it might actually be feasible to compute how much of each possible nuclear species was created.
In the late 1940s Gamow, along with various others, including Ralph Alpher, Robert Herman, Enrico Fermi, and Anthony Turkevich, made the necessary calculations, using several reasonable assumptions about temperature, mass density, and the proportions of neutrons and protons that would have initially been available. The results were spectacular. Already by 1952 when Gamow wrote his book The Creation of the Universe for general readers, he was able to predict that, beginning about 5 minutes after the big bang and proceeding for about half an hour after that, the primary kinds of atomic nuclei that would form should be hydrogen (^{1}H) and helium (^{4}He), plus a little bit of deuterium ("heavy hydrogen", ^{2}H). By weight, he figured, the results would be somewhat more than 50% ^{1}H, somewhat less than 50% ^{4}He, and about 1% ^{2}H.
These numbers were close to the amounts of the various species that could be measured at the time, though now with much more accurate observations, it seems that the proportion of ^{4}He was somewhat overestimated (it's actually about 24% by mass), and the proportion of ^{2}H was greatly overestimated. (It is actually only about 0.01%.) Everything else – including tiny amounts of helium-3 (^{3}He) and lithium-7 (^{7}Li) – is negligible. In spite of not getting the numbers exactly right, Gamow correctly regarded the ability to make predictions of hydrogen and helium abundance which were definitely in the right ballpark as a solid success for the big bang theory.
Gamow also wrote in the book that the temperature of primeval photons in the universe today should be 50 K. This was more than an order of magnitude larger than the correct value (2.725 K) – but not so bad compared to temperatures at the time of nucleosynthesis that were 8 orders of magnitude higher. Gamow actually made two mistakes in coming up with his estimate. He had the right temperature at the time of nucleosynthesis, because that's determined by nuclear physics, as we will explain. But he used an age of the universe of only 3 billion years. And he assumed that temperature decreased as 1/(time)^{1/2}. That's correct when radiation is dominant over matter. But after matter becomes dominant, temperature decreases more rapidly, as 1/(time)^{2/3}. Unfortunately, no one thought to actually look for the primeval photons – before they were discovered by accident – in order to measure the correct temperature.
Let's examine, in more detail, how nucleosynthesis calculations work. We will find that, using the big bang model, we get additional predictions which are both important in themselves and also provide significant evidence as to the correctness of the model.
The first point of interest is the temperature level at which we should expect these reactions to occur. Since there is a close relation between the temperature and the time elapsed since the big bang, this also allows us to identify the time period in which the reactions occur.
Now, "temperature" is defined in terms of the average amount of energy possessed by particles present. In this context, energy is usually measured in millions of electron-volts (MeV), where 1 eV is the amount of energy required to move one electron across a 1 volt potential difference. It turns out that 1 eV is equivalent to about 11,600 K. (This is the inverse of the Boltzmann constant, which has a value of 8.617×10^{-5} eV/K.) In other words, particles at that temperature have a kinetic energy of about 1 eV. We will use either way of expressing an amount of energy, according to what is convenient. As indicated in the table above, at extremely high energies, above 10^{13} K, most matter is in the form of free quarks. There are also photons, which are quanta of light, and leptons (electrons and neutrinos). All of these particles are in thermal equilibrium, which means they interact with each other and therefore carry similar amounts of energy.
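Since the discussion switches freely between kelvins and electron-volts, a small conversion helper may be handy. This sketch simply divides or multiplies by the Boltzmann constant in eV/K; the function names are illustrative assumptions.

```python
BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV per kelvin


def ev_to_kelvin(energy_ev):
    """Temperature (K) whose characteristic thermal energy kT equals the given energy (eV)."""
    return energy_ev / BOLTZMANN_EV_PER_K


def kelvin_to_mev(temperature_k):
    """Characteristic thermal energy kT, in MeV, at the given temperature (K)."""
    return temperature_k * BOLTZMANN_EV_PER_K / 1.0e6


if __name__ == "__main__":
    print(f"1 eV    ~ {ev_to_kelvin(1.0):.0f} K")
    print(f"1 MeV   ~ {ev_to_kelvin(1.0e6):.2e} K")
    print(f"1e10 K  ~ {kelvin_to_mev(1.0e10):.2f} MeV")
```

This shows why 10^{10} K and 1 MeV are treated as interchangeable in what follows: they agree to within about 15%.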
Around 10^{13} K, quarks become bound to each other in the form of hadrons. A hadron, by definition, is any particle made up of quarks. The class of hadrons includes protons and neutrons (made up of three quarks each), as well as unstable particles which are mostly heavier and also made of three quarks, plus mesons, which have only two quarks each. Since all hadrons except for protons and neutrons are unstable and decay in much less than 10^{-5} seconds, by that amount of time after the big bang, the only hadrons left are the (relatively) stable baryons (made up of three quarks each) – protons and neutrons.
At this point, there are also photons, electrons, and neutrinos around. By the time that the temperature drops to 10^{10} K (1 MeV), neutrinos have ceased to interact with the other particles, so they are no longer in thermal equilibrium with the others and play little further role. All the other particles continue to interact via the strong, weak, or electromagnetic force.
Now, the typical amount of binding energy in composite nuclei (such as deuterium, helium, and lithium) ranges from 1 MeV per nucleon (for deuterium) up to somewhat less than 10 MeV per nucleon. What this means is that if nuclei are hit by another particle, such as a photon, that has a similar energy, they are likely to be blasted apart. In other words, at a temperature of more than about 10^{10} K composite nuclei simply cannot exist for very long, and therefore nucleosynthesis is impossible.
At lower temperatures, composite nuclei can theoretically form and avoid destruction, since most photons no longer have enough energy to disrupt them. However, reactions among three or more particles at the same time are extremely unlikely, so for all practical purposes, only interactions between two particles at a time occur. Two protons will almost never interact at this temperature because they both have one unit of positive charge, and the electromagnetic repulsive force between them exceeds the attractive nuclear force. Likewise, the nuclear force even between two neutrons isn't enough to hold them bound together in a "dineutron" for very long. However, the combination of one proton and one neutron – a nucleus of hydrogen-2 (deuterium) – is stable, barely, but the binding energy isn't large. At a temperature of 10^{10} K, deuterium doesn't last very long either.
The binding energy of deuterium is about 1 MeV per nucleon. Even at the corresponding temperature, many photons will have a lot more energy and hence be able to destroy deuterium. Therefore, only when the temperature drops by another factor of 10, about 300 seconds after the big bang, does a deuterium nucleus survive long enough to interact with another proton or neutron to form ^{3}He (helium-3) or ^{3}H (hydrogen-3, also known as tritium). All these nuclei are sufficiently stable at this temperature that nucleosynthesis can proceed in earnest. This delay due to the fragility of deuterium is known as the "deuterium bottleneck".
The energy level at 10^{10} K, about 1 MeV, is an important energy level for the following reason. The masses of protons and neutrons in energy units are 938.3 MeV and 939.6 MeV, respectively. So the difference in mass is only 1.3 MeV – roughly the same as the ambient "temperature". Now, protons and neutrons can convert into each other via the bidirectional reaction:
n + ν ⇄ p + e^{-}
(ν is an electron-type neutrino, which has negligible mass), and a similar reaction involving anti-electrons (positrons) and anti-neutrinos. The mass of an electron is 0.511 MeV, so together a proton and electron have a mass of 938.8 MeV, which is still about 0.8 MeV less than the mass of a neutron. (And incidentally, since 0.5 MeV is the mass-energy of an electron, photons with energy above 1 MeV can spontaneously create a positron-electron pair. But this ceases to be possible right at the temperature of 10^{10} K, about 1 second after the big bang, so right in this time period electron-positron pairs cease to be created spontaneously.) Since mass-energy must be conserved, it has to be made up for in the form of kinetic energy – and at this temperature, the difference is just about the amount of kinetic energy carried by an electron. (The rest mass of an electron plus its kinetic energy then totals about 1 MeV.)
Above this 10^{10} K temperature level the reaction is able to proceed in either direction because electrons (and positrons) have sufficient kinetic energy to make up for the mass difference between protons and neutrons. But at a lower temperature, the reaction can go in only one direction, so neutrons lost in the reaction
n + ν → p + e^{-}
cannot be replaced by the reverse reaction. Essentially, the universe can never have more neutrons than it does at that point.
Since protons are a little lighter than neutrons, statistics favor protons over neutrons, although their numbers were nearly equal up to this point. It is possible to compute what the proton:neutron ratio should be, and it turns out to be about 5:1 at the time the neutron-forming reaction ceases to occur.
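One simple way to see where a figure like 5:1 comes from is the Boltzmann factor, which says that in thermal equilibrium the heavier particle is suppressed by exp(-Δm/kT). The sketch below is an illustration under a stated assumption, not the full calculation: the freeze-out energy of 0.8 MeV is a value assumed here and is not given in the text.

```python
import math

DELTA_M_MEV = 939.6 - 938.3   # neutron-proton mass difference, from the text
KT_FREEZE_OUT_MEV = 0.8       # ASSUMED freeze-out thermal energy, not from the text


def neutron_to_proton_ratio(kt_mev, delta_m_mev=DELTA_M_MEV):
    """Equilibrium n/p ratio from the Boltzmann factor exp(-delta_m / kT)."""
    return math.exp(-delta_m_mev / kt_mev)


n_over_p = neutron_to_proton_ratio(KT_FREEZE_OUT_MEV)
print(f"protons per neutron at freeze-out: {1.0 / n_over_p:.1f}")

# At much higher temperatures the mass difference hardly matters
# and the two species are nearly equally abundant.
print(f"n/p at kT = 100 MeV: {neutron_to_proton_ratio(100.0):.3f}")
```

With these inputs the ratio comes out close to 5 protons per neutron, and it approaches 1:1 at high temperature, in line with the remark that the numbers were nearly equal earlier on.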
Of course, there is no reason based on energy that the reaction in which a neutron decays into a proton and an electron can't continue to occur. In other words, neutrons are not stable. Their half-life is about 614 seconds. The net result is that in the 300 or so seconds until nucleosynthesis can actually begin, a certain additional fraction of neutrons will decay. Once nucleosynthesis begins, neutrons become bound into nuclei of deuterium, tritium, and helium. Neutrons bound in a nucleus are stable. By the time nucleosynthesis is nearly complete, the proton:neutron ratio is about 8:1, and after all neutrons are bound into nuclei there is no further change in the proton:neutron ratio.
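The effect of neutron decay during the waiting period can be estimated with the ordinary half-life law. This is a minimal sketch that uses only the figures quoted in the text (a 5:1 starting ratio, a 614 second half-life, and a 300 second delay) and ignores every other reaction; the function name is illustrative.

```python
NEUTRON_HALF_LIFE_S = 614.0   # from the text
DELAY_S = 300.0               # approximate wait before nucleosynthesis, from the text


def ratio_after_decay(protons, neutrons, elapsed_s, half_life_s=NEUTRON_HALF_LIFE_S):
    """Proton:neutron ratio after free neutrons have decayed for elapsed_s seconds."""
    surviving = neutrons * 0.5 ** (elapsed_s / half_life_s)
    decayed = neutrons - surviving
    # Each decayed neutron becomes a proton.
    return (protons + decayed) / surviving


ratio = ratio_after_decay(5.0, 1.0, DELAY_S)
print(f"proton:neutron ratio after {DELAY_S:.0f} s: {ratio:.1f}:1")
```

This gives a ratio somewhat above 7:1 at the start of nucleosynthesis. Decay continues until the last free neutrons are bound into nuclei, which pushes the final figure toward the 8:1 quoted here.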
After all neutrons have been bound into nuclei, the only further reactions involve protons and composite nuclei. Eventually, the process of nucleosynthesis stops, because of the expansion of the universe. The expansion means that the average distance between particles increases, so there is simply less chance of a collision. Further, collisions between a pair of nuclei containing more than two protons in total tend not to occur, because of the electrostatic force of repulsion. The result is that only a very small number of nuclei with three or four protons (lithium and beryllium) are formed. Finally, since ^{4}He has substantially more binding energy than ^{2}H, ^{3}H, or ^{3}He, it will be almost the only nuclear species left besides ^{1}H when the music stops.
So the end result is that almost all the protons and neutrons end up as either ^{1}H or ^{4}He. The 8:1 ratio of protons to neutrons means that there will be only one ^{4}He nucleus (containing 2 neutrons and 2 protons) for every 14 ^{1}H. The percentage of mass in the form of helium to all (baryonic) mass will therefore be 4/(4+14) = 2/9, or about 22%. This is in fact very close to what is actually observed – about 24% helium. If certain refinements are made in the calculations, the result is even closer to the correct percentage.
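The arithmetic above generalizes to any proton:neutron ratio. Here is a small sketch, assuming (as the text does) that every neutron ends up in ^{4}He and that a proton and a neutron each count as one unit of mass; the function name is illustrative.

```python
from fractions import Fraction


def helium_mass_fraction(protons_per_neutron):
    """Mass fraction of 4He if all neutrons are locked into 4He nuclei."""
    neutrons = Fraction(1)
    protons = Fraction(protons_per_neutron)
    helium_nuclei = neutrons / 2               # each 4He uses 2 neutrons
    free_protons = protons - 2 * helium_nuclei  # each 4He also uses 2 protons
    helium_mass = 4 * helium_nuclei             # 4 mass units per 4He nucleus
    return helium_mass / (helium_mass + free_protons)


y = helium_mass_fraction(8)
print(f"8:1 ratio -> helium mass fraction {y} = {float(y):.1%}")
```

For the 8:1 ratio this reproduces the 2/9 figure exactly. The result is fairly sensitive to the input: a ratio of 7:1, for instance, would give exactly 1/4, which illustrates why modest refinements in the calculation can move the prediction closer to the observed value.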
A reasonable question one might ask is: How do we know what the percentage of helium created by nucleosynthesis actually was? After all, helium is also created in stars. Fusion of hydrogen into helium is in fact the primary source of energy generated within stars. Therefore, stars will have an excess of helium, compared to its primordial abundance. For example, the Sun's atmosphere (by mass) is 28% helium, 70% hydrogen, and 2% everything else.
Very large stars have short lifetimes and end as supernovae. Heavier elements are produced only in supernova events, and so galaxies which give evidence of only small amounts of such heavier elements must be relatively young. (They also are typically very distant, since it is the distant galaxies which are those we see at a young age.) Young galaxies, identified by their scarcity of heavy elements, have had little time to create new helium in their stars. The gas out of which galaxies form is essentially the same as it was at the time of nucleosynthesis, so their proportion of helium is about the same as what was originally produced. Examining the spectra of young galaxies gives us the figure mentioned above (about 24%) as the percentage of helium in all matter resulting from nucleosynthesis.
The close agreement between observation and theoretical predictions of the big bang model is powerful evidence in favor of the model. It means that the model has passed a crucial test. And since there is no other equally simple theory which predicts the percentage of helium, the big bang model remains more or less alone as the most likely model.
We can next ask: Are there any other predictions that can be made from nucleosynthesis calculations, assuming the same conditions as existed in computing the hydrogen:helium ratio? It turns out that there are some very important predictions that can be made. First, let's list the main factors that affect the proportion of different nuclear species that will be produced:
Nucleosynthesis ends when all neutrons have either decayed into protons (^{1}H nuclei) or else been incorporated in more complex stable nuclei – ^{2}H (deuterium), ^{3}He (helium-3), ^{4}He (helium-4), ^{6}Li (lithium-6), or ^{7}Li (lithium-7). (No other nuclei with at most 3 protons are stable, and fusion between nuclei in this list is rare.)
All of the above factors need to be incorporated into calculations in order to obtain precise results. Since some proportions are very small (the fraction of ^{7}Li in the total is only about 1 part in 10^{9}), precision is important. Fairly complicated calculations are required to obtain results with the necessary precision, but they have been done for the whole range of possible values of the parameters listed above.
It turns out that the proportion of helium-4 depends mainly on the ratio of protons to neutrons. This is good in that it makes the calculations easy, and consequently the correct prediction of this proportion was relatively easy to obtain. But at the same time, it is more difficult to reason backward from the observed abundance of ^{4}He in order to make inferences about the values of other parameters which are not as easy to estimate a priori as the proton/neutron ratio.
On the other hand, the proportion of ^{2}H is very sensitively dependent on proton and neutron particle densities. (A higher density makes it more likely that deuterons will either be destroyed, by high-energy photons, or else combine into heavier nuclei, so the higher the density, the fewer deuterons survive.) This means that we can take the proportion of deuterium at the end of nucleosynthesis and reason backward to infer what the densities of protons and neutrons must have been. Now, all deuterium existing in the universe today is essentially primordial and was created at the time of nucleosynthesis, because stars do not make deuterium; they only consume it. It does no good to look for deuterium in stars because essentially there isn't any.
But there are various gaseous aggregates around that have never been part of a star and in which we can measure the proportion of deuterium. Most such measurements that have been made are fairly consistent and imply that the primordial proportion of deuterium is about 1 part in 10,000. Then reasoning backwards from that, we can infer the density of protons and neutrons at the time of nucleosynthesis.
This is an extremely important number. One of the key observational questions in cosmology is to determine how much total mass exists in the universe, because that determines the strength of the gravitational force available to affect the overall shape of the universe. The best way to express this number is as a fraction of the total amount of mass which would be required for the universe as a whole to have a "flat" geometry, so that the density parameter Ω = 1. The result of the calculations is that the percentage of protons and neutrons that exist compared to what would be required is about 4%. As protons and neutrons are essentially the only baryons around since about 10^{-5} sec. after the big bang, they make up what is called baryonic matter, and their fraction of the "critical density" required for a flat universe is denoted by Ω_{b}, which has a value of about .04.
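To give a feeling for what Ω_{b} ≈ .04 means in everyday units, the critical density can be computed from the standard formula ρ_crit = 3H_0²/(8πG). The sketch below is illustrative only: the Hubble constant of 70 km/s/Mpc is an assumed round value that does not come from the text, and the other constants are standard physical values.

```python
import math

G = 6.674e-11                 # gravitational constant, m^3 kg^-1 s^-2
METERS_PER_MPC = 3.0857e22    # meters in one megaparsec
PROTON_MASS_KG = 1.6726e-27
H0_KM_S_MPC = 70.0            # ASSUMED Hubble constant, not from the text
OMEGA_B = 0.04                # baryon fraction of critical density, from the text


def critical_density(h0_km_s_mpc):
    """Critical density in kg/m^3 for a Hubble constant given in km/s/Mpc."""
    h0_si = h0_km_s_mpc * 1000.0 / METERS_PER_MPC   # convert to 1/s
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)


rho_crit = critical_density(H0_KM_S_MPC)
rho_baryon = OMEGA_B * rho_crit
nucleons_per_m3 = rho_baryon / PROTON_MASS_KG

print(f"critical density : {rho_crit:.2e} kg/m^3")
print(f"baryon density   : {rho_baryon:.2e} kg/m^3")
print(f"nucleons per m^3 : {nucleons_per_m3:.2f}")
```

Under these assumptions the present-day average works out to roughly one proton or neutron for every four or five cubic meters of space.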
The abundances of helium-3 and lithium-7 also depend on this same baryon density, though not as sensitively as with deuterium. When measurements of the actual abundances of these nuclear species are made and the necessary baryon density is computed, the results are in satisfying agreement with those from deuterium. Therefore, we have not only additional successful predictions from the theory of big bang nucleosynthesis, we also have a good estimate of Ω_{b}.
Since all visible matter we know of (stars, interstellar gas, planets, etc.) is made of baryonic matter, and there is good independent evidence that the geometry of the universe is flat, the conclusion is that at most 4% of all mass is in the form of visible matter. (Matter that is actually visible in telescopes is probably a lot less, since much of it isn't luminous.) The other 96% has to be something else: "dark matter" and/or "dark energy". (In fact, visible matter seems to be at most 25% of baryonic matter. Everything else is dark.)
Another result has to do with the density of neutrinos in the primordial plasma up to the time of nucleosynthesis. This density also enters into the calculations, and it plays an especially important part in determining the helium-4 abundance (along with the proton/neutron ratio). As it happens, neutrinos cease to interact with other particles and fall out of thermal equilibrium with them right around 1 second after the big bang. But their density up to that time affects both the rate of expansion of the universe and the proton/neutron ratio, both of which in turn affect the helium-4 abundance.
Now, there are known to be at least three different types of neutrinos, corresponding to the known leptons (electrons, muons, and tauons). It is conceivable that additional leptons and their corresponding neutrinos could exist, though none have actually been observed. However, if there were one or more additional types of neutrinos, this would affect the total neutrino density in such a way that the calculated helium-4 abundance would disagree with the observed value. Consequently, nucleosynthesis calculations also provide strong evidence that there are no as-yet unobserved types of leptons and neutrinos. An extremely important aspect of the standard model of particle physics has thus been determined from cosmological theory and observation.
But the list of interesting information that can be derived from nucleosynthesis calculations still isn't done. The density of photons is also important. Photons participate in most particle interactions, such as production of positron-electron pairs and reactions that synthesize heavier nuclei from lighter ones. It is estimated that there must have been about two billion photons present for each nucleon.
Gamow and his collaborators noticed this early on, and it enabled them to predict the existence of a significant amount of background electromagnetic radiation (which consists of photons). They also calculated that due to the expansion of the universe, this radiation would now appear to be red-shifted into the microwave range, corresponding (in their calculations) to a temperature of about 50 K. (This was off from what we now know to be the correct value, about 2.725 K, as discussed above.) This radiation is what we now call the cosmic microwave background, the observation of which plays such an important role in modern cosmology.
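The connection between a temperature and a wavelength range can be made concrete with Wien's displacement law, which gives the wavelength at which thermal radiation is brightest. This is a minimal sketch using the measured temperature quoted above; the Wien constant is a standard value not given in the text.

```python
WIEN_CONSTANT_M_K = 2.898e-3   # Wien displacement constant, in meter-kelvins
CMB_TEMPERATURE_K = 2.725      # from the text


def peak_wavelength_mm(temperature_k):
    """Wavelength (in millimeters) at which a blackbody spectrum peaks."""
    return WIEN_CONSTANT_M_K / temperature_k * 1000.0


peak = peak_wavelength_mm(CMB_TEMPERATURE_K)
print(f"peak wavelength at {CMB_TEMPERATURE_K} K: {peak:.2f} mm")
```

A peak near one millimeter lies in the microwave part of the spectrum, which is why the radiation carries the name it does.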
Unfortunately, although these results were published (even in a popular book of wide circulation), they weren't taken very seriously, and it was more than 15 years before the CMB was actually observed, by accident. But we have to be a little bit understanding, and realize that around 1950 no one (even Gamow) could grasp the extent to which cosmology might be the kind of experimental science it is now recognized to be.
With even more detailed calculations, it is possible to estimate certain other quantities. From the observed abundance of deuterium, the density of baryonic matter in the universe can be estimated. This is consistent with observations, and similar predictions involving observed abundances of other light elements (helium-3 and lithium-7) are also consistent.
The expected density of neutrinos can also be predicted from this kind of calculation. A consequence of such computations is that there can be only three different types of neutrinos, which is exactly what seems to be true.
A number of properties of this cosmic microwave background (CMB) have been measured in balloon-borne and satellite experiments. These include the amplitude of minute variations of temperature that result from pressure/density fluctuations in the cosmic plasma at the time of the decoupling of matter and radiation. They also include the spectrum of angular sizes of such temperature fluctuations. All of these observed values are in accord with theoretical predictions.
A distribution of just this kind is very close to what is predicted by variations in the distribution of matter that correspond to observed density inhomogeneities in the cosmic microwave background and the way the universe would be expected to expand under the influence of gravity.
The main problems were:
The theory of inflation, conceived by Alan Guth in late 1979, is an ingenious enhancement to the big bang theory which easily resolves each of these problems. But until very recently, inflation has had little independent confirmation – that is, predictions about the universe other than the very features which inflation was added in order to explain.
Now, however, ongoing observations keep adding incremental evidence that the inflationary scenario is correct. These observations are based on detailed measurements of the cosmic microwave background. The first sort of evidence is that we can detect no telltale signs of spatial curvature from the CMB, implying that Ω is essentially equal to 1, as inflation requires. The second sort of evidence involves other characteristics of the CMB, such as the distribution of angular sizes of its minute temperature variations and polarization of CMB photons that would indicate inflation had occurred. Since inflation is an enhancement to the big bang model, such evidence for inflation is additional evidence for the big bang itself.
There are in fact many different possible models of inflation. In general, they involve a quantum mechanical "scalar field" that has a particular potential energy profile as a function of field strength. This field is called the inflaton field, but merely giving it a name doesn't explain where the field comes from or what laws govern its behavior. Ideally, of course, this inflaton field could be described in terms of the same unified theory – which doesn't yet exist – that encompasses the known fundamental particles and forces.
If inflation really did occur as supposed, then one can ask why it is that there are any inhomogeneities in the universe at all. Why do stars, galaxies, clusters of galaxies, and clusters of clusters exist, instead of a featureless and very thin gas of photons and atoms?
Here quantum mechanics comes to the rescue. Quantum mechanical fluctuations of energy necessarily occur even in an absolute "vacuum", which turns out not to actually be empty at all. Although these fluctuations occur on a length scale far smaller than the size of a quark, the process of inflation would have magnified them to a macroscopic size – and this is precisely what produces the mottled pattern of minute temperature variations seen in the cosmic microwave background.
The main question, then, is how did the galaxies themselves form, and how did they come to be organized into clusters and superclusters? The answers to these questions depend on: (1) the details of the inhomogeneities of matter which existed at the time matter decoupled from radiation about 350,000 years after the big bang; (2) the relative proportions of ordinary (baryonic) matter, nonbaryonic dark matter, and dark energy which make up the stuff of the universe.
The matter inhomogeneities at the time of decoupling are reflected in the minute temperature variations that can be observed in the CMB. These variations are none other than the quantum fluctuations that were magnified so greatly during the era of inflation. And these inhomogeneities of density are precisely what has led to the distribution of matter in galaxies and galaxy clusters and clusters of clusters that we observe today.
A natural question to ask is what came first – the smallest structures (galaxies), followed by galaxy clusters, followed by superclusters? Or was it the other way around, with the largest structures forming first and then breaking apart into smaller structures? The answer depends on characteristics of the matter we can't see – the nonbaryonic dark matter. If this consists mostly of slow-moving ("non-relativistic") particles – called "cold dark matter" – then smaller structures should have formed first. This is the "bottom-up" scenario. But if dark matter is mostly particles that move at or near the speed of light, such as neutrinos ("hot dark matter"), the opposite is true: the largest structures should have formed first. This is the "top-down" scenario.
Observations of large-scale structure at the present time and extensive computer modeling support the bottom-up scenario and cold dark matter. The indications are that the largest structures, the superclusters, have formed most recently, as galaxy clusters have tended to drift together. This is also supported by what we know about inhomogeneities of matter reflected in the CMB – these are of a size that favors formation of galaxies first.
The earliest stars of course consisted of ordinary baryonic matter, but they formed within regions where the much more abundant nonbaryonic dark matter was slightly more dense than elsewhere. Such over-dense regions pulled early stars and star-forming gas into themselves, and eventually led to the formation of the first galaxies, perhaps 500 million years after the big bang. Even today, galaxies always seem to be embedded in larger (and invisible) roughly spherical globs of dark matter that may have diameters 10 times as large as the visible galaxies. This can be deduced from the way a galaxy's stars rotate about its center, and it was one of the earliest clues for the existence of dark matter.
We are now only just barely able to observe any features in the universe from this time, because the light from these features has been traveling for over 13 billion years, so their apparent distance is over 13 billion light-years. This is at a redshift of about z = 8. As far as we can tell, objects so distant and so early are rather irregular in form, as one would expect. However, at smaller redshifts such as z = 5, about 900 million years after the big bang, objects appear to be surprisingly more like "normal" galaxies. And this presents a problem, because our current theories of galaxy formation make it difficult to understand how "normal" galaxies could have formed so quickly.
So there is very little we really know about this early "dark age" period in the history of the universe, because it is at the limits of our instruments today. The important question is whether our theories of how galaxies form can account for what is actually there.
We can express Ω as the sum of three terms: Ω = Ω_{Λ} + Ω_{nb} + Ω_{b}. Ω_{Λ} is the contribution from dark energy, Ω_{nb} is the contribution from nonbaryonic dark matter, and Ω_{b} is the contribution from baryonic matter. The best current estimates for these terms are: Ω_{Λ} ≈ .7, Ω_{nb} ≈ .26, and Ω_{b} ≈ .04.
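These estimates can be put through a quick consistency check. The sketch below first confirms that the three terms add up to 1, and then uses the standard closed-form expression for the age of a flat universe containing only matter and a cosmological constant. The Hubble constant of 70 km/s/Mpc is an assumed round value, not one taken from the text, and radiation is neglected.

```python
import math

OMEGA_LAMBDA = 0.70    # dark energy, from the text
OMEGA_NB = 0.26        # nonbaryonic dark matter, from the text
OMEGA_B = 0.04         # baryonic matter, from the text
H0_KM_S_MPC = 70.0     # ASSUMED Hubble constant, not from the text

KM_PER_MPC = 3.0857e19
SECONDS_PER_GYR = 3.15576e16


def age_of_flat_universe_gyr(omega_matter, omega_lambda, h0_km_s_mpc):
    """Present age, in Gyr, of a flat universe with only matter and Lambda."""
    hubble_time_gyr = KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR
    factor = 2.0 / (3.0 * math.sqrt(omega_lambda))
    return hubble_time_gyr * factor * math.asinh(math.sqrt(omega_lambda / omega_matter))


omega_total = OMEGA_LAMBDA + OMEGA_NB + OMEGA_B
age = age_of_flat_universe_gyr(OMEGA_NB + OMEGA_B, OMEGA_LAMBDA, H0_KM_S_MPC)

print(f"total Omega: {omega_total:.2f}")
print(f"age of universe: {age:.1f} billion years")
```

With these inputs the age comes out between 13 and 14 billion years, which fits the light-travel times of over 13 billion years mentioned for the most distant observable objects.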
Baryonic matter consists of both luminous matter – such as stars, galaxies, and interstellar gas excited by hot stars – and matter that isn't luminous – such as brown dwarf stars, burned-out very old stars, black holes, and most interstellar gas. The composition of baryonic matter, both that part which is dark and that which is not, is pretty well understood. About 24% of it, by mass, is helium-4, and almost all the rest is ordinary hydrogen.
Although there are a number of ideas as to what the nonbaryonic matter consists of, we don't have any confidence at all in any of them. The leading possibilities are:
There are two reasons we know so little about nonbaryonic dark matter candidates. The first is that they can interact with baryonic matter only through the weak force and gravity. Both of these forces are very weak, especially gravity, so the probability of interactions is very low. The second reason is that some of the best candidates must be very massive, so they cannot be created in existing particle accelerators and (hence) can't be easily studied experimentally. Since they are beyond the range of experiments, we can guess little about their properties, and therefore don't really know what sort of signals to look for.
However, there is hope for identifying some types of dark matter before too long.
But since baryonic matter does exist, not all the quarks were annihilated. So there must have been about a billion plus three quarks for every billion antiquarks. After annihilation there would be just three quarks left, and two billion photons, since two photons are produced in each annihilation. The three quarks subsequently combined into one baryon (a proton or neutron), leaving about 1 baryon per 2 billion photons.
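The counting argument in this paragraph can be written out explicitly. This is just the arithmetic stated in the text, with one simplification made for illustration: every quark-antiquark annihilation is taken to yield exactly two photons, and the leftover quarks are grouped three to a baryon.

```python
def annihilation_tally(antiquarks, excess_quarks=3):
    """Return (photons, baryons) left after all antiquarks have annihilated."""
    quarks = antiquarks + excess_quarks
    annihilations = min(quarks, antiquarks)   # every antiquark finds a partner
    photons = 2 * annihilations               # two photons per annihilation
    leftover_quarks = quarks - annihilations
    baryons = leftover_quarks // 3            # three quarks per baryon
    return photons, baryons


photons, baryons = annihilation_tally(1_000_000_000)
print(f"{photons:,} photons and {baryons} baryon(s)")
```

The tally gives two billion photons for a single baryon, the same photon-to-nucleon ratio that entered the nucleosynthesis calculations.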
The mystery is why that particular ratio of quarks to antiquarks existed. There obviously must have been a violation of charge symmetry (C symmetry). We know this can occur, but we don't know what mechanism operated to produce the actual quark-antiquark ratio. And so we don't know, even roughly, at what time this asymmetry developed, or what (if any) cosmological effect it had at that time.
The supposition is that the cosmological constant is "zero point energy" of the vacuum, but we have very little idea how to calculate this. The calculation depends on details of the unified field theory that describes the strong, electroweak, and gravitational forces – and we don't know what that theory is. The only "natural" energy scale we have is the Planck energy, which is about 10^{19} GeV.
As if that weren't bad enough, if Λ corresponds to the energy density of a scalar quantum field that is responsible for symmetry breaking, there should be some relation between the field potential and the energy density at the time of symmetry breaking. Unfortunately, this field potential in a typical theory (out of many which have been proposed) is proportional to the fourth power of the energy scale where the symmetry breaking occurs. The net result of such very rough considerations is that the predicted value of Λ could be as much as 10^{120} times what is actually observed.
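The fourth-power dependence is what makes the mismatch so enormous, and a rough sketch shows how the exponent arises. The Planck energy is the figure quoted in the text; the energy scale corresponding to the observed dark energy density (about 2×10^{-3} eV) is an assumption introduced here for illustration and does not come from the text.

```python
import math

PLANCK_ENERGY_EV = 1e19 * 1e9    # 10^19 GeV expressed in eV, from the text
OBSERVED_SCALE_EV = 2e-3         # ASSUMED energy scale of observed dark energy


def log10_density_ratio(high_scale_ev, low_scale_ev):
    """log10 of the ratio of energy densities, each taken as (energy scale)^4."""
    return 4.0 * math.log10(high_scale_ev / low_scale_ev)


exponent = log10_density_ratio(PLANCK_ENERGY_EV, OBSERVED_SCALE_EV)
print(f"predicted / observed is about 10^{exponent:.0f}")
```

The exponent lands in the neighborhood of 120. Its exact value shifts by a few units depending on which energy scale is taken as the natural one, which is why such figures are only ever quoted as rough orders of magnitude.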
Clearly, we are nowhere near to having a decent theory that can predict the value of Λ, which means we don't really understand where it comes from. Even if Λ = 0 and some other mechanism accounts for the way that the expansion of the universe is accelerating, we have no way of explaining why the vacuum energy density should be exactly 0.
Other theories have been proposed for the accelerating expansion of the universe, such as a different sort of energy called "quintessence". But we have no evidence for such a thing, so the problem of a cosmological constant, or something much like it, is one of the most significant open questions in physics and cosmology.
That may be difficult. Due to the nature of inflation, most of the characteristics of the universe before inflation began are effectively erased and do not affect the universe as we observe it today. That is why the universe appears to be flat, homogeneous, and lacking in bizarre things like magnetic monopoles.
Nevertheless, there are three or four orders of magnitude on the energy scale between the Planck scale and the beginning of inflation. It is in that region that, somehow, gravity becomes unified with the other forces (or the symmetry between gravity and the others breaks, depending on your point of view). It's hard to suppose that this era has no consequences for "reality as we know it".
There have been many speculative theories proposed to deal with this event. We'll merely list some of the ideas that may be involved.
Copyright © 2005 by Charles Daney, All Rights Reserved