Deriving the Friedmann Equations for Flat Space

William D Clinger

Big bang cosmology is based upon mathematical consequences of Einstein's general theory of relativity. Einstein's theory replaced Newton's law of gravitation by field equations that say the local geometry of spacetime reflects the mass/energy contained within it. Einstein's field equations are nonlinear, hence difficult to solve in closed form, but they do have some exact solutions. The FLRW solutions, named for Friedmann, Lemaître, Robertson, and Walker, were the starting point for big bang cosmology and still serve as its foundation.

My main purpose here is to sketch a derivation of the FLRW solutions for flat space that will allow anyone who can compute partial derivatives to verify for themselves that Einstein's field equations imply the Friedmann equations for flat space.

Readers who work through this exercise will begin to understand Einstein's field equations on the level of symbol-pushing, but I will not try to explain the geometric meaning of the equations. For reasons to be explained later, I will motivate this exercise by using the Friedmann equations to prove these specific facts:

Theorem 1. If a nontrivial FLRW solution describes a spatially flat universe following a big bang, and the cosmological constant is zero, then that universe expands forever.

Theorem 2. In FLRW solutions that describe a spatially flat universe following a big bang, if the cosmological constant is zero, then increasing the pressure components of the stress-energy tensor decreases the rate of expansion.

The derivation that follows will explain how the FLRW solutions provide an objective meaning for the word "expand" even when the universe is both flat and spatially infinite. I will also explain why a spatially infinite universe is compatible with the big bang singularity.


Sources. The derivation that follows is a special case of more general derivations contained within standard textbooks, which are themselves consequences of Einstein's general theory of relativity. In historical footnotes, I will occasionally cite two contemporaries of Einstein as well as Einstein himself.

Footnote.

Albert Einstein. The foundation of the general theory of relativity. This is an English translation of Die Grundlage der allgemeinen Relativitätstheorie, Annalen der Physik, 1916, pages 769–822, which is available online by following links at the SAO/NASA ADS Astronomy Abstract Service. The English translation by W Perrett and G B Jeffery was first published in H A Lorentz et al, The Principle of Relativity, 1923, republished by Dover in 1952, and reprinted in Volume 6 of The Collected Papers of Albert Einstein, edited by Kox, Klein, and Schulmann. It too is currently available online, but the HTML version at Wikisource is more faithful to Einstein's original notation.

Hermann Weyl. Space—Time—Matter. Dover, 1952. This is a republication of an English translation of the fourth edition of Raum Zeit Materie, published in 1921. The first edition was published in 1918.

Sir Arthur Eddington. Space, Time, and Gravitation. First published in 1920, this was a popular account that contained enough equations to make it worth citing.

Charles W Misner, Kip S Thorne, and John Archibald Wheeler. Gravitation. W H Freeman, 1973.

Robert M Wald. General Relativity. University of Chicago Press, 1984.

Steven Weinberg. Cosmology. Oxford University Press, 2008.

Simplifying assumptions. To simplify the derivation, I will assume flat space (which is not the same as flat spacetime), a zero cosmological constant, and a perfect fluid.

Notation. My notation is most similar to Wald's, which is similar to MTW's. Points of spacetime will be labelled by coordinates of the form 〈t, x, y, z〉 with indices 0, 1, 2, and 3: if $x$ is a point of spacetime, then $x^0$ is its t coordinate, $x^1$ is its x coordinate, $x^2$ is its y coordinate, and $x^3$ is its z coordinate. The Lorentz signature is (-,+,+,+) and I use natural units that suppress factors of G and powers of c.

Footnote. Einstein and Weinberg use sign conventions that differ from those used here, and Einstein uses index 4 instead of 0 for the time coordinate.

In journal articles, we're expected to define notations before we use them. If I were to continue to do that, your eyes would glaze over before we ever got to Einstein's field equations. I will therefore state Einstein's equation before I define its notations. Several of those definitions will introduce still more notations, but don't worry: This recursive process is well-founded and will terminate.


Einstein's field equations:

$$R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} + \Lambda\, g_{\mu\nu} = 8\pi T_{\mu\nu}$$

Footnote. This is MTW equation (17.11) in component form. Wald equation (4.3.21) is the same but omits the Λ term. Without that term, the left hand side is what we now call the Einstein tensor. In a footnote for §14, Einstein acknowledges the most general form of the left hand side includes the Λ term. In 1920, Eddington's Note 14 stated an equation that simplifies to the above form of Einstein's field equations (including the Λ term), except Eddington wrote $R_{\mu\nu}$ as $G_{\mu\nu}$ and used a sign convention that requires a minus sign on the right hand side as in Weinberg's equation (B.71).

Cosmological constant. Λ is the cosmological constant, which Einstein later introduced because he wanted his field equations to have a static (unchanging, steady state) solution. Astronomical observations tell us Λ must be pretty close to zero. To simplify Einstein's equation, we'll assume Λ is exactly zero.

Isotropy. The cosmic microwave background (CMB) is nearly isotropic: after foreground sources are taken into account, the CMB looks almost the same in all directions. We can therefore simplify our mathematical model of the universe by assuming space is isotropic.

Homogeneity. We can also simplify our model by assuming space is homogeneous: at any given time, the stress-energy tensor $T_{\mu\nu}$ is the same at every point in space. That assumption holds only at large scales, but the FLRW solutions are intended to model the universe only at large scales. I will return to this assumption after we have proved Theorem 1.

Spatial flatness. Astronomical observations tell us that, once we get far enough away from the gravitational fields of nearby stars and galaxies, space (not spacetime) is very close to being flat. Among other things, spatial flatness means the spatial distance between two points of space can be computed using the Euclidean distance function. When computing the spatial distance between two points of spacetime, the Euclidean distance function will be useful only if the time coordinates of the two points are equal (or nearly so): If space is expanding or contracting, the spatial distance function will depend upon the time. To simplify our calculations, we will assume space is completely flat at every point of time, but we will allow the distance function to change over time.

Pseudometric tensor. $g_{\mu\nu}$ is the pseudometric tensor, which expresses the (pseudo-) distance between any two points of spacetime in much the same way as the Euclidean distance function computes the distance between two points of Euclidean space. $g_{\mu\nu}$ is a tensor field: it may vary from one point of spacetime to another. In particular, $g_{\mu\nu}$ may change as space expands following a big bang or contracts toward a big crunch.

Einstein's general theory of relativity tells us coordinate systems are (mostly) arbitrary and spacetime is locally pseudo-Euclidean: For every point of spacetime, we can devise a coordinate system in which the pseudometric $g_{\mu\nu}$ coincides with the Minkowski pseudometric at that point and varies only smoothly (continuously and differentiably) within a neighborhood surrounding that point. If space (not spacetime) is flat, then we can devise a coordinate system in which the pseudometric $g_{\mu\nu}$ coincides with the Minkowski pseudometric at every point of spacetime that has exactly the same time coordinate as our starting point. To simplify our calculations, we will assume the Minkowski pseudometric holds for every point of spacetime whose time coordinate is today:

$$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2$$

Astronomical observations tell us space is currently expanding. That means the Minkowski pseudometric won't be perfectly accurate tomorrow. To allow for expansion (or contraction) of space, the FLRW solutions introduce a scale factor a(t) whose value is 1 today but may be smaller (or larger) in the past (or future). The square of that scale factor appears within the metric form:

$$ds^2 = -dt^2 + a^2(t)\,(dx^2 + dy^2 + dz^2)$$

For now, you should think of the scale factor as an unknown function whose behavior we hope to understand by applying Einstein's field equations.

For flat space, the nonzero components of the FLRW pseudometric tensor are

$$g_{00} = -1$$
$$g_{11} = g_{22} = g_{33} = a^2(t)$$

That's the covariant form of the pseudometric tensor, in which all indices appear as subscripts. We'll also need the contravariant form, in which all indices appear as superscripts. To compute the contravariant components of the pseudometric tensor, think of its components as a 4x4 matrix and compute the inverse matrix. All of our nonzero components lie on the diagonal, so the nonzero components of the inverse matrix are just the reciprocals:

$$g^{00} = -1$$
$$g^{11} = g^{22} = g^{33} = 1 / a^2(t)$$

Footnote. The discussion above is justified by Einstein's §4 through §8 and the assumptions of isotropy, spatial homogeneity, and spatial flatness. The FLRW metric form is Wald's equation (5.2.5) and Weinberg's equation (1.1.9) (with reverse sign convention). MTW equation (1) of Box 27.1 is the same as Wald's but uses spherical coordinates.

Stress-energy tensor. $T_{\mu\nu}$ is the stress-energy tensor. The modern universe is so empty that we can model it as a perfect fluid for which

$$T^{\mu\nu} = p\, g^{\mu\nu} + (p + \rho)\, u^\mu u^\nu$$

where p is the pressure, ρ is the mass/energy density, and $u^\mu$ = 〈1,0,0,0〉. Knowing $g^{\mu\nu}$, we calculate

$$T^{00} = \rho$$
$$T^{11} = T^{22} = T^{33} = p / a^2$$

Footnote. I have begun to write a instead of a(t) because a, ρ, and p are all functions of time (but independent of spatial coordinates, because we are assuming spatial homogeneity) and keeping all of that explicit would detract from readability. The equation for a perfect fluid is Weinberg's equation (B.43), equivalent to Einstein's equation (58). We need the covariant form of the stress-energy tensor, so we will lower the indices of its components using the technique Einstein described in unnumbered equations near the end of his §8 and used to obtain his equation (58a). From now on, I will use Einstein's summation convention in which an index that appears as both a superscript and as a subscript within a single term implies summation over all four values of the index. (This convention was introduced near the end of Einstein's §5.)

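We need the components of the covariant form of the stress-energy tensor:

$$T_{00} = g_{0\kappa}\, g_{0\lambda}\, T^{\kappa\lambda} = g_{00}\, g_{00}\, T^{00} = \rho$$
$$T_{11} = T_{22} = T_{33} = g_{3\kappa}\, g_{3\lambda}\, T^{\kappa\lambda} = g_{33}\, g_{33}\, T^{33} = a^2 p$$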

Ricci scalar. The Ricci scalar R is defined by contraction of the mixed Ricci tensor:

$$R = R_\mu{}^\mu = R_0{}^0 + R_1{}^1 + R_2{}^2 + R_3{}^3$$

Mixed Ricci tensor. The mixed Ricci tensor is obtained by raising one index of the covariant Ricci tensor:

$$R_\mu{}^\nu = g^{\lambda\nu} R_{\mu\lambda}$$

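Ricci tensor. The covariant Ricci tensor $R_{\mu\nu}$ is usually defined by contraction from the Riemann curvature tensor, but we can collapse a couple of calculations by using this definition instead:

$$R_{\mu\nu} = \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\lambda} - \frac{\partial \Gamma^\lambda_{\mu\lambda}}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu}\, \Gamma^\lambda_{\kappa\lambda} - \Gamma^\kappa_{\mu\lambda}\, \Gamma^\lambda_{\nu\kappa}$$

If all of the terms implied by Einstein's summation convention were written explicitly, the right hand side of that equation would have 40 separate terms: 4 from each of the two derivative terms and 16 from each of the two products.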

Christoffel symbols. The Christoffel symbols that appear within the equation above are defined by

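$$\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}\, g^{\lambda\kappa} \left( \frac{\partial g_{\kappa\mu}}{\partial x^\nu} + \frac{\partial g_{\kappa\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \right)$$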

Footnote. The Christoffel symbols (and related notations) are defined by Einstein's equations (21), (23), and (45), by Eddington's Note 5, by Weyl's equation (56) of §17, by MTW equations (8.24b) and (8.24c), by Wald's equation (3.1.30), and by Weinberg's equation (B.13). Einstein's equation (43) defines the components of the Riemann tensor, from which the Ricci tensor (which Einstein writes as $B_{\mu\nu}$) is defined by contraction in equation (44). Eddington's Note 5 follows Einstein, but Weyl's equation (60) of §17 defines the Ricci tensor as above, as does Weinberg's equation (B.73). Several of those definitions use sign conventions opposite to the one used here.

Finally, we are ready to calculate. Here's what we need to do:

  1. Calculate the 64 partial derivatives of $g_{\mu\nu}$.
  2. Use those partial derivatives and the contravariant $g^{\mu\nu}$ to calculate the 64 Christoffel symbols.
  3. Calculate the 256 partial derivatives of those Christoffel symbols.
  4. Use those partial derivatives and Christoffel symbols to calculate the 16 components of the Ricci tensor.
  5. Compute the 16 components of the mixed Ricci tensor by raising one of its indices.
  6. Calculate the Ricci scalar.
  7. Plug the results of those calculations into Einstein's field equations.
  8. Apply some straightforward calculus and algebra to those equations.

It's not quite as tedious as it sounds. Not quite. Not if you pay attention to the hints. I'll hide the answers so you can have the fun of working them out yourself instead of taking my word for it.

Exercise. Calculate the 64 partial derivatives of $g_{\mu\nu}$.

Hint. All components of $g_{\mu\nu}$ are independent of spatial coordinates, $g_{00}$ is a constant, and all of the nonzero partial derivatives are equal because of spatial symmetry.

Answer.
Only 3 of the 64 partial derivatives are nonzero:

$$\frac{\partial g_{11}}{\partial x^0} = \frac{\partial g_{22}}{\partial x^0} = \frac{\partial g_{33}}{\partial x^0} = 2\, a(t)\, \dot{a}(t)$$

Exercise. Calculate the 64 Christoffel symbols.

Hint. Take advantage of the symmetries. $\Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\nu\mu}$. Once you've computed a Christoffel symbol whose indices involve spatial coordinates, spatial symmetry means every Christoffel symbol whose indices are obtained by applying a permutation of the spatial indices to those in your computed Christoffel symbol will have the same value. Those two symmetries reduce the number of Christoffel symbols whose value you must compute from 64 to 11.

Those 11 are:
$$\Gamma^0_{00},\ \Gamma^0_{01},\ \Gamma^0_{11},\ \Gamma^0_{12},\ \Gamma^1_{00},\ \Gamma^1_{01},\ \Gamma^1_{02},\ \Gamma^1_{11},\ \Gamma^1_{12},\ \Gamma^1_{22},\ \Gamma^1_{23}$$
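The count of 11 can be checked mechanically. The following is a minimal sketch, not part of the original derivation: it treats each Christoffel symbol as an index triple (upper, lower, lower), applies the two symmetries named in the hint, and keeps the lexicographically smallest triple of each equivalence class. The function name `canonical` is made up for this illustration.

```python
from itertools import permutations, product


def canonical(lam, mu, nu):
    """Smallest index triple reachable from (lam, mu, nu) by swapping the
    two lower indices and by relabelling the spatial indices 1, 2, 3."""
    best = None
    for perm in permutations((1, 2, 3)):
        relabel = {0: 0, 1: perm[0], 2: perm[1], 3: perm[2]}
        l, m, n = relabel[lam], relabel[mu], relabel[nu]
        for candidate in ((l, m, n), (l, n, m)):
            if best is None or candidate < best:
                best = candidate
    return best


# One representative per equivalence class of the 64 index triples.
representatives = sorted({canonical(*idx) for idx in product(range(4), repeat=3)})
print(len(representatives))
print(representatives)
```

Each triple (λ, μ, ν) in the printed list corresponds to one of the symbols listed above, with the first entry as the upper index.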

Answer.
Only 9 of the 64 Christoffel symbols are nonzero:
$$\Gamma^0_{11} = \Gamma^0_{22} = \Gamma^0_{33} = a(t)\, \dot{a}(t)$$
$$\Gamma^1_{01} = \Gamma^1_{10} = \Gamma^2_{02} = \Gamma^2_{20} = \Gamma^3_{03} = \Gamma^3_{30} = \dot{a}(t) / a(t)$$
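If you would rather check this answer numerically than by hand, here is a rough sketch that evaluates the defining formula for the Christoffel symbols at a single instant. It assumes the flat FLRW metric and the metric derivatives from the previous exercise; the function name `christoffel` and the sample values of a and ȧ are chosen only for illustration.

```python
def christoffel(a, a_dot):
    """Christoffel symbols gamma[lam][mu][nu] of the flat FLRW metric.

    a and a_dot are the scale factor and its time derivative at one instant.
    """
    dim = 4
    spatial = (1, 2, 3)

    # Contravariant metric: reciprocals of the diagonal covariant components.
    g_inv = [[0.0] * dim for _ in range(dim)]
    g_inv[0][0] = -1.0
    for i in spatial:
        g_inv[i][i] = 1.0 / (a * a)

    # dg[k][m][n] is the partial derivative of g_{km} with respect to x^n.
    # Only the time derivatives of g_11, g_22, g_33 are nonzero (2 a a_dot).
    dg = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    for i in spatial:
        dg[i][i][0] = 2.0 * a * a_dot

    gamma = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    for lam in range(dim):
        for mu in range(dim):
            for nu in range(dim):
                gamma[lam][mu][nu] = 0.5 * sum(
                    g_inv[lam][k] * (dg[k][mu][nu] + dg[k][nu][mu] - dg[mu][nu][k])
                    for k in range(dim)
                )
    return gamma


# Sample values chosen only for illustration: a = 2, a_dot = 0.5.
gamma = christoffel(a=2.0, a_dot=0.5)
nonzero = [(l, m, n) for l in range(4) for m in range(4) for n in range(4)
           if gamma[l][m][n] != 0.0]
print(len(nonzero), gamma[0][1][1], gamma[1][0][1])
```

With these sample values, a ȧ = 1 and ȧ / a = 0.25, which is what the two printed components should show.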

Exercise. Calculate the 256 partial derivatives of the Christoffel symbols.

Hint. 55 of the Christoffel symbols are zero, and the other 9 fall into two equivalence classes. Because of homogeneity, neither of those functions depends on any spatial coordinate, so you need only differentiate with respect to time. Bottom line: You have only 2 partial derivatives to compute.

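Answer.
In the following equations, i ranges over the spatial indices (1, 2, 3) but does not range over 0. I have returned to writing a, ȧ, and ä instead of a(t), ȧ(t), and ä(t).

$$\frac{\partial \Gamma^0_{ii}}{\partial t} = a\, \ddot{a} + \dot{a}^2$$
$$\frac{\partial \Gamma^i_{0i}}{\partial t} = \frac{\partial \Gamma^i_{i0}}{\partial t} = \frac{\ddot{a}}{a} - \left(\frac{\dot{a}}{a}\right)^2$$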

Exercise. Calculate the 16 components of the Ricci tensor.

Hint. If you take advantage of the spatial symmetry, you have only 4 components to compute.

Answer.
In the following equations, i ranges over the spatial indices (1, 2, 3) but does not range over 0.

$$R_{00} = -3\, \ddot{a} / a$$
$$R_{ii} = a\, \ddot{a} + 2\, \dot{a}^2$$
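For readers who want to compare intermediate steps, here is one way to organize the two nonzero cases, using the definition of the Ricci tensor together with the Christoffel symbols and derivatives found above. Terms that vanish are omitted, and the index 1 stands in for any spatial index.

$$
\begin{aligned}
R_{00} &= -\frac{\partial \Gamma^\lambda_{0\lambda}}{\partial t} - \Gamma^\kappa_{0\lambda}\, \Gamma^\lambda_{0\kappa}
 = -3\left(\frac{\ddot{a}}{a} - \left(\frac{\dot{a}}{a}\right)^2\right) - 3\left(\frac{\dot{a}}{a}\right)^2
 = -3\, \frac{\ddot{a}}{a} \\
R_{11} &= \frac{\partial \Gamma^0_{11}}{\partial t} + \Gamma^0_{11}\, \Gamma^\lambda_{0\lambda} - 2\, \Gamma^0_{11}\, \Gamma^1_{10}
 = \left(a\, \ddot{a} + \dot{a}^2\right) + 3\, \dot{a}^2 - 2\, \dot{a}^2
 = a\, \ddot{a} + 2\, \dot{a}^2
\end{aligned}
$$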


SPOILER ALERT
The results of the above calculations will now be used to derive the Friedmann equations without hiding those results.

Before we can calculate the Ricci scalar, we need to calculate the nonzero components of the mixed Ricci tensor:

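$$R_0{}^0 = g^{00} R_{00} = 3\, \frac{\ddot{a}}{a}$$

$$
\begin{aligned}
R_1{}^1 = R_2{}^2 = R_3{}^3 &= g^{33} R_{33} = \frac{1}{a^2} \left(a\, \ddot{a} + 2\, \dot{a}^2\right) \\
&= \frac{\ddot{a}}{a} + 2 \left(\frac{\dot{a}}{a}\right)^2
\end{aligned}
$$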

The Ricci scalar is

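$$
\begin{aligned}
R = R_\kappa{}^\kappa &= R_0{}^0 + R_1{}^1 + R_2{}^2 + R_3{}^3 \\
&= 3\, \frac{\ddot{a}}{a} + 3 \left(\frac{\ddot{a}}{a} + 2 \left(\frac{\dot{a}}{a}\right)^2\right) \\
&= 6 \left(\frac{\ddot{a}}{a} + \left(\frac{\dot{a}}{a}\right)^2\right)
\end{aligned}
$$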

The Friedmann equations. To obtain the Friedmann equations, we simply plug the above results into Einstein's field equations and simplify:

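$$
\begin{aligned}
R_{00} - \tfrac{1}{2} R\, g_{00} &= 8\pi T_{00} \\
-3\, \frac{\ddot{a}}{a} - \tfrac{1}{2} \cdot 6 \left(\frac{\ddot{a}}{a} + \left(\frac{\dot{a}}{a}\right)^2\right) (-1) &= 8\pi\rho \\
3 \left(\frac{\dot{a}}{a}\right)^2 &= 8\pi\rho
\end{aligned}
$$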

Footnote. That's the first Friedmann equation for flat space, Wald equation (5.2.11).

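$$
\begin{aligned}
R_{11} - \tfrac{1}{2} R\, g_{11} &= 8\pi T_{11} \\
a\, \ddot{a} + 2\, \dot{a}^2 - \tfrac{1}{2} \cdot 6 \left(\frac{\ddot{a}}{a} + \left(\frac{\dot{a}}{a}\right)^2\right) a^2 &= 8\pi a^2 p \\
-2\, \frac{\ddot{a}}{a} - \left(\frac{\dot{a}}{a}\right)^2 &= 8\pi p
\end{aligned}
$$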

We can isolate the second derivative ä from the first derivative ȧ by adding three times the second equation to the first:

$$3\, \frac{\ddot{a}}{a} = -4\pi\, (\rho + 3p)$$

Footnote. That's the second Friedmann equation for flat space, Wald equation (5.2.13).
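Written out, the step of adding three times the pressure equation to the first Friedmann equation looks like this (a sketch of the algebra only; nothing new is assumed):

$$
\begin{aligned}
3 \left(\frac{\dot{a}}{a}\right)^2 + 3 \left(-2\, \frac{\ddot{a}}{a} - \left(\frac{\dot{a}}{a}\right)^2\right) &= 8\pi\rho + 3 \cdot 8\pi p \\
-6\, \frac{\ddot{a}}{a} &= 8\pi\, (\rho + 3p) \\
3\, \frac{\ddot{a}}{a} &= -4\pi\, (\rho + 3p)
\end{aligned}
$$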

Multiplying the first Friedmann equation by a2 yields

$$3\, \dot{a}^2 = 8\pi\rho\, a^2$$

Differentiating:

$$6\, \dot{a}\, \ddot{a} = 16\pi\rho\, a\, \dot{a} + 8\pi a^2 \dot{\rho}$$

where $\dot{\rho}$ is the derivative of ρ with respect to time.

Multiplying the second Friedmann equation by 2aȧ yields

$$6\, \dot{a}\, \ddot{a} = -8\pi\, (\rho + 3p)\, a\, \dot{a}$$

Subtracting that from the previous equation and simplifying:

$$\dot{\rho} = -3\, \frac{\dot{a}}{a}\, (\rho + p)$$

Footnote. That's Wald equation (5.2.18) and Weinberg equation (1.5.20).
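As a sanity check on these equations, the sketch below plugs in a power-law trial solution. The trial family a(t) = tⁿ with p = wρ and n = 2 / (3(1 + w)) is a standard textbook example that is assumed here rather than derived in this article; the function name and the parameter w are likewise illustrative. The density is chosen so that the first Friedmann equation holds by construction, and the code then evaluates the left hand side minus the right hand side of the other two equations.

```python
import math


def power_law_residuals(w, t):
    """Residuals of the second Friedmann equation and of the equation for
    rho_dot, for the trial solution a(t) = t**n with p = w * rho."""
    n = 2.0 / (3.0 * (1.0 + w))          # exponent of the trial solution
    a = t ** n
    a_dot = n * t ** (n - 1.0)
    a_ddot = n * (n - 1.0) * t ** (n - 2.0)

    # Density fixed by the first Friedmann equation: 3 (a_dot/a)^2 = 8 pi rho.
    rho = 3.0 * (a_dot / a) ** 2 / (8.0 * math.pi)
    p = w * rho
    # Since (a_dot/a)^2 = n^2 / t^2, rho is proportional to 1/t^2.
    rho_dot = -2.0 * rho / t

    second = 3.0 * a_ddot / a + 4.0 * math.pi * (rho + 3.0 * p)
    continuity = rho_dot + 3.0 * (a_dot / a) * (rho + p)
    return second, continuity


# w = 0 models pressureless matter; w = 1/3 models radiation.
for w in (0.0, 1.0 / 3.0):
    for t in (0.5, 1.0, 2.0):
        print(w, t, power_law_residuals(w, t))
```

Both residuals should come out at zero up to floating-point rounding, which is only possible with the minus sign in the equation for ρ̇.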


Proof of Theorem 1. The first Friedmann equation is

$$3 \left(\frac{\dot{a}}{a}\right)^2 = 8\pi\rho$$

The right hand side of the equation never drops completely to zero, and the left hand side is the square of a real number, so the left hand side is always positive. The numerator of the fraction can therefore never become zero. If that numerator starts out positive, it must remain positive. According to the hypotheses of Theorem 1, the rate of expansion (the derivative of the scale factor a, and the square root of the numerator) does start out positive. Since it never becomes zero, it must remain positive.

It should be said, however, that the rate of expansion may converge toward zero. The real importance of Theorem 1 is that a flat FLRW universe that follows a big bang will never collapse into a big crunch.

It should also be said that the real universe contains inhomogeneities, which can be expected to increase over time. If inhomogeneity causes some region of space to exceed the critical density at which space is flat, that region might collapse toward a crunch while other regions of space continue to expand.

Proof of Theorem 2. The second Friedmann equation is

$$3\, \frac{\ddot{a}}{a} = -4\pi\, (\rho + 3p)$$

Provided ρ + 3p is positive, as it is for ordinary matter and radiation, both sides of this equation are negative, so the second derivative of the scale factor is negative. That means the first derivative of the scale factor is decreasing, which means the rate of expansion is slowing. Increasing the pressure p makes the right hand side more negative, so it must decrease the rate of expansion even more.
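As a small worked illustration, compare two idealized fluids at the same density ρ. The equations of state used here, p = 0 for pressureless matter and p = ρ/3 for radiation, are standard textbook idealizations and are not part of the derivation above.

$$p = 0: \quad 3\, \frac{\ddot{a}}{a} = -4\pi\rho \qquad\qquad p = \frac{\rho}{3}: \quad 3\, \frac{\ddot{a}}{a} = -8\pi\rho$$

At equal density, the fluid with pressure decelerates the expansion twice as strongly.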

The meaning of spatial expansion in a spatially infinite universe. In FLRW solutions, the scale factor a(t) provides an objective measure of expansion and contraction. When a(t)=1, the universe is the same "size" it is now (because we arbitrarily set a(t)=1 to match today's conditions). If a(t)=⅓ at some point of spacetime, then the universe at that time is said to have been only a third its size today, even though the universe was spatially infinite both today and then. This notion of size may sound arbitrary but isn't: it's tied to the pseudometric tensor, which is tied to the geometry of spacetime.
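To make this concrete, consider two points of spacetime with the same time coordinate t whose spatial coordinates differ only by Δx along the x axis (Δx is a symbol introduced here for illustration). Setting dt = dy = dz = 0 in the FLRW metric form gives

$$\Delta s = \sqrt{a^2(t)\, \Delta x^2} = a(t)\, \Delta x$$

so when a(t) = ⅓, the distance between any such pair of points is one third of what it is today, whether or not space is infinite.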

How can the universe become infinite if it started out as a single point? Although journalists and popularizers may try to explain the big bang by saying the entire universe started out as a single point, that's not what mathematicians and physicists mean by a big bang singularity. The word "singularity" means our current understanding of the mathematics and physics is not sufficient to tell us what goes on there, if indeed there is a there there.

In particular, the FLRW models with flat space do not start with a pointlike singularity and suddenly blossom into spatial infinity. The FLRW models actually extrapolate backwards and forwards from current conditions. The big bang singularity sets a limit to how far backward we can extrapolate using FLRW models of the universe. We may be able to extrapolate arbitrarily close to the big bang singularity, but we can never reach it. That is why the FLRW models do not have to explain how a point-like universe suddenly becomes infinite. The big bang singularity is a mystery, not a point.


This is a slightly revised version of a calculation I posted at the JREF Forum on 30 April 2014.

Last updated: 6 May 2014.
