All Downhill From Here!

Although I've posted previously specifically on the second law of thermodynamics, there were some features of it that I didn't discuss. This is largely due to the fact that the original iteration of the essay on this topic was written quite some years ago for a different venue. 

In that previous post, I discussed at length what the second law of thermodynamics is, what it says, and how it applies to different types of thermodynamic system. It was written to directly address the false conflation of entropy with disorder, as this is the basis for a lot of the claims against evolutionary theory.

Here, I want to explore some of the more subtle aspects of entropy as it pertains to evolution and, indeed, life. I'm going to begin with a statement that many, even some with a reasonable understanding of entropy and/or evolution, will find quite counter-intuitive. However, if I marshal my facts and arguments correctly, I'm confident that it will become clear.

So, what is this counter-intuitive statement? It's simply this:

Life and evolution, rather than being a violation of the second law of thermodynamics, are actually manifestations of it!

In a nutshell, entropy is the tendency for thermodynamic systems to move toward their lowest energy state. They do this via equalisation of differentials which, in physics, is called doing work.

Let's start from first principles, as there will be some good examples as we go of how entropy works.

As we've noted in earlier posts, one of our best current sources of information about the early universe is the cosmic microwave background radiation (CMBR), discovered serendipitously by Penzias and Wilson in 1964. What the CMBR actually represents is not, as I've seen suggested, the glow of the big bang, but the photons that come to us from the 'last scattering surface'. This is going to serve as a useful pointer to what we're discussing here, so it's worth spending a little time on.

For about the first 380,000 years or so after the Planck time - a theoretical construct dealing with the smallest useful amount of time after the 'beginning' of expansion - the cosmos was opaque to photons. The reason for this is that, after the onset of expansion, the cosmos was extremely hot and dense. So hot, in fact, that it was basically a white-hot plasma of photons and ionised hydrogen, which consists of protons and free electrons. Because of the abundance of free electrons at such temperatures, the distance that photons could travel freely was massively restricted due to Compton scattering. 

Now, anybody who likes to smell nice knows what happens when a body of particles expands, because we've all felt the deodorant can cool down as we let out the smellies. The same thing happens with the cosmos. As expansion continues, the cosmos cools. Prior to a certain low temperature, electrons remain free, not bound to protons, because of a quirk of entropy, namely that, in the environment they find themselves in, there's nowhere for energy to go to become unavailable. As the temperature gets below about 4,000 Kelvin, however, something interesting happens; there's somewhere for energy to go, so the free electrons begin to bind to protons, forming the first neutral atoms. The reason they do this is because it's energetically favourable for them to do so. This is just another way of saying that, once they can shed some energy, their lowest available energy state consists of being bound to protons. 

As an aside, this is where the CMBR comes to us from. By the time the temperature gets as low as 3,000K, most of the free electrons have become bound, and the photons from the last bit of Compton scattering are now free to travel through the cosmos, and it's these photons that we detect as the CMBR. Due to the expansion of the cosmos, these photons are hugely red-shifted, meaning that their wavelengths are stretched out, until their temperature is about 2.7K, or -270 degrees Celsius, a smidge less than three degrees above absolute zero.
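
If you want to check that figure, here's a quick back-of-the-envelope sketch in Python (my own illustration, using the rounded numbers above): the temperature of the photons falls in proportion to 1/(1 + z), so the red-shift of the last scattering surface drops straight out.

# A rough check of the red-shift figures quoted above, using the rounded
# numbers from the text rather than precision cosmology.

T_last_scattering = 3000.0  # K, temperature when the CMBR photons were released
T_today = 2.7               # K, temperature of the CMBR now

# Photon temperature falls as 1/(1 + z), so:
z = T_last_scattering / T_today - 1
print(f"Red-shift of the last scattering surface: z ~ {z:.0f}")
print(f"Wavelengths stretched by a factor of ~{1 + z:.0f}")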

So we have a cosmos filled with neutral hydrogen, along with some helium, a bit of lithium and trace amounts of beryllium. An atom has less mass than its constituent particles, and remember that m = E/c², so its mass and its energy are the same thing. What's been shed in the binding of the electrons to the protons is known, somewhat misleadingly, as 'binding energy', because from here on in you have to input energy to break them apart. 

The main point here is that, because of entropy, the tendency for things to find their way to their lowest energy state, we can end up with a situation that seems slightly counter to our cursory understanding of entropy.

How about a more complicated scenario? One of the things we covered in the earlier post was black holes, which are the highest entropy entities in the cosmos (despite also being the most highly ordered). This is because, in the context of gravity, clumping of matter constitutes an increase in entropy. Thus, after the initial period of expansion, and possibly due to quantum fluctuations in the plasma during expansion, some areas of the CMBR are slightly cooler than others. These anisotropies (lack of sameness) represent areas of slightly lower energy, which also means lower mass in a given area. As discussed before, these differences represent only a few ten-thousandths of a degree, but it's enough for clumping to begin under gravity.

So, our hydrogen et al begin to clump and, where there is enough in one place, they clump tightly enough under gravity that the temperature begins to increase again. Once again, this looks like a violation of entropy, but it's once again a manifestation of it. Once there's sufficient mass in a small enough volume, fusion begins. This is the next stage of our explanation.

It's well-understood that fusion is a pretty complex process, because it's incredibly difficult for the nuclei of two atoms to get sufficiently close together to bond under the strong nuclear force. This is due to an energy barrier arising from electrostatic interaction between nuclei. Occasionally, this barrier, known as the Coulomb barrier, after French physicist Charles-Augustin de Coulomb, will be overcome as a matter of statistics, but not nearly sufficiently often for stellar fusion to be a reliable process. For that, we need our old friend quantum tunnelling, discussed in another post. Quantum tunnelling allows hydrogen nuclei to overcome the Coulomb barrier, which in turn allows hydrogen to fuse into helium much more readily. 

This particular process is extremely interesting for our purposes here because, once again, the mass of the helium atom is less than that of the hydrogen that fused to form it, even though they have exactly the same constituent particles. We can see this quite easily. We have 2 protons at 1.673 × 10⁻²⁷ kg each, 2 neutrons at 1.675 × 10⁻²⁷ kg each and 2 electrons at 9.109 × 10⁻³¹ kg each. The helium atom has a mass of 6.646 × 10⁻²⁷ kg. We don't even need to get the chalk out to see that just the four nucleons, at a total mass of 6.696 × 10⁻²⁷ kg, add up to more than the entire helium atom, without even bringing the electrons into the mix. This mass difference is, of course, shed as energy, in the form of photons which, in the case of Sol, our sun, we see as sunlight.
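
For anyone who does want to get the chalk out, here's the same arithmetic as a short Python sketch (my own, using the rounded figures above): the mass defect multiplied by c² gives the energy shed per helium atom, which comes out at roughly 29 MeV with these numbers, close to the accepted binding energy of helium-4.

# The same arithmetic as a Python sketch, using the rounded figures quoted
# above, so the result is approximate.

m_proton   = 1.673e-27   # kg
m_neutron  = 1.675e-27   # kg
m_electron = 9.109e-31   # kg
m_helium   = 6.646e-27   # kg, the whole helium atom
c = 3.0e8                # m/s, speed of light (rounded)

m_constituents = 2 * m_proton + 2 * m_neutron + 2 * m_electron
mass_defect = m_constituents - m_helium   # mass 'lost' in binding

energy_joules = mass_defect * c ** 2      # E = mc^2
energy_mev = energy_joules / 1.602e-13    # 1 MeV = 1.602e-13 J

print(f"Mass defect: {mass_defect:.3e} kg")
print(f"Energy shed: {energy_joules:.2e} J (~{energy_mev:.0f} MeV)")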

So, here we have a process in which, locally, entropy increases as reactions occur. In this particular case, we're also talking about overcoming a barrier to reach the lower-energy state. This is only a partial, short-term solution (albeit on the scale of billions of years), however, because entropy can still increase further: the energy barrier can be overcome again to find an even lower-energy state.

There's another process that overcomes energy barriers to reach short-term lower-energy states, and I'm reasonably sure that you should be able to see where this is going by now. We can follow this exact chain of reasoning all the way through the periodic table, and indeed on beyond it into chemistry. Quite often, we'll find that there's a slight energy barrier to overcome, such as when hydrogen bonds to oxygen. We've all seen, I'm sure, an example of a hydrogen explosion, whether at the hands of a maniacal physics teacher or via footage of the Hindenburg disaster. Here's a nice little video from Periodic Videos:

 

In this video, we can see a hydrogen-filled balloon being exploded. Hydrogen naturally exists in a molecular state as H₂ at the range of temperatures we think of as normal, and is highly flammable, and really wants to bond with oxygen, two atoms of hydrogen to one atom of oxygen, to make water, H₂O. There is a tiny energy barrier to overcome, and this is facilitated by inputting a little bit of energy, in this case, with an electric match. Once the first hydrogen molecule has bonded, the binding energy released will go on to give a kick to several more, and so on, so there's sufficient energy to trigger an exponential chain reaction until all the hydrogen is bonded to oxygen, leaving water. The additional energy is released as sound, heat and light.

So, we've seen that, starting from first principles, the tendency toward lower energy states appears at every point and in every process: not just in degeneration but also in generation, in processes in which entropy can locally be increased by a small amount in lieu of a greater increase later. But what about life and evolution?


Well, these are simply continuations of the self-same processes. Chemicals react with each other, finding ways to shed a little bit of energy at a time. Over time, these processes result in something that self-replicates, entirely in line with - and a manifestation of - this exact tendency toward shedding energy, helped along by energy input from the sun, whose entropy is increasing all the time as it loses energy in the form of electromagnetic radiation. Once this has begun, especially where the replication is imperfect to any degree, it's a simple matter of generations to arrive at bigger agglomerations that locally find the lowest available energy state. Eventually, these chemicals organise into cells, which are incredible generators of entropy, and these form colonies of ever-greater complexity, ever better at locally tying up small amounts of energy while the whole system equalises.


Ultimately, as the expansion of the cosmos continues and, in fact, accelerates, all those little bits of energy tied up will eventually be broken apart, and all that will be left is a sea of photons, cooling and expanding, asymptotically approaching absolute zero.


All of the above, the entire history of the cosmos, nucleosynthesis, gravitational accretion of stars and solar systems, life, evolution and the eventual heat-death of the cosmos, all driven by a single process, encapsulated in a single simple equation that can be written on the back of a cigarette packet:


Image courtesy Mike Drippe Sr.


 
One final note, just to clear up a common misconception about heat death that I've seen bandied about.

I've seen it said or implied that heat death will be the point at which the cosmos reaches absolute zero. This can't actually happen because, like the singularity, absolute zero is an asymptotic value, meaning that it can be approached but never reached. Heat death can occur at any temperature. As stated right at the top of this post, work is the equalisation of differentials. Heat death occurs when there are no more differentials left to equalise, because there will be no usable energy.

Hope this has been enjoyable and informative. Nits, crits and corrections, as always, extremely welcome.

Share and enjoy.

Are Babies Atheists? Semantics and Communication.

It often occurs in certain discussions that atheism must adhere to this or that definition, and that any definition other than the one being supported by X is a dishonest attempt to evade the cogent points of arguer Y. These discussions only ever crop up in particular circumstances, generally in the sorts of venues where atheists and theists gather to debate.

In this post, I want to talk about what I mean when I say I'm an atheist, and to support my particular usage. It's going to be, quelle surprise, a post about semantics.

Semantics, oft-maligned in discourse, generally as a means of dismissing an argument or a position, is an extremely important discipline, and forms the basis for a huge amount of philosophical discourse. Semantics deals with what we mean when we say a thing. As such it's the very heart of communication.

The vast majority of philosophy (of which science is a subset, albeit the only one that's ever provided substantive knowledge of the universe) is concerned with semantics. Read any discourse on philosophy, and you'll find that a considerable amount of the time is spent defining and justifying terms.

Semantics is among the most important disciplines in philosophy. Dismissing an argument on the basis that 'it's just semantics' is, therefore, and to employ a favourite footballing analogy, the equivalent of diving in the penalty area. It's a cheat. A lazy cop-out indicating a lack of wherewithal to deal with an opponent's argument or the intellectual honesty to simply admit it. 


So, what does it mean to define something? Generally speaking, defining an entity means collating all the attributes that describe it, and working out which of them are both necessary and sufficient. These terms are fairly straightforward and should require no further explanation.

In the case of atheism, it is sufficient for an entity not to accept (regardless of whether or not they're actually capable of accepting) a specific class of truth-claim with regard to the existence of deities. Any entity that does not accept said truth-claims can reasonably be described as an atheist. It is also necessary, because if one does accept these truth-claims, one is a theist. This definition describes ALL positions that fall under the rubric of atheism, and excludes ALL that do not. Thus it defines what it is to be an atheist.

One of the common objections to this as a term is that, under such a definition, bricks and babies would be atheist, and it seems absurd to employ such terms to bricks and babies.

The simple fact is that bricks, electrons, sperm, eggs, are all atheist. It only appears to be silly to describe them as such because those are not useful concepts to apply to them. Everything in the universe that does not have an active belief in a deity is atheist, but most of the things in the universe simply couldn't possess a concept such as a deity, so to actually describe them as atheist is meaningless. This is because, absent a conception of a deity, the term itself is meaningless. One doesn't actually need to have encountered or formed an opinion on the concept of a deity to be atheist, because the term itself is a privative, as denoted by the privative prefix 'a', meaning that it's a word describing the absence of something. We say that it only applies to thinking entities because the opposing referent only applies to thinking entities. Indeed, it's not stretching the point to state that, absent theists, applying the term 'atheist' to humans would be equally absurd.

Atheism is simply the non-acceptance of a specific class of truth-claim with regard to the existence of deities. One doesn't have to be a conscious, thinking entity to qualify, and all entities that are incapable of forming an opinion, including bricks, are thus defined.

Is a brick bald? Of course it is, because it doesn't have hair. It doesn't have to be capable of having hair to qualify as bald. For it to be described as bald requires only that it be devoid of keratinous filaments.


Now, some apologists will say that this can't be correct, because if somebody tells you that they are atheist, it tells you that they've thought about it and arrived at a specific position. This particular fallacy, a form of category error, is a simple confusion of map with terrain. Certainly somebody that tells you they're atheist has given it sufficient thought to have arrived at this, but atheism doesn't require the ability to tell somebody you're atheist, only that you don't believe.

Another common objection is to appeal to a dictionary, committing a twin fallacy I like to call the argumentum ad lexicum. The two fallacies being committed here are the argumentum ad populum and the argumentum ad verecundiam. This is because a) a dictionary only denotes popular usage, and b) it isn't a valid authority on what words mean, not least because dictionaries are descriptive, not prescriptive. Aside from other considerations, if dictionaries were prescriptive, every word since Dr Johnson would be fixed, and future evolution of language would not be possible. This is clearly absurd.

Moreover, while many dictionaries do indeed give the definition of atheism as 'the belief that no gods exist' or some equivalent, they also generally contain the definition 'absence of belief or disbelief in the existence of deities' or some such, meaning that even applying the argumentum ad lexicum doesn't make the case for the more rigid definition.

On the other hand, the one attribute shared by the entire set of all entities that can be described as atheist is non-acceptance of a specific class of truth-claim with regard to the existence of a specific set of entities. This attribute is both necessary to be described as atheist (if you accept this class of claims, you're a theist) and it is sufficient (no further attributes are necessary).

Of course, there are various positions regarding what type of atheist one is, including the position that no gods exist. Such a position is still described as atheist, but now requires a qualifier to fully define the position.

Another important term, and one of the aforementioned qualifiers, is 'agnostic', often treated as some sort of middle-ground between theism and atheism, and defined as 'not knowing'. In the context of belief in a deity, such a definition would be redundant in the extreme, because nobody knows. Agnosticism, sensu Huxley, who coined the term, is the position that knowledge is not possible. Agnosticism is, therefore, a position concerning the possibility of knowledge. 

For myself, I vacillate between gnosticism and agnosticism, depending entirely on the conception of deity being presented. For any interventionist deity, for example, I'd expect evidence of their intervention to be discoverable. Further, for specific conceptions of deity, such as the absurd entity described in the hokey blurble, I consider myself a strong, gnostic atheist. That entity does not - and cannot - exist.

There are those on the atheist side of the fence, most notably biologist PZ Myers, who suggest that atheism should mean more and, indeed, that atheists who argue on the basis of such a definition are doing a disservice to atheism. I can sympathise with this assessment, not least because of Myers' particular situation, as a professor of evolutionary biology who often has to defend his discipline against attack from supporters of well-funded organisations with significant political clout whose sole purpose is to undermine evolutionary theory in the fallacious belief that this will somehow confer validity on their myths. I entirely understand his position and, being something of a didact and somewhat concerned with the scientific education of the world at large, I share many of his concerns. However, I don't share his view on the definition of the term itself. Here's a portion of Myers' post, a follow-up to a talk given in Montreal:

If I ask you to explain to me why you are an atheist, reciting the dictionary at me, you are saying nothing: asking why you are a person who does not believe in god is not answered when you reply, “Because I am a person who does not believe in god.” And if you protest when I say that there is more to the practice of atheism than that, insisting that there isn’t just makes you dogmatic and blind.

This comes back to my earlier point about the distinction between an atheist and somebody who's given the matter sufficient thought to be able to self-identify as an atheist.

He continues:

In that Montreal talk, I explained that there is more to my atheism than simple denial of one claim; it’s actually based on a scientific attitude that values evidence and reason, that rejects claims resting solely on authority, and that encourages deeper exploration of the world. My atheism is not solely a negative claim about gods, but is based on a whole set of positive values that I will emphasize when talking about atheism. That denial of god thing? It’s a consequence, not a cause.

And this is all well and good, and tells you about Myers' own journey to atheism, but this is not reflective of a large group of people whose journey was different. 

He goes on:
My point is that nobody becomes an atheist because of an absence of values, and no one becomes an atheist because the dictionary tells them they are. I think we also do a disservice to the movement when we pretend it’s solely a mob of individuals who lack a belief, rather than an organization with positive goals and values.

Here we see the real issue I have with this entire approach. Thing is, I never did become an atheist. I always was one. I never believed. For the record, I tried really hard to believe when I was young. I come from a very large Catholic family, and it seemed really important to all the people I cared about that I'd have an eternal life which, in their estimation, could only come about by accepting Christ as my saviour. I could never suspend disbelief sufficiently to be able to accept the central claims of Christianity. They were simply never believable to me. My scientific attitude and my valuation of evidence and reason, which I share with Myers, came later. Was I not an atheist until I held such attitudes? 

If I was born in London, and Myers got on a plane travelling to London, are we not both still in London?

I know that, while I'm certainly in the minority among atheists, my experience is certainly not unique, not least because I know many atheists who've been atheists since birth, and have never believed. It's also worth noting once again the backdrop of Myers' atheism, because that informs his position.

As for the 'movement', it's difficult for me to tell whether I'm part of it. I certainly consider myself an activist in support of a scientific world-view, and have spent considerable portions of the last decade and more advocating against religious nonsense, and have garnered something of a reputation for being merciless to bad ideas generally, sharing Myers' idea that we should be organised with positive goals and values. However, these are separate from my atheism, which is neither a consequence nor a cause; it's a simple privative. These values are my love of science and reason, which inform my activism as an atheist, but have no impact whatsoever on my atheism itself.

There is another group of people who self-identify as atheists and who insist that atheism should be more, largely, I think, because of a need to feel superior. I have no such need (and I don't classify Myers in this group, to be absolutely clear).

In short, atheist isn't something I am, it's something I'm not.

Note: I'll shortly be compiling a glossary detailing common sources of equivocation and how I use the terms. Like the other terms I've defined anywhere on this blog, they're not intended to serve as prescriptive definitions that everybody must use, but only to avoid ambiguity in what I say. I'll get this project completed once I have the five or six posts currently in development out of the way.

Scale Invariance and the Cosmological Constant


I've been asked by a Twitter friend to do a layman's précis of a couple of papers published in the last couple of days by André Maeder of Geneva University. As they deal with some things we've covered that need some fleshing out, I thought I might as well address them in a new post.

First, as always, a bit of background. We've discussed in a previous post how Einstein's general theory of relativity, published in 1915, strongly implied that the cosmos couldn't be static, a consequence discovered by Georges Lemaître and, independently, by Alexander Friedmann. It had either to be expanding or contracting, because there was nothing to hold the cosmos open against the gravitational attraction of all the matter in it. If it were static, the mass should cause it to contract. Lemaître concluded that the cosmos must be expanding and formulated the first of a general class of theories now known as 'big bang'. Lemaître's own idea was something he called the 'cosmic egg hypothesis'.

The majority of physicists of the time - Einstein included - were strongly of the opinion that the cosmos was static and eternal, so Einstein modified his equations to make it static by adding a term, the Lambda term (Λ), which he called the Cosmological Constant. Later, Edwin Hubble nailed the case shut (we think) when he observed that the universe was larger than our galaxy alone, and that it was expanding. Einstein removed the term and deemed the cosmological constant the worst blunder of his career. It was a blunder, though, despite Einstein's fans waxing lyrical. There was no good reason to insert the term other than to fudge the result based on a prejudice. The term relates the rate of expansion or contraction of the cosmos to its energy density. Properly, it's a variable, but one that had been assigned a fixed value by Einstein.

Then, in the 90s, astronomers were thinking about how, with all the mass in the universe, expansion should be slowing down, so they set out to measure it. They used a very clever trick to do so. 

There's a well-understood phenomenon in the evolution of stars. Stars below a certain mass will always go through the same final stages in their evolution. Our sun is one such star. Initially it will swell to a red giant, and then it will shed some of its mass and shrink down to a white dwarf. The Chandrasekhar Limit is the maximum mass that white dwarf remnant can have and still be supported against collapse. Beyond this limit, the collapsing core will either become a neutron star or, if it has sufficient mass, a black hole. The Chandrasekhar limit is about 1.4 solar masses.

Now, the thing about a white dwarf is, yer basic feature of a white dwarf is that it has a maximum mass, something that causes a marvellous effect where you find a white dwarf in, for example, a binary system. A white dwarf can still accrete mass if there's sufficient mass in its vicinity. Often, in binary systems especially, a white dwarf can actually gain mass sucked off its partner's surface. This means, of course, that it can breach the Chandrasekhar Limit and explode as a supernova. Because this breach always occurs at the same mass, these particular supernovae, known as type Ia supernovae, always shine with the same intrinsic brightness. By measuring a supernova's apparent brightness, and doing some clever sums, we can work out how far away it is or, more accurately, how far away it was when the light left it and, by corollary, how long ago it left. From its spectrum, we can also calculate its red-shift, which tells us how quickly it's moving away from us (actually, it's not moving away at all, it's sitting still while the space between us expands).
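
As a rough illustration of those 'clever sums' (a sketch of my own, not the survey teams' actual pipeline, and ignoring the corrections real cosmologists make for things like dust and the red-shifting of the spectrum), here's the basic standard-candle calculation in Python. The peak absolute magnitude used is a commonly quoted approximate value, assumed here purely for illustration.

# A toy version of the standard-candle sum. M_PEAK is a commonly quoted rough
# value for the peak absolute magnitude of a type Ia supernova; the function
# just inverts the distance modulus m - M = 5*log10(d / 10 parsecs).

M_PEAK = -19.3  # approximate peak absolute magnitude of a type Ia supernova

def distance_parsecs(apparent_magnitude: float) -> float:
    """Luminosity distance implied by an observed (apparent) peak magnitude."""
    return 10 ** ((apparent_magnitude - M_PEAK + 5) / 5)

# For example, a type Ia supernova that peaks at apparent magnitude +24:
d_pc = distance_parsecs(24.0)
print(f"Distance: {d_pc:.2e} parsecs (~{d_pc / 1e6:.0f} megaparsecs)")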

When we put all this together, we can actually work out how fast the cosmos was expanding at different points in time.

So our two teams of astronomers, Saul Perlmutter and Adam Riess in the US, and Brian Schmidt in Aus, set out to measure the rate of expansion, hoping to come away with a figure on how much expansion was slowing.

To their great surprise, observations showed that it wasn't slowing at all, but it was actually accelerating in its expansion! Indeed, it had started accelerating about four billion years ago, almost ten billion years after the Planck time.

What this means in a nutshell is either that a) one side of the equation for general relativity is wrong and we don't understand gravity as well as we thought we did, or the other side is wrong and there's some energy/mass in the universe we can't account for. Broadly, this effect is called dark energy (although dark energy is also the name of one of the models attempting to explain it). 

In any event, we now need a term that deals with the rate of expansion, because it's now known to be variable. Where can we find something like that?

Ah, the cosmological constant is just such a term! Inserting the Lambda term again allows us to model the evolution of the cosmos over time, adjusting for the rate of expansion in a dynamic way, and we call it the cosmological constant as a matter of historical contingency.

Now, because we can now see that the expansion is accelerating, it means that there is some 'force' acting to overcome the attraction of all the mass in the cosmos. It may well be that this effect is simply gravity, which is known to be repulsive under some solutions of general relativity.

What about dark matter? 

Dark matter is again a placeholder. It's the name we give to an effect that doesn't match our expectations regarding the orbits of stars in the outer edges of galaxies, especially small galaxies that rotate quickly, because given the amount of matter we can actually detect, these stars are travelling too quickly to be gravitationally bound to their galaxies. In other words, they're moving so fast in their orbits that they should be escaping. That means that either there is something wrong with our picture of gravity, or there's something there that we can't see that's exerting sufficient gravitational influence to keep those stars in their orbits. Whatever the solution is to these anomalous observations, the effect, which has been observed, is called dark matter. The term itself is another matter of historical contingency, arising from confidence that our model of gravity is largely correct, stemming from the fact that it's withstood huge amounts of testing. 

The proposed solutions are many and varied, including WIMPs (Weakly Interacting Massive Particles), MACHOs (Massive Compact Halo Objects - because the effect is most observed in the haloes, or outer edges, of galaxies), an incorrect understanding of gravity, and ordinary matter residing on an adjacent brane, and ALL of them are called dark matter, because dark matter is simply what we call the observed effect.

Another important idea we need to touch on is 'scale invariance'. This does pretty much what it says on the tin, and deals with what we see at different scales in terms of difference. Dirac put it nicely, as cited in the first of the Maeder papers: 

It appears as one of the fundamental principles of nature that the equations expressing basic laws should be invariant under the widest possible group of transformations.
And the author goes on to point out that Maxwell's equations, for instance, are scale-invariant. There are many applications of scale-invariant functions in pure and applied mathematics; for example, scale invariance is an important feature of such things as probability distributions and fractal geometry.
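
To make 'scale-invariant' a bit more concrete, here's a toy numerical check in Python (my own illustration): rescaling the input of a power law just multiplies the output by a constant, so the function has no preferred scale, whereas an exponential, which carries a built-in scale, fails the same test.

import math

# A toy check of what 'scale invariance' means. Rescaling the input of a power
# law only multiplies the output by a constant, so the function has no
# preferred scale. An exponential has a built-in scale (x0) and fails the test.

def power_law(x, k=2.0):
    return x ** -k

def exponential(x, x0=1.0):
    return math.exp(-x / x0)

a = 10.0  # an arbitrary rescaling factor
for x in (1.0, 2.0, 5.0):
    ratio_power = power_law(a * x) / power_law(x)    # always a**-k = 0.01
    ratio_exp = exponential(a * x) / exponential(x)  # depends on x
    print(f"x = {x}: power-law ratio {ratio_power:.4f}, exponential ratio {ratio_exp:.3e}")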

It didn't have to be this way, of course. The author notes that Galileo had inferred that the laws of physics generally vary with scale, from observations that "the strength of materials were not in exactly the right proportion to their size".

One thing we know is not scale-invariant is general relativity, because there's a natural scale, in both distance and time, whenever a system has non-zero mass.

Anyhoo, the two papers are these, both posted to the arXiv on 23rd May.

Scale invariant cosmology I: the vacuum and the cosmological constant.
Scale invariant cosmology II: model equations and properties

In the first paper, Maeder is proposing that a gauge transformation relates the Λ term in GR and the λ term in scale-invariant mathematics via the equations given in the paper, meaning that scale-invariance for empty space can enter a model with a cosmological constant.

The second paper continues and shows the derivations of the equations that relate the two frameworks.

I haven't unpacked the equations, but they're certainly interesting ideas, and could lead to a simplified approach to finding more classes of solutions to Einstein's field equations, which are notoriously difficult to find, which is why we still use Newton's theories for most everyday applications.

As always, I look forward to any nits, crits, corrections and suggestions.


Further reading:

Before the Big Bang Part II

Has Evolution Been Proven?

What does it mean to 'prove' something? What is proof? What would it mean to 'prove' evolution?

Often, one comes across the claim that evolution has never been proven. Let's first be clear about what this means. 

Proof is a formal procedure applicable to axiomatically grounded systems of deductive logic. Here's an example:

Axiom: The addition of two integers gives the sum.
Proof: 1+1=2

Seems pretty straightforward, no? Would that it were that simple.

The most ready application of this process is in mathematics, of course, which is built from the simplest axiomatic foundations in which all the core axioms are definitionally true, because of the way the requisite entities are defined. 
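
To make the idea of formal proof concrete, here's roughly what the toy example above looks like in a proof assistant (Lean, my own choice of tool for illustration): starting from the axiomatic construction of the natural numbers and the definition of addition, the claim is checked mechanically.

-- Checked by the Lean proof assistant: both sides reduce to the same term
-- under the definitions of the natural numbers and of addition, so the
-- proof is simply 'rfl' (reflexivity).
example : 1 + 1 = 2 := rfl

-- The same fact, spelled out in terms of the successor function:
example : Nat.succ 0 + Nat.succ 0 = Nat.succ (Nat.succ 0) := rfl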

Is evolution a deductive system of logic? Is science?

Science, being primarily an inductive discipline, doesn't generally use proof except in very strict, very specific circumstances, one of which is at the heart of scientific reasoning. I've previously posted explicitly on this topic, so I won't explore it further here.

What has been done in the case of evolution is something that, while not commensurate with this procedure, is something every bit as solid and unassailable: It has been observed occurring!

Yes, you read that correctly. We've actually watched it happening. Not merely adaptation - which is still evolution, incidentally - but full-on evolution, and macroevolution at that.

This post will deal with some specific instances documented in the literature, including one beautiful and, to those unaware of it, quite surprising instance, but I won't spoil it yet.

Some unpacking of terms is warranted here, because they get thrown around an awful lot, mostly by those who don't understand how the terms are used in the primary literature, but to such a degree that they even confuse the scientifically literate to the point that the stock responses don't really deal with the underlying issues.

Let's start with a common term that comes up in discussion with apologists for creationism, and that doesn't appear in the literature at all: Kind.

This is a quite beautifully distracting elision, erected specifically to bring the whole argument back to the bible. It's rooted in Genesis 1:25.

And God made the beast of the earth after his kind, and cattle after their kind, and every thing that creepeth upon the earth after his kind: and God saw that it was good.


Of course, the one thing they can never do is to define just what is meant by a 'kind'. It should be a simple matter, amounting to no more than pointing at where it comes in the phylogenetic hierarchy employed in the relevant fields, an example of which is on the left. 

Some things worth noting about this tree as I've drawn it. First, I've only included one branching per generation of the hierarchy, while in fact there may be many. 

Also, and quite importantly, in the 'domain' section of the hierarchy, it looks a lot like Eukaryota and Archaea are the parents of the illustrated kingdoms. This is not correct (as near as we can currently tell). In fact, the kingdoms shown descend only from the Eukaryota, which branched earlier from the Archaea. 

It's again apposite to note that each of those boxes constitutes a population of organisms, whatever the level because, and I can't stress this enough, evolution is a population phenomenon.


Now, it's worth pointing out that this particular system is being used less and less, and is being replaced with the more robust cladistic system. A clade is any species and all its descendant species. The reason for this should be reasonably clear, and it's all to do with how those branchings (known in the jargon as 'divergence') actually work. For example, each of those levels in the hierarchy was once a species, which means that the entire phylogenetic system is a bit of a moving feast. Further, if you look at the top of the tree, where Homo sapiens resides, there's an empty space next to it, illustrating a possible future divergence. If and when that happens, there might be a bit of a problem. When dealing with other species we discover around the animal kingdom, we don't really have issues classifying them but, if our species were to diverge, which branch of the divergence would be entitled to the appellation Homo sapiens?

Anyhoo, the main point here is that 'kind' has no place in this system. What about cladistics? Does this offer any solace for 'kinds'? 

Not remotely. Cladistics actually makes the problem worse for 'kinds', because at its heart is a macroevolutionary process, and that's entirely the problem for creationists, because they deny that this even occurs.

One of the problems I've encountered an awful lot among the scientifically literate, in dealing with some of the claims propounded by deniers, is misunderstanding of just what macroevolution is. I've come across responses such as 'macroevolution is just lots of microevolution', and 'macroevolution isn't even a valid term, there's just evolution'. Neither of these is correct. Before we can reasonably deal with why that is, there's another problematic term that we need to deal with: Species.

What exactly is a species? This is a question that's troubled evolutionary biologists for a long time, and only in recent years has some resolution been found. It isn't that defining a species is problematic in and of itself, it's rather that most ways of defining a species raise problems if applied too rigidly. The most commonly-accepted definition currently in use is known as the 'biological species concept' (BSC). This defines a species as a population of organisms throughout which gene flow occurs at a given moment in time.

Now, even this, if applied too rigidly, can be problematic. For example, viral vectors constitute gene flow, and there's no good reason to infer that, therefore, the Ebola virus is human. That would be absurd. Properly applied, though, the BSC is extremely robust, and tends to be problem-free. 

Gene flow is fairly straightforward, though it does have some implications, which we'll come to shortly. You'll note, however, the last part of that definition 'at a given moment in time'. This is one of the most important features of the BSC. Indeed, any conception of species that doesn't contain a temporal component of this nature is rapidly going to run into trouble. This is because, absent such a feature, any conception is going to come hard against what I like to call 'the discreteness problem'. Richard Dawkins termed it 'the tyranny of the discontinuous mind'. It deals with our need to classify things and put them in little boxes, overlooking the fact that nature rarely works in such a digital manner. Having this component allows us to classify away to our heart's content without running into this issue. What this allows us to do is to properly treat evolution as what it is, namely a population phenomenon. Evolution only applies to individual members of species in terms of their contributions to the gene pool. Any species must be defined at a given moment.

To clarify that, think about a group of humans. For simplicity, we'll work with the premise that the population size remains constant, so that each pairing gives rise to a pair, and the death rate keeps pace with the birth rate.

Now, each pair of humans will give birth to progeny that are the same species as their parents. This must be the case. There are differences in their genetic make-up, not least because each of the offspring is a different blend of the genes from both their parents, but they are still the same species, because gene flow is occurring between them. Now, here's one of those places where the BSC can lead to absurdity if applied too rigidly because, once past reproductive age, there is no gene flow, so anybody over a certain age will not be classified as the same species. That's an obvious problem, though, and can reasonably be discounted.

Anyway, our population is cycling, giving birth and dying, and we can let this go on for as many generations as necessary. Now let's imagine that, after lots of generations, one of our population invents a time machine. It goes right back to where we started, say 30,000 generations before. Is it the same species as the original population it met? We've been calling them humans the entire time the experiment has been running, and indeed we could even use the proper binomial for them, Homo sapiens (binomial means 'two names', and is the technical term for any species name, which should always be italicised, with the genus capitalised and the specific name lowercase). So, they're both H. sapiens, but are they the same species? In all likelihood, they aren't. 30,000 generations is certainly sufficient time for enough genetic differences to have built up to make them biologically incompatible, so they're different species, but they're described by exactly the same binomial. That's why we need the temporal component in our species concept.

So, now we've laid the groundwork, what about those two terms, micro- and macroevolution?

The macro/micro distinction is a valid distinction in evolutionary biology, but it doesn't mean what the creationists think it means.

Evolution is defined as variation in the frequencies of alleles, where an allele is a specific iteration of a given gene. An easy way to understand what an allele is is to think about insulin.


Insulin is a critical molecule for almost every vertebrate. Deficiency in insulin production is very common, the disorder being known as diabetes mellitus.

The gene coding for insulin has been extensively studied for a fairly wide range of organisms. We can look at the sequences of two closely-related organisms to show what an allele is. Here's the gene coding for insulin in humans, which can be found on chromosome 11:

atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag

Which codes for the following insulin precursor:

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN


Here's the same gene coding for insulin in lowland gorillas; it differs from the human sequence at four codons (gca→gcg and tca→tcc in the second line, att→atc and caa→cag in the fifth):

atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc atc gtg gaa cag tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag


And here's the insulin precursor:

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN


As you can readily see, even though the gene has differences, the precursor is identical. There's an important point there about how genes work, which I'll come back to shortly. Meanwhile, if you want to see more on this, please see the excellent post on the fallacy of one true sequence by Calilasseia, which deals with a great deal more than just these differences.

The main point here is that these different genes are known as alleles, because they are different versions of the very same gene. That said, there are many, many instances in which different organisms carry exactly the same allele. For example, if you compare the gene coding for histone in humans and chimpanzees, you'll find that they're identical.
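
If you fancy checking the comparison yourself, here's a small Python sketch (mine; only the second line of each sequence is included, to keep it short) that walks the two alleles codon by codon. Every difference between the two full sequences sits in the third base of its codon, which is exactly why the precursor protein comes out identical.

# Only the second line of each sequence is included here, to keep it short;
# the same comparison works for the full sequences quoted above.

human   = "cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac"
gorilla = "cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac"

for i, (h, g) in enumerate(zip(human.split(), gorilla.split()), start=1):
    if h != g:
        base = next(j for j in range(3) if h[j] != g[j]) + 1
        print(f"codon {i}: {h} -> {g} (differs at base {base} of the codon)")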

Microevolution is defined as variations in the frequencies of alleles below species level, in a population of organisms. Macroevolution is defined as variations in the frequencies of alleles at or above species level, or in populations of species. 

In short, macro isn't merely lots of micro, because macroevolution goes on with every variation of alleles that are shared. When a chimpanzee gives birth and its offspring is carrying an allele that is shared with humans, that's macroevolution.

There are other examples of macroevolutionary processes at work that are not necessarily well-understood, such as extinction, in which the frequency of all alleles in a species goes from 'some' to 'none'. Because this variation is happening at species level, it also constitutes macroevolution. Another is fixation, a process in which, through genetic drift, a given allele gets distributed in such a way that every extant member of a species is carrying it. This is again variation in frequencies at species level.
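
Fixation through drift is easy to get a feel for with a toy simulation. Here's a minimal Wright-Fisher-style sketch in Python (my own simplification: haploid organisms, a neutral allele and a constant population size) in which an allele wanders at random until it's either lost or carried by every member of the population.

import random

# A minimal Wright-Fisher-style model of genetic drift: haploid, constant
# population size, and a neutral allele (no selection). The allele's frequency
# wanders at random until it is either lost or fixed (carried by everyone).

def drift_to_fixation(pop_size=100, start_freq=0.5, seed=None):
    rng = random.Random(seed)
    count = int(pop_size * start_freq)  # copies of the allele in generation 0
    generations = 0
    while 0 < count < pop_size:
        p = count / pop_size
        # each member of the next generation inherits the allele with
        # probability equal to its current frequency in the population
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
        generations += 1
    return ("fixed" if count == pop_size else "lost"), generations

for trial in range(5):
    outcome, gens = drift_to_fixation(seed=trial)
    print(f"trial {trial}: allele {outcome} after {gens} generations")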


So, we talked above about how a species is defined, and we talked about divergence of populations. This is known as 'speciation'. In a nutshell, this is when a single population gets fractured reproductively, generally due to some geographic barrier, so that they are accumulating different sets of alleles. While they're a single population, the constant mixing of these allele sets keeps them reproductively compatible. Once separated, though, they move closer and closer to having sufficiently different sets of alleles that their DNA is no longer inter-compatible, and they diverge as separate species. This will generally take lots of generations because of the rate at which new mutations accumulate, which, for humans, is about 350 per new birth, a tiny number in terms of the genome. 

Looked at from just the right perspective, all of the above can be said to be underpinned by one thing: Extinction. Using an example of divergence as described above, this is how it works. Look at the cladogram on the right.


If you think of the boxed area as 'a moment in time', we can think of all the organisms in the box as being extant simultaneously. The parent (below the divergence) is still capable of reproducing with the offspring on both sides of the divide (although the resulting offspring may have a suspicious talent for playing the banjo) thus, even though the offspring themselves may not be capable of producing fertile offspring, gene flow can still occur. Of course, this is a vast oversimplification for the purpose of illustration. Such an occurrence is unlikely in reality, as such incompatibility doesn't generally arise in so few generations, but it isn't impossible.


Now let's look at a later snapshot.

Here, we can see the same cladogram but highlighting a later moment in time. The parent organism is now 'extinct', and no gene flow can occur between the remaining organisms, which means that they are different species. While there were still members of the parent species carrying a set of alleles that are compatible reproductively with members of both daughter species, they were a single species. Once the last member of that population dies or passes reproductive age, the divergence is complete. Thus speciation ALWAYS occurs with extinction of the parent alleles at a given moment in time, because it occurs at the moment at which the last member of the population through which gene flow can occur is removed from the population.

It's also really important to remember that the dots in these diagrams don't represent individual organisms, but populations. 


Another good way to visualise how extinction and speciation go hand-in-hand is to think about an extinction event in a ring species.

A ring species is a population of organisms that has spread out in a ring. A notable ring population is the ensatina salamander population of California (there is some controversy over whether the ensatinas are actually a ring species, but that's not massively important for our purposes). This is a population of several subspecies of salamander distributed around California. The diagram shows how they're distributed. 

The population starts at 1, and works its way around the ring clockwise. Subspecies 1 isn't reproductively compatible with subspecies 7, at top left, but it is with 2, and 2 with 3, etc. There's probably also some overlap, so 1 could be compatible with 3, for example. Again, for the purpose of this illustration, whether that's actually true of the ensatinas isn't really relevant, as we're just talking about the principles. 

Now let's look at what happens if there is a disaster, such as a bolide impact somewhere around the ring. 

We can see now that, because subspecies 3 and 4 have gone extinct, there is no gene flow between 2 and 5. Because the species at the ends of the ring are not reproductively compatible, gene flow between the Eastern and Western populations has now ceased, meaning that they are now separate species. This is extinction directly driving speciation.

All of the above is what those in the relevant fields call 'macroevolution', and they're all observed.

Of course, what the creationists are looking for when they use the 'kinds' nonsense is something that not only isn't predicted by evolutionary theory, but would actually falsify evolutionary theory at a stroke. What they're looking for is something along the lines of a fish giving birth to a vole, or some such. I hope I've already given enough information here to show what an absurdity that is.

Before we move on, we also need to take a look at one more term, a term that has caused much of the objection to evolutionary theory from the get-go. That term is 'random'.

It's a little-known fact that the earliest objections to Darwin's work didn't come from the religious; they came from a different quarter entirely, and a surprising one at that: physicists.

The big problem was that, since Newton, it was thought that the universe was just like a big clockwork machine. Pierre-Simon Laplace famously stated that, with Newton's work, all we needed to know was the position and velocity of every particle in the universe and we could predict with perfect accuracy any past or future state.

This is now known as Laplacian determinism. All the physicists of Darwin's day were Laplacian determinists. They couldn't countenance the idea that there were random elements at play in the universe. But what do we actually mean by random here? 

Random, in the way that I employ it, and in the way that it arises in evolutionary theory, means 'statistically independent'. It does not mean, as some suppose, 'uncaused'. It simply means that, of a range of possible outcomes, the one that actually occurs doesn't depend on any of the others, or on what has gone before.

To give a concrete example of something random, we can look at the decay of a single atom. The moment of decay of, say, an atom of caesium, is entirely random. It can happen any time from the moment the atom first arises to the heat-death of the universe. There's absolutely no way to predict when it will decay. Each of those moments, and thus the time of decay, is statistically independent.

So, Darwin had introduced the random and the probabilistic, and the physicists weren't too happy about it. Ludwig Boltzmann, father of statistical mechanics, cites Darwin as one of his major influences in its formulation and, crediting him with laying the groundwork that would ultimately result in quantum mechanics, describes him as the greatest physicist of the 19th century.

Anyway, a bit of a digression but, I think, an interesting one. 

So, there are random elements in evolution, both in the mutations that drive diversity, and the selection that attenuates it. But that can't be right, can it? Richard Dawkins says that natural selection is the opposite of random! And he's right, of course, but there's a danger of equivocation, so let's look at what he means, and what I mean, and see if there really is a contradiction.

Natural selection has to be looked at in two ways to be fully appreciated. The first is from the perspective of the population, at which level the effects of selection are seen. At this level, NS is most definitely not random, because it can be probabilistically quantified. At this level, we see that, on average, advantageous alleles are selected for, in the form of being passed on to future generations with a statistical weighting. We also see that, on average, deleterious alleles are selected against, in the form of not being passed on to future generations, again with a statistical weighting. This is what Professor Dawkins is talking about. It can't be said enough times that evolution is a population phenomenon, and statistical in nature.

The second way to look at NS is from the perspective of the individual organism, at which level selection actually operates. From this perspective, NS is random. The particular selection pressure that an individual organism will succumb to or indeed evade, is statistically independent, thus random. The organism with an allele that allows it to evade a particular selection pressure has statistical significance, but the means of checking out without issue are many and diverse, and which particular pressure said individual will fall prey to (pardon the pun) can only be treated in the broadest of terms. An organism can be the strongest, fastest, best-equipped predator, an alpha in every sense of the word but, like the ensatinas above, if he gets hit by a big flaming rock from outer space, he's fucked.


Properly, evolution is neither random nor non-random. It's stochastic, which means simply that future states of the system are contingent upon initial conditions plus one or more random variables.

A simple example of a stochastic system is ten coins. Put them down on a table. Some will be heads, some tails. These are your initial conditions. Pick one of the coins, execute a coin toss and put it back in place on the table with the winning side showing. These are your new initial conditions. Repeat this exercise as many times as you like. At each stage, the future evolution of the system is neither entirely random nor entirely non-random. It depends on the initial conditions, which are reset after each toss, and one or more random variables, in this case, the particular coin you choose and the outcome of the toss.

Now let's add a little complication. Coins can land on their edges. There's also a robot that doesn't like edges, so it knocks them flat. It's not a very good robot though so, sometimes, it will miss. Moreover, sometimes when it misses it will actually knock a flat coin up on its edge. 

We now have a system in which advantage plays a part, but where the efficacy of that advantage is itself statistical. The system will keep evolving, and the robot will tend to keep the numbers of the 'unfit' coins down, so that there is a statistical bias toward coins that are lying down, but the occasional edge case will still make it through the filtering process.
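
Here's the coin-and-robot model as a little Python sketch (the edge probability and the robot's hit and miss rates are my own toy numbers).

import random

# The coin-and-robot model sketched above. The state of the table plus a few
# random variables (which coin gets picked, how it lands, whether the robot
# notices an edge) drive the system; the robot is an imperfect filter biasing
# the population against edge-standing coins.

rng = random.Random(42)
coins = [rng.choice(["heads", "tails"]) for _ in range(10)]  # initial conditions

def toss():
    # a small chance of landing on an edge, otherwise heads or tails
    return rng.choices(["heads", "tails", "edge"], weights=[48, 48, 4])[0]

for step in range(1000):
    i = rng.randrange(len(coins))  # pick a coin at random
    coins[i] = toss()              # re-toss it: these are the new initial conditions
    for j, coin in enumerate(coins):
        if coin == "edge" and rng.random() < 0.8:
            coins[j] = rng.choice(["heads", "tails"])  # robot knocks it flat
        elif coin != "edge" and rng.random() < 0.01:
            coins[j] = "edge"                          # robot fumbles and flips one up

print("final state:", coins)
print("coins left on their edge:", coins.count("edge"))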

And of course that's what selection is: a filter. Like most filters, it isn't perfect. Some organisms carrying advantageous alleles for the environment will be filtered out by being subjected to a bolide impact, or a random snake-bite, or spider-bite, or losing their footing at the top of a cliff, or some such. Thus, at the level of the individual, selection is random. At the level of the population, there's a heavily-weighted bias toward advantageous alleles.

Even with entirely random events, we can extract statistical information. The science of radiometric dating is based on this. The decay of an individual atom is entirely random, but get a large enough collection of them and we can put numbers on how quickly a sample will decay. We call it the half-life, and it describes the time it takes for half of a given sample to decay. I won't dwell further on isotopic decay here, because I have plans for that topic on the table.
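
Here's a toy Python sketch of that point (my own illustration): the decay of any one atom is individually random, but a big enough sample halves, near enough, every half-life.

import random

# Each atom decays at a completely random moment, but a large sample decays
# in a statistically predictable way: on average, half of it is gone after
# every half-life.

rng = random.Random(1)
steps_per_half_life = 10
p_survive_step = 0.5 ** (1 / steps_per_half_life)  # per-step survival probability

atoms = 100_000
for step in range(1, 5 * steps_per_half_life + 1):
    # each surviving atom independently either survives this step or decays
    atoms = sum(1 for _ in range(atoms) if rng.random() < p_survive_step)
    if step % steps_per_half_life == 0:
        print(f"after {step // steps_per_half_life} half-lives: {atoms} atoms remain")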

Before finally moving on, a quick word about fitness.

In the popular view of evolution via natural selection, the common catch-phrase is 'survival of the fittest'. It's important to know just what's meant by fitness here, not least because a straight vernacular treatment of it would tend to suggest that the strongest survive, when that's simply not the case. That phrase only has meaning when the full technical definition of 'fitness' is employed, in which it's a measure of performance against an expected average with regard to number of offspring. In short, what defines fitness is reproductive success. As alluded to above, even the strongest, fastest organism can fail to be represented in future generations. Indeed, what actually defines a trait as advantageous or deleterious is not a function of the trait itself, but a function of the environment the trait finds itself in. 

A clear, if less than wonderful, example of this is the sickle gene. One would expect, on a narrow, cursory assessment of this gene, that it should be selected against, because if you carry two copies, the likelihood is high that you'll suffer from sickle-cell anaemia, probably from about your early to mid twenties. In regions where malaria is rife, however, carrying a single copy increases resistance to malaria, so the gene isn't selected against; it's selected for. In fact, since sickle-cell anaemia usually doesn't take its toll until well into the reproductive years, it isn't strongly selected against even in places where malaria is not prevalent.

So, now we've dealt with what evolution is and, more importantly, what it isn't, let's look at a couple of cases. The literature is replete with observations of speciation, allele variation, fixation and extinction. I'd wanted to aim for brevity in this post, and I'm keenly aware of how spectacularly I've failed (and that's even having left out a fair bit for now), so I'm going to restrict myself to two examples. Since I want to expend some real estate on the latter, I'm going to be inversely verbose on the former. I'll provide links at the bottom to some useful resources on observations of evolution in action.

The first is a long-running experiment using bacteria, Escherichia coli, at Michigan State University, overseen by Richard Lenski. Starting with just 12 populations of the same strain of E. coli, the experiment has been running since 1988, and passed the 60,000 generation mark some two years ago. This experiment has seen several speciation events, along with the evolution of new traits, such as the ability to transport citrate into the cell in an aerobic environment. While these bacteria could already process citrate, they were unable to use it as an energy source in aerobic environments. The results of this research have been pretty spectacular, and the experiment is ongoing.

The details can be found in their 2008 paper, Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. I'll include a link to the paper at the bottom.

The second example is that nice surprise I was hanging onto from earlier, and it involves a speciation event in butterflies.

In a study funded by the Smithsonian Tropical Research Institute and others, Jesús Mavárez et al spent some time studying several species in the genus Heliconius. They noted that Heliconius heurippa was intermediate in wing pattern between two other species, H. melpomene and H. cydno, and hypothesised that this was the result of hybridisation. Genetic analysis lent weight to this when it was discovered that certain alleles carried by H. heurippa were a mixture of alleles from H. melpomene and H. cydno.

But it gets even more interesting!

The experimenters decided to do some cross-breeding in the lab between H. melpomene and H. cydno to produce 'F1' hybrids, and then back-crossed the fertile males to females of both species. When breeding with H. melpomene, the melpomene wing pattern returned. However, when they back-crossed with H. cydno to produce an 'F2' generation, and then selectively bred the F2 generation, the intermediate wing pattern emerged, almost identical to the wild heurippa.

Were they finished? Not likely.

Finally, they bred the lab-bred heurippa with the wild specimens, and they produced fertile offspring with the heurippa wing pattern.

Similar speciation models are now being pressed into service to explain, among other things, diversity in the cichlid fish populations of lakes Victoria, Malawi and Tanganyika.

The above represents an absolutely minuscule amount of the evidence in support of evolution. The astute reader will have noticed that evolutionary biology isn't really my area of expertise. My penchant is for physics, and I can say without fear of contradiction that most of the physicists I know would sell their grandmothers for the amount of evidence there is for evolution in their own fields. Evolutionary theory is about the best-supported theory in all of science.

Gravity and Free-Fall

Had a discussion with physicist and author Marcus Chown yesterday on Twitter about gravity, and how bodies in free-fall don't experience it. We were trying to explain this to somebody, and were struggling, partly due to the character limit. I went looking for a video demonstrating it, but struggled to find one, despite the fact that I know Neil deGrasse Tyson used a similar example in a scicafe lecture some years ago, so I made one.

The original was a bit rough, but I re-shot it today and made some alterations to the flow to make it an easier watch.

This is one of the beautifully elegant physics experiments that anybody can do at home, pretty much cost-free.

Apologies to Matt Damon for the use of fresh water. It was only a bit.


Will expand this post at some point for a deeper explanation with some equations, but wanted to share.

Enjoy!

The Certainty of Uncertainty

One of the more difficult things to grasp in modern physics is the vast range of consequences of the central law of quantum mechanics, Heisenberg's Uncertainty Principle (HUP), some of which can be extremely subtle. This particular wander through thought-space is an attempt to clarify some things, not least what I see as a major misconception concerning a recent popular physics book, Lawrence Krauss' A Universe From Nothing.

This particular post has been pinging about the pinball machine that is my mind for a few years, pretty much since the release of said tome, motivated by several discussions I've had with various ISBNbots, creationists and even some with a degree of scientific literacy. It is, of course, not remotely beyond the realm of possibility that the misconception is mine, so I look forward to any corrections, as always. This is my own précis of the material, and my reading of what Krauss said in the book and in the lecture that preceded its publication, which I highly recommend, by the way.

I don't intend to treat Krauss' book in any detail on this outing, because the ideas themselves will form part of the Before the Big Bang series. Here, I simply want to look at the consequences of some of the things he talked about in that book, along with some of the things he didn't, and to cover some of the implications of HUP aside from how they apply to particles. I'll also be covering what the evidence is for the effects those implications predict. Of necessity, some of what follows is going to deal with topics I've touched on before, so apologies to regular readers.

Let's start with a little potted history; in this case, the history of our ideas about light:

In the 5th century BCE, Empedocles of Agrigentum postulated the cosmogenic theory of the four classical elements; earth, air, fire and water. He believed that the eye had been made from all four elements by Aphrodite, and that the fire shone from within the eye in rays, making sight possible. This was not without obvious problems, such as why we could see better during the day than at night. Empedocles proposed an interaction between the rays from our eyes and the rays from other sources to account for this.

Hit the timewarp button briefly, and whisk yourself forward to about the late 4th or early 3rd century BCE, and pay a visit to somebody whose name is familiar to teenagers the world over: Euclid. Famous for his book Elements, still the foundational work on geometry today, he also wrote a book on optics, treating reflections geometrically, and challenging the view that light propagated from the eye on the basis that, on opening your eyes at night, it should take a finite amount of time for you to see the stars. The only known solution to this was to propose that light travelled infinitely fast.

For many centuries, the debate between light as particles and light as waves bounced back and forth, with the sway mostly going to the wave models. Descartes thought that light was a mechanical property of the light-emitting body and the transmitting medium. His ideas appeared in a work on refraction and, while some of his conclusions were wrong (he thought that light travelled faster in a denser medium, because sound does), his theory formed the foundation for mechanical optics. 

Christiaan Huygens proposed a mathematical wave theory of light, and postulated the 'luminiferous aether', as waves were thought to require a medium in which to travel. Newton, on the other hand, thought light was made of particles or, as he called them, 'corpuscles', on the basis that waves could turn corners, while light travelled in straight lines.

Newton's model became widely accepted and ruled the roost for over a century until, around 1800, Thomas Young devised an experimentum crucis that seemed to nail the case in favour of waves. It was known that waves could interfere with each other, so a suitable experiment should show a characteristic pattern of interference. Where the peaks of two waves meet, the resulting peak is higher; where the troughs of two waves meet, the resulting trough is deeper; and where the peak of one wave meets the trough of another, they cancel each other out. Music production professionals use this fact in a test known as a 'null' test, in which a recorded signal is copied, inverted in phase, and summed with the processed version, to ensure that nothing is being added to the signal during processing. Because the original signal is cancelled out, any sound that remains has been added during processing. As an aside, this is also how expletives are removed from recordings for radio broadcast, and how noise-cancelling headphones work.
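Here's a minimal sketch of that cancellation in code; it's an illustration only, since real null tests are done inside audio software rather than in Python:

    import numpy as np

    # A signal summed with a phase-inverted copy of itself cancels exactly,
    # so anything left over must have been added somewhere along the chain.
    sample_rate = 44_100
    t = np.arange(0, 0.01, 1 / sample_rate)

    signal = np.sin(2 * np.pi * 440 * t)   # a 440 Hz tone
    inverted = -signal                     # the same tone, 180 degrees out of phase

    residual = signal + inverted
    print("Largest residual sample:", np.max(np.abs(residual)))   # 0.0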

Young's experiment involved shining a beam of light through a board with two slits in it some way apart. If light was made of particles, one would expect to see two lines on the back screen. If it were made of waves, one should expect to see many lines, indicating interference, as in the figure below.


http://www.viewzone.com/light.xy.gif

What he saw was, of course, the interference pattern indicating waves. One would have thought that the case was closed.

This experiment has huge importance, so we'll be coming back to it.

Another important issue was polarisation of light. Anybody who has polarising sunglasses will be aware that the light allowed through changes based on how the lens is rotated, thus:

In 1816, André-Marie Ampère proposed that this could be explained if light 'waved' orthogonally (at right angles) to the direction of travel. Augustin-Jean Fresnel, building on this, worked out a wave theory of light, later added to by Siméon Poisson, which finally overturned Newton's view.

In 1845, Michael Faraday established experimentally that the plane of polarisation of light could be rotated by applying a magnetic field as the light passed through a dielectric - an insulator that can be polarised by an applied electric field - an effect known as Faraday rotation. This was the beginning of the idea that light and electromagnetism were related in some way.

By 1850, the speed of light had been measured reasonably robustly for the first time, by Léon Foucault, famous for his pendulum experiment showing that the Earth rotates. He devised a beautiful experiment along with Hippolyte Fizeau and, after some refinements, settled in 1862 on a figure of 298,000 km/s, partly so that he could derive an accurate figure for the astronomical unit (AU), the mean distance between Earth and the sun.

James Clerk Maxwell, inspired by Faraday's work, studied electromagnetism and light in some depth, and worked out that electromagnetic waves propagate at the same speed as light, which led him to infer that electromagnetism and light were the same thing. This was later confirmed by Heinrich Hertz, who showed that radio waves exhibited all the same behaviours, namely reflection, interference, refraction and diffraction. Further, and importantly for our purposes here, Maxwell's equations described fields. I should also note here that Maxwell's use of the speed of light in his field equations included no term for the motion of the source or the observer. Some sources suggest that the term for the speed of light was only used for mathematical consistency, but this is difficult to confirm, especially in light of the consilience between his measurements and those of Foucault.

The one question that hadn't been answered with any rigour was how light waves propagated without a medium. The luminiferous aether had been around as a concept since Huygens, but nobody had been able to demonstrate its existence. Several possibilities were considered: first, that the aether was stationary and only partially dragged by Earth in its orbit; second, that the aether moved with the Earth. Eventually, the latter hypothesis was discarded in favour of the former, a conclusion owing a great deal to Fizeau.

Then, via some very clever reasoning, it was worked out that, given a stationary aether, the motion of the Earth through it should generate an 'aether wind', which should be detectable because, at different points in Earth's orbit, the planet is moving in different directions relative to the aether. In 1887, Albert Michelson and Edward Morley, of what is now Case Western Reserve University, devised an experiment using an interferometer - a huge, laser-based descendant of which is making a big noise in the news today, in the form of LIGO, the gravitational wave observatory - that should be able to detect differences in the measured speed of light toward, away from, and perpendicular to the motion of the planet, making the aether detectable. This is the now-famous Michelson-Morley experiment, and it was really elegant.

In essence, it's a single beam, split into two and sent along two arms at right angles to each other and then reflected back. Because the two arms were the same length, any difference in the speed of light along the two paths should manifest at the detector as an interference pattern, because of the shift in phase, in exactly the same way as in Young's double-slit experiment. One arm was aligned with the direction of motion, with the other arm orthogonal to it.
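To get a feel for what they expected to see, here's a back-of-the-envelope sketch using commonly quoted figures for the 1887 apparatus; treat the numbers as illustrative rather than gospel:

    # Classical (stationary-aether) expectation for the Michelson-Morley fringe shift.
    c = 3.0e8            # speed of light, m/s
    v = 3.0e4            # Earth's orbital speed, m/s
    L = 11.0             # effective arm length, m
    wavelength = 5.5e-7  # visible light, m

    # Rotating the apparatus through 90 degrees swaps the roles of the arms,
    # so the classical expectation is roughly 2 * L * v^2 / (c^2 * lambda) fringes.
    fringe_shift = 2 * L * v**2 / (c**2 * wavelength)
    print(f"Expected shift: ~{fringe_shift:.2f} fringes")   # ~0.4; they saw essentially none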



This has gone down as about the most famous null result in the history of physics. Everybody expected the result to be positive, and for some years, many people were trying to work out what had gone wrong with the experiment.

Then, in 1905, along came a man working as a patent clerk in Switzerland, and he changed everything. First, he published a paper entitled On a Heuristic Viewpoint Concerning the Production and Transformation of Light, a paper on the photoelectric effect. This paper comprehensively put the cat among the pigeons, because it showed that light came in discrete units or 'quanta' (what we now call 'photons'), opening up the wave/particle debate that everybody had thought closed since Young.

He followed that up with another paper entitled On the Motion of Small Particles Suspended in a Stationary Liquid, as Required by the Molecular Kinetic Theory of Heat, a paper on Brownian motion, which provided the first hard empirical evidence in support of atomic theory, and opened physics up properly to statistical methods. 

The photoelectric paper in particular is widely regarded as one of the foundations of quantum theory - a theory whose standard interpretation Einstein himself never accepted - and it was that work that his 1921 Nobel prize in physics specifically cited.

Finally, he released a third paper, On The Electrodynamics of Moving Bodies, which brought with it an entirely new way of dealing with space and time, and overthrew Newtonian mechanics, which had stood for about 250 years without serious challenge. We've come to know it as the special theory of relativity (special because it deals with a specific set of circumstances, one in which gravity plays no part). It's unclear whether the Michelson-Morley result entered Einstein's thinking in formulating this theory, though he was certainly aware of it. Einstein himself said that he merely ran with the appearance of c in Maxwell's equations with no mention of the motion of source or observer, and tried to work out what it would imply for it to have the same value for every observer, regardless of motion. In order to accommodate this, space and time had to stretch and squeeze, and in fact they had to be unified into a single entity, spacetime.

Clearly, there was a problem here. One validated theory tells us that light is made of particles, while another validated theory tells us that it's made of waves, so what gives? Is this a contradiction? To answer this question, we need to leave light for a bit and talk about heat. Specifically, black body radiation.

Meanwhile, some work had been going on in a different field, beginning a few years before Einstein's famous papers, with Max Planck working on problems with black-body radiation. He'd been trying to work out the energy in an oven. He'd begun by adding up all the frequencies of energy that should be contributing and, to his surprise, he discovered that the energy should be infinite. This was obviously nonsense, or a bit of melted chocolate would have been a forgotten footnote in the history of physics, rather than the basis for a new culinary technology. Clearly something was wrong, but what was it? After much mucking about with the equations, he realised something interesting, and it's all to do with how waves behave.

Look at this picture. It shows a periodic sine wave. You can see that the wave cycle begins at one edge of the image and ends at the other. It also illustrates the zero-point, which is where the amplitude of the wave is zero. What Planck realised was that, if he included in his calculations only those frequencies of energy whose wave returned to the zero point exactly at the wall of the oven, the calculations worked out and gave the correct energies. Any frequency whose wave comes back to the zero point at the wall is allowed, even if it arrives there halfway through a cycle.
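Put a little more concretely - this is a sketch in modern notation, not Planck's own working - the waves that fit in a cavity of width L are those whose half-wavelengths divide the width exactly, and the energy carried at a frequency ν comes only in whole multiples of a basic unit:

\begin{equation} L = n\frac{\lambda_n}{2} \quad \Rightarrow \quad \lambda_n = \frac{2L}{n}, \qquad n = 1, 2, 3, \ldots \end{equation}

\begin{equation} E_n = n h \nu \end{equation}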

He realised that this meant that energy was quantised, which meant it came in discrete units. If you couldn't get back to the zero line at the wall, you couldn't join the party. This meant that any of the following were perfectly acceptable.

 
While the following are not:

This was the birth of Quantum Mechanics, and it brings us to Heisenberg's Uncertainty Principle. 

So what is it?


HUP is the central law of quantum mechanics, named after Werner Heisenberg, who formulated it. It deals, in a nutshell, with how much information can be extracted from a system. In natural language, it says that, for any of several pairs of quantities known as 'conjugate variables', the more information we extract about one of the pair, the less information we can extract about the other.

Here's the critical equation again, for the pair of variables 'momentum' and 'position':

\begin{equation} \Delta p \Delta x \geq \hbar/2 \end{equation}

Where Δ (delta) denotes uncertainty, p denotes momentum, x denotes position, and ħ (h-bar) is the reduced Planck constant. The Planck constant is given in joule seconds and has the value 6.626×10⁻³⁴ Js. The reduced Planck constant (also known as Dirac's constant) is obtained by dividing this value by 2π, giving 1.055×10⁻³⁴ Js.

What the equation tells us is that the uncertainty in momentum multiplied by the uncertainty in position can never be less than this tiny number, ħ/2. The pair of conjugate variables most discussed is the momentum and position of a particle, but there are many such pairs, such as angular momentum and orientation, energy and time, etc. 

Let's simplify by looking at a single particle, so that we can see close-up how this principle applies in the real world (for a given value of real; let's not get ontological). In this case, we want to put our particle in a box. Since we don't have electron microscopes for eyes, we won't actually be able to see the particle.

Here's our box, then:


Our particle is wandering around the box just as you'd expect (this is a simplification, for reasons that will become clear). Now let's shrink the box a little:

Not much has changed except, of course, that now we've pinned down the position of our particle more precisely, which means, as per the uncertainty principle, that we've lost some information about its momentum. The net effect of this tends to be an increase in the spread of possible momenta, so that the particle is typically moving around more quickly than it was when the box was larger.

Now let's shrink the box even further: 

 
Now the particle is fairly whizzing about. The behaviour becomes so extreme that, the more you shrink the box, the more probable it is that, when you look inside, the particle won't even be there! It will have tunnelled through the wall of the box and will now appear on the outside, or even on the other side of the universe.

I should also note here that, even in the case of the larger box, there is a non-zero probability that the particle will be found on the other side of the universe, but this probability increases the more you constrain the position of the particle. Moreover, because of this relationship, and because you can't pin down the momentum of the particle once its position is constrained, you also can't tell where the particle will be a moment later.

This is the uncertainty principle in a nutshell.  
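To put some rough, illustrative numbers on this, using nothing but the inequality above, here's what confining an electron ever more tightly does to the minimum spread in its momentum:

    # Minimum momentum uncertainty allowed by delta_p * delta_x >= hbar/2
    # as we shrink the 'box' around an electron. Illustrative numbers only.
    hbar = 1.055e-34   # reduced Planck constant, J s
    m_e = 9.109e-31    # electron mass, kg

    for delta_x in (1e-9, 1e-10, 1e-11):      # 1 nm, then a tenth, then a hundredth
        delta_p = hbar / (2 * delta_x)        # minimum allowed by the inequality
        delta_v = delta_p / m_e               # the corresponding spread in speed
        print(f"delta_x = {delta_x:.0e} m -> delta_p >= {delta_p:.1e} kg m/s "
              f"(a speed spread of roughly {delta_v:.1e} m/s)")

Shrink the box by a factor of ten and the minimum momentum spread grows by the same factor; by the time the box is a fraction of an atom across, the implied speeds are enormous.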

Just as an aside, two extremely interesting things come about as a result of this process. The first is hugely important to us, because without it we couldn't live. This 'quantum tunnelling' process is largely responsible for fusion in stars, because tunnelling is what allows hydrogen nuclei to overcome the Coulomb barrier; without it, the probability of fusion would be massively reduced. The second is quite important to me today because, without it, I wouldn't be able to share these thoughts with you. Why? Because quantum tunnelling also underpins some of the electronics I'm employing to write this missive. The effect found practical application in a device known as the Esaki diode, after Leo Esaki, then a researcher at what would later become Sony, who shared the 1973 Nobel prize in physics for his experimental demonstration of tunnelling in semiconductors. Tunnelling is also how flash memory is written and erased, and it's one of the leakage headaches chip designers have to fight as transistors shrink.

Let's go back to our double-slit experiment, because we really need to try to resolve the dispute between particle and wave for our photons. This time, though, we're going to look at it slightly differently.

One might think, given that we have good reason now to suppose that light is indeed particulate in nature, that the interference pattern arises because the photons from our beam are interacting with each other, given the huge number of photons being fired at the screen. So what happens when we fire one photon at a time? Surely, we should now see the particle pattern of two lines on the detector screen? Let's see. Here's a time-lapse of the experiment done for real:

https://giant.gfycat.com/BoilingEqualGroundhog.gif


As you can see, even firing individual photons results in the same interference pattern, which means that the photons must be interfering with themselves! That must mean that single photons travel through both slits at once! Can this shit get any weirder?

Actually, yes it can. What happens when we place photon detectors next to the slits?

http://www.viewzone.com/light.detectors.gif

What? The interference pattern vanishes and we get a particulate result of two bands, as if the photons 'know' when we're looking at them!

Many iterations of this experiment have been done, including 'quantum eraser' setups that tagged the photons en route and then erased the tag, restoring the interference. Any method we can devise to determine which slit individual photons went through results in the same thing. There are even iterations, known as 'delayed choice' experiments, in which we decide whether or not the detectors are on after the photons have left the source. No matter what we do or how sophisticated and Blackadderesque our cunning, if we extract 'which path' information from the system, the interference pattern disappears and we get a particulate result.

The only reasonable conclusion we can draw is that light behaves like both a particle and a wave, depending on how we interact with it. 

What about other particles? There's no solace there, either. This experiment has been conducted with all manner of particles, and even with larger objects, like buckyballs (buckminsterfullerenes; hat-tip to chemist Sir Harry Kroto, who co-discovered them, and who died a few days ago). The result is always the same, which tells us that all particles display particle-like and wave-like behaviour. This is the famous 'wave-particle duality'. Some suggestions have been made to explain this, such as 'wavicles', but the best answer comes, we think, from quantum field theory (QFT), namely that there is no particle, and no wave, but that both behaviours are manifestations of the behaviour of something else, and that something else is where we repair to next.

The interesting implication in our immediate context is what happens when we apply this principle to fields.

Spacetime is pervaded by many fields (some would argue that spacetime itself is a field). At its most basic, a field is simply a description of the changing values of some parameter from place to place and from time to time. Maxwell's equations for electromagnetism, for example, are field equations; that is, they describe the electromagnetic field as it varies from place to place and time to time. In QFT, all our interactions are manifestations of the behaviours of fields. Where the parameters of a field have the values, for example, 0.511 MeV/c² mass (particle masses are given in electron-volts divided by c² - units of energy, because E=mc², meaning that m=E/c² - or as multiples of the proton mass), -1 charge and spin ½, the field manifests itself as an electron.
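As a quick illustration of that unit conversion - just the m = E/c² arithmetic, with rounded constants:

    # Converting the electron's 0.511 MeV/c^2 into kilograms via m = E/c^2.
    MeV_in_joules = 1.602e-13    # 1 MeV expressed in joules
    c = 2.998e8                  # speed of light, m/s

    m_electron = 0.511 * MeV_in_joules / c**2
    print(f"Electron mass: {m_electron:.2e} kg")   # roughly 9.1e-31 kg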

The take-home here is, of course, that when we interact with a field in a certain way, we see a particle. When we interact with it in another way, we see a wave. In short, neither manifestation actually exists in and of itself, only as a behaviour exhibited via our interactions. This is the famous and oft-misunderstood 'observer effect' (most often misunderstood as requiring the observer to be conscious; most interactions involve photons, which are not, as far as can be ascertained, conscious).

When we apply HUP to a field, we employ the same equation given above, but now our parameters 'position' and 'momentum' become the parameters 'value' and 'rate of change'. Thus, applying the uncertainty principle to a field tells us that the more closely we constrain the value, the more uncertain we are about the rate of change and, just as with our example of the particle in the box, the more uncertain we are about what that value will be even a split-second later. In short, the value itself, and the rate of change, don't have any independent existence; they're simply the results of interactions.
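A closely related form of the same relation, and the one usually invoked when talking about the 'borrowed' energy of the vacuum fluctuations we'll meet in a moment, pairs energy with time:

\begin{equation} \Delta E \Delta t \geq \hbar/2 \end{equation}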


When we put all of the above together, an interesting consequence arises, and this is where we really begin to make contact with what Lawrence Krauss was talking about in his book and lecture. That consequence is, of course, that 'nothing' can't persist. In order for that to occur, we'd be looking at a field whose parameters could both be known with arbitrary precision and absolute certainty, because both parameters would be zero. There would be no value and no rate of change, which clearly violates the uncertainty principle. In short, there must be something. 

Is there experimental evidence in support of this conclusion? You betcha!

For that, we need to look at a thought-experiment by one of those crazy Dutchmen, in this instance, physicist Hendrik Casimir:

The uncertainty principle implies that a field cannot remain at a fixed value, including zero. This, in turn, implies that there must be activity in the field all the time, which would manifest itself as energy. Touching on what Planck discovered about waves, as discussed above, Casimir reasoned that, if one were to put two uncharged conducting plates very close together, there should be wavelengths of energy disallowed between them, because they couldn't complete a half or full cycle between the plates, while all wavelengths would be allowed outside the plates, where there would therefore be greater energy. This should generate a differential pressure, forcing the plates together.
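For ideal, perfectly conducting plates separated by a small distance d, the standard result for the magnitude of that attractive pressure (stated here without derivation) is:

\begin{equation} \frac{F}{A} = \frac{\pi^2 \hbar c}{240\, d^4} \end{equation}

Note the very strong dependence on the separation: halve the gap and the pressure goes up sixteen-fold, which is why the effect only becomes appreciable when the plates are extremely close together.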

 


This experiment has been conducted under lots of conditions, including in a vacuum bell jar, and the effect has been measured, exactly in line with predictions. These energy fluctuations, which have come to be called 'virtual particles' (virtual because they're short-lived, not because they aren't real), borrow energy from spacetime in the form of a differential, manifest as virtual particle pairs, move apart a little, and then come back together and annihilate.
 
This is what Krauss means when he says that what we think of as 'nothing' isn't really nothing, because the vacuum of space is seething with activity in these fields. However, he isn't, as he's been accused of, redefining 'nothing' because, absent those interactions, there really is precisely nothing. Those particles arise from nothing, and then return to nothing.


What about the universe? What Krauss is proposing is that, given a sufficiently large fluctuation, the universe can literally arise from nothing, via the same mechanism and with the same underlying principle: Heisenberg's Uncertainty Principle.

I look forward to any corrections or comments.