Archive for: August, 2012

The march of progress ...

Aug 05 2012 Published by under Uncategorized

This will be my last post here at Scientopia. I've enjoyed my time here. Thank you for reading my posts; I hope you have learned something new in the past two weeks. Now that my guest fortnight is over, I will be returning to my Blogspot account at the Genealogical World of Phylogenetic Networks. Please drop by, especially on Mondays, when there might be something of more general interest.

In my posts here I have written much about the essential difference between transformational and variational evolution. I pointed out that the former inappropriately supports a "march of progress" story, in which "primitive" organisms evolve into "more advanced" ones. This is a particularly appealing story if you are a human being and happen to think that humans are the peak of evolution; but it is wrong as far as biological science is concerned.

I am very sorry to report that the Federation of American Societies for Experimental Biology (FASEB), which describes itself as "the policy voice of biological and biomedical researchers" in the U.S.A., has not only heard of this story but apparently believes in it. This image of an Advocacy Card is taken from their web site:

You can see that FASEB supports evolution as a science but apparently does not understand it. Actually, I am sure that FASEB does have thousands of members who understand evolution; it's just that none of them saw this Card before it was produced. Still, there is little practical use to having FASEB advocate evolution education if they can't even get their ads right. They need to educate themselves first.

Even worse, as far as I am concerned, is that FASEB was giving away these bumper stickers at the recent 20th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB - July 2012, in Long Beach):

This ironic circumstance was drawn to my attention by the Byte Size Biology blog, which discusses the "march of progress" misconception in more detail.



Forecasting and predicting sports results

Aug 04 2012 Published by under Uncategorized

In a recent post over at the Phylogenetic Networks blog, I performed a network analysis of the historical results of the FIFA World Cup (a.k.a. soccer). During this process I noticed that much has been written about our ability to "predict" sports results. However, it occurred to me that some fairly simple points about forecasting and prediction have not been emphasized in these discussions; and so I thought that it might be interesting to cover them here.

The first thing to do is make clear the distinction between forecasting and prediction. Forecasting says: "If things continue the way they are, and have been in the recent past, then we forecast that this is what will happen next." Prediction, on the other hand, tells us what will happen next irrespective of what is happening now. In other words, forecasting does not take into account unforeseen circumstances while prediction does.

For example, the weather bureau makes suggestions about tomorrow's weather based on what the weather has been like over the past few days. This is a "weather forecast" not a prediction. In order to make a prediction, the bureau would need to know about all sorts of future events, such as up-coming eruptions of volcanoes, that they know nothing about, and they would also need to know about exactly how weather events in one location will affect other locations.

The main thing to recognize about forecasts is that they only have relevance during the period of time while things continue as they are. For example, the weather bureau forecasts the weather only up to 10 days in advance, because past that time today's weather has little direct influence. Moreover, you will have noticed that even when they do forecast tomorrow's weather accurately, their forecasts several days into the future get worse and worse, because more and more "unforeseen circumstances" can arise over time.

Forecasting has three uses: (i) people who like the forecast outcome can plan for it; (ii) people who don't like the outcome can try to stop it; and (iii) we can learn more about prediction by comparing the forecast to the actual outcome. Use (i) says, for example, that if the weather bureau forecasts sun tomorrow then we can plan to go on a picnic; but it is (ii) that is perhaps the most interesting use. A good example of use (ii) is the "hole in the ozone layer" that was a big issue a few years ago. The forecast was that there would be global problems if things continued the way they were going (the hole was getting bigger). Few people liked that forecast outcome, and so the governments of the world got together to try to work out how to change things. In effect, we have tried to make sure that the forecast does not come true (which it would have done if we hadn't done anything). This is a notable victory for the forecasters (who didn't like the outcome, either).

I am not sure what use prediction is. If we actually knew what was coming next then life would be hard to live. Imagine actually knowing all of the bad things that are going to happen to you (in among the good things, hopefully), and knowing exactly where and when and how you are going to die. It doesn't bear thinking about! So, I will leave prediction to the crystal-ball gazers.

This brings us to sport. I presume that we do not want an accurate prediction of the outcome of any sporting event. I suspect that if everyone knew the outcome then very few people would turn up to watch it. Indeed, I have seen people leaving the arena in droves once the outcome was beyond doubt, even if the event was not yet over. Much of the pleasure of watching sport comes from not knowing how it will turn out. This makes sports different from the weather.

Nevertheless, we can ask: to what extent can we forecast the outcome of sporting events? In my mind, this depends very much on the type of event. Here are two lists of competitive events for you (not all of them are sports).

List 1

Running (track and cross-country)
Skiing (downhill and cross-country)
Snooker (and billiards)
Figure skating
Motor racing (cars, bikes, trucks, boats)
Show jumping

List 2

Football (all types)
Hockey (both ice and field)
Water polo

You will notice some general differences between these two lists. For example, in the first list the competitors often work alone or in pairs whereas in the second list they usually work in teams. Also, in the second list the competitors directly interact with each other continuously during the event, whereas in the first list they mostly compete alongside each other (except tennis).

However, neither of these things is the basis of the separate lists. What is consistently different is that in the first list the events terminate when a pre-specified goal is reached, whereas in the second list they terminate after a pre-specified period of time has elapsed. For example, in tennis the winner of the match is the person who first wins a specified number of sets, and the match continues for as long as it takes for one of the competitors to achieve that objective. In all types of football, on the other hand, the game stops when the whistle or hooter is sounded, and the winner is whoever is ahead at that second — it is irrelevant what the situation was one second before, or might have been one second later.

What is most important for forecasting is that a competitor's superiority "on the day" plays a much larger part in the outcome of the event for the first group ("objective-terminated") than for the second group ("time-terminated"). There are simply too many "unforeseen circumstances" (usually referred to as good or bad luck) that can affect the event at the precise moment the event ends, so that the effect of overall superiority is reduced (but not eliminated!) in the List 2 events. Instead of being "best on the day" the winner is "the best at the final second". This means that forecasting will usually be much less successful for the second group than for the first group.

This difference has several consequences for the way competitions tend to be organized. Notably, most of the activities in the first list occur as a series of one-off competitions consisting of a small series of events, and the competitors choose which competitions they will take part in during any one year. In tennis or golf, for instance, there are world-famous competitions in which a winner is declared to be "the best that week" (eg. there is a "US Open" in both tennis and golf). Winning those competitions is important, but each competition stands on its own. On the other hand, the activities in the second list usually have a season full of connected events that are compulsory for all competitors, with the overall winner declared only at the end of the season. This full season is needed because of the reduced effect of superiority on the outcome of each individual event. The way to find out who is "the best" is to have a lot of events, so that all of the "unforeseen circumstances" average out over the whole season.
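The averaging-out argument can be illustrated with a small calculation. This is only a sketch under simplifying assumptions that are not in the post: two teams meet repeatedly, games are independent, there are no draws, and the better team has a fixed (hypothetical) 55% chance of winning any single game. The function name and the numbers are illustrative only.

```python
from math import comb


def prob_wins_majority(p_win: float, n_games: int) -> float:
    """Chance that a team with per-game win probability p_win
    wins more than half of n_games independent games (no draws)."""
    needed = n_games // 2 + 1  # smallest number of wins that is a majority
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(needed, n_games + 1)
    )


# A slightly better team is far from certain to win one game,
# but its edge shows more reliably as the "season" gets longer.
for n in (1, 11, 51, 101):
    print(n, round(prob_wins_majority(0.55, n), 3))
```

Under these assumptions the probability that the better team comes out ahead rises steadily with the number of games, which is the sense in which a long season lets the luck average out.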

Another consequence of the difference between the two lists is that for the first group there are often ranking systems, listing the competitors in order of superiority based on their recent competition results. This allows a "winner" to be declared based on a whole year full of unrelated competitions, but it also serves to indicate who are the superior competitors at any one moment, and how big their level of superiority might be. These lists are updated after each competition. The point here is that such a list is only of use if recent results reflect superiority. For the activities in the second list this is not always so.

Another possible use of a ranking list is forecasting, of course. In this case, the forecast outcome of any specified event will be based on where in the ranking the competitors are at the time — the higher you are up the ranking then the more likely you are to be the winner.
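One common way of turning a rating difference into a forecast is the logistic formula used by Elo-style rating systems. The sketch below uses the standard chess form of that formula with its usual scale of 400 points; whether any particular soccer ranking uses exactly this form is an assumption, and the ratings shown are made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score (roughly, win probability) of competitor A against B
    under the standard Elo logistic formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Hypothetical ratings: the higher-rated side is favoured, but not certain to win.
print(round(expected_score(1900, 1800), 3))
print(round(expected_score(1800, 1900), 3))
```

The two expected scores always sum to 1, and equal ratings give 0.5 each, so the forecast depends only on the gap between the competitors.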

I will provide a concrete example here by looking at the sport of soccer (football). The Fédération Internationale de Football Association (FIFA) World Cup™ competition has been played every 4 years since 1930 (except 1942 and 1946, for obvious reasons). The last competition was in 2010, and what we are going to try here is to forecast the outcome for the 32 competing teams. Given what I have said above, this is not likely to be a successful exercise, because in soccer each game terminates after a specified time rather than having any specified goal for the competitors to achieve.

For those of you who don't know, in soccer the ball is round, a team has 11 players (plus a substitute or three), and a match takes 90 minutes (possibly with extra time, and maybe a bizarre lottery called a "penalty shoot-out"). The World Cup finals competition is usually considered to be the most widely viewed sporting event in the world, surpassing even the Olympic Games. Not unexpectedly, forecasting is rampant beforehand, and national pride is at stake in many countries.

At the end of each competition FIFA provides an ordering of the teams (from first to last) based on their success in the finals. It is this ordering for 2010 that we wish to forecast, not just the overall winner.

For comparison, I will consider four different "professional" forecasts, out of the dozens that were available before the competition:

  • Ian Hale's statistical forecast model, published in Engineering and Technology Magazine August 2010, available at
  • the BET365 bookmaker odds, taken from Hale's paper
  • the consensus (average) of 26 bookmaker odds, taken from the working paper of Christoph Leitner, Achim Zeileis & Kurt Hornik, available at
  • La Informacion, a Spanish newspaper at

Furthermore, in spite of everything I have said so far, there are actually two quality rankings of the national soccer teams available. Given what I have said above, this might be appropriate for tennis, golf or chess (all of which also have such rankings) but it may not be useful for football. These rankings come from:

  • FIFA itself, at
  • Elo Rating System, at

I will compare these six forecasts using what statisticians call the coefficient of determination. This number indicates the proportional success of the forecast, measured as the amount of variation accounted for (see Wikipedia), varying from 0 (random forecast) to 1 (perfect forecast). Here are the results:

0.13  Ian Hale
0.14  La Informacion
0.14  FIFA order
0.25  BET365
0.26  consensus bookmakers
0.33  Elo score
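For readers who want to see how a "variation accounted for" number of this kind can be computed, here is a minimal sketch. It assumes the statistic is the squared correlation between the forecast and the actual finishing positions, which is one standard way to obtain it; the post does not give the exact calculation, and the six-team data below are invented purely for illustration.

```python
def r_squared(forecast, actual):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(forecast)
    if n != len(actual) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    cov = sum((f - mean_f) * (a - mean_a) for f, a in zip(forecast, actual))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_f == 0 or var_a == 0:
        raise ValueError("each sequence must contain some variation")
    return cov * cov / (var_f * var_a)


# Hypothetical forecast and actual finishing positions for six teams.
forecast_positions = [1, 2, 3, 4, 5, 6]
actual_positions = [2, 1, 4, 3, 6, 5]
print(round(r_squared(forecast_positions, actual_positions), 2))
```

A value of 1 means the forecast order matched the final order exactly, while values near 0 mean the forecast carried almost no information about the outcome.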

These numbers are uniformly low, indicating that forecasting of soccer results is not a successful activity, in general, no matter how you do it. This is the main thesis of this blog post — forecasting competitions where the events terminate based on time cannot be very successful. Even the bookies had only a 25% success rate at forecasting the outcome of the competition, while the FIFA ranking was almost no use at all.

The most successful forecast came from the Elo (quality) score, which indicates that about 1/3 of the variation in the outcome of the competition (but not necessarily of each game) was accounted for by the teams' quality (or superiority), as determined by their recent results. The graph for the data is shown below. The low coefficient of determination is caused by the very poor performance of the highly ranked French (ranked 9th, finished 29th) and Italian (ranked 5th, finished 26th) teams, and also by the lack of success of the very highly ranked Brazilian (ranked 1st, finished 6th) and English (ranked 4th, finished 13th) teams.

I have further investigated the results for the FIFA and Elo rankings by checking the result of using them to forecast the outcome of the recent UEFA Euro 2012 soccer championship. The coefficients of determination for this competition are:

0.23  FIFA order
0.22  Elo score

These values are in the middle of the range shown above for the World Cup, so the results seem to be general.

I thus conclude that being the best soccer team in recent games has only 25-30% influence on the outcome of the next series of games. Or, forecasting soccer results will be only 25-30% successful, if you base the forecast on the outcome of recent games. You already suspected that, of course, but in science we like to put numbers on our suspicions, which is what I have tried to do here. What I wonder, though, is how the bookmakers deal with this situation, since their livelihood depends on successful forecasting.

I will finish this post with what I think was the most interesting forecast for the 2010 World Cup. It was wrong, but only just — Germany officially finished third.



Mud sticks, especially if you are Gregor Mendel

Aug 03 2012 Published by under Uncategorized

As noted in my last post, Gregor Mendel is famous for two things: being ignored until long after he was dead, and allegedly fudging his data. In this post I will cover the second topic, having already covered the first one. Basically, Mendel has been accused of "adjusting" some of his data to produce a too perfect result. Here, I will try to explain the situation.

Mendel studied pea plants that vary in characteristics such as their height, the color and shape of their seeds, etc (see the picture). By counting the proportions of these characters in several generations of plants, he concluded that these features must derive from paired copies of what we now call genes. He conducted a series of experiments, seven of which are of relevance to the discussion here.

The pea characters that Mendel used. Shown are the characters in the parents (P) and in the offspring when different parents are cross-fertilized (F1).

Mendelian genetics describes and explains the way in which genes are passed from parents to offspring. It makes specific predictions about the genetic makeup of the offspring compared to the parents. However, in biology, there is usually a great deal of random variation around any one set of predictions, so that the predictions apply only "on average". This means that any one sample of offspring may or may not be close to the average. As the sample size gets larger, we should get closer and closer to the prediction. That is, the effect of random variation gets smaller as the sample size gets larger.
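How quickly the random variation shrinks can be sketched with the standard error of a proportion. The example assumes simple binomial sampling, with each offspring independently showing the dominant character with probability 3/4; the sample sizes are arbitrary illustrations rather than Mendel's own.

```python
from math import sqrt


def standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion under binomial sampling."""
    return sqrt(p * (1 - p) / n)


# Expected spread of the observed dominant proportion around 0.75.
for n in (100, 1000, 10000):
    print(n, round(standard_error(0.75, n), 4))
```

Under this assumption the spread falls with the square root of the sample size, so four times as many plants only halves the expected deviation from the predicted proportion.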

However, this is apparently not what Gregor Mendel himself observed when working with his peas (Mendel 1866). The results he reported from his experiments (they are all summarized by Fairbanks and Rytting 2001) seem much closer to the predicted values than would be expected given his sample sizes (i.e. number of pea plants). This was emphasized by Fisher (1936) based on his reconstruction of Mendel's experiments, but it had first been noted by Weldon (1902), who said: "if the experiments were repeated a hundred times, we should expect to get a worse result about 95 times".

To evaluate this claim, a scientist can do one or more of three things. First, they could undertake a mathematical analysis of the situation, which is what Weldon did. Edwards (1986) has made the most thorough evaluation of all of Mendel's data based on statistical theory, and concluded that there really are unusually close fits of the data to the predictions. Second, a scientist could bypass the statistics, and conduct computer simulations of the actual experiments. Novitski (1995) had a go at this for some of Mendel's experiments, and also concluded that there is unusual closeness of the results to the theory.

Third, one could compare Mendel's results to the results achieved by other people who tried to repeat his experiments. As a specific example, Mendel crossed pea plants having seeds that were either yellow or green in color (see the picture above), and reported 6,022 yellow seeds and 2,001 green seeds among the offspring, for a ratio of 3.009:1 when the theory predicts 3:1. This seems remarkably close. For comparison, Sinnott and Dunn (1925) described attempts by six other plant breeders to repeat Mendel’s experiments between 1900 and 1909, reporting a combined total of 134,707 yellow seeds and 44,692 green seeds in the offspring. This is a ratio of 3.014:1, which is slightly worse than Mendel achieved. So, Mendel’s results are the closer of the two to the theoretical expectation, in spite of the fact that the second set of data is based on a sample size that is 22 times larger than his. Almost all of Mendel’s results are like this — consistently closer to the theoretical expectations than his sample sizes warrant (eg. in another experiment he got 5,474 round seeds and 1,850 wrinkled seeds = 2.96:1).
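The ratios quoted above are easy to check, and the closeness of each data set to 3:1 can be put on a common scale with a Pearson chi-square statistic. Using that particular statistic is an assumption made for this sketch; it is not necessarily the analysis used by the authors cited. The seed counts are the ones given in the text.

```python
def ratio_and_chi_square(dominant: int, recessive: int, expected_ratio: float = 3.0):
    """Return the observed dominant:recessive ratio and the Pearson
    chi-square statistic against an expected_ratio:1 split."""
    total = dominant + recessive
    expected_dominant = total * expected_ratio / (expected_ratio + 1)
    expected_recessive = total - expected_dominant
    chi_square = (
        (dominant - expected_dominant) ** 2 / expected_dominant
        + (recessive - expected_recessive) ** 2 / expected_recessive
    )
    return dominant / recessive, chi_square


datasets = {
    "Mendel, yellow vs green": (6022, 2001),
    "Six later breeders, yellow vs green": (134707, 44692),
    "Mendel, round vs wrinkled": (5474, 1850),
}
for label, (dom, rec) in datasets.items():
    ratio, chi_square = ratio_and_chi_square(dom, rec)
    print(f"{label}: ratio {ratio:.3f}:1, chi-square {chi_square:.3f}")
```

A smaller chi-square means a closer fit to 3:1. On this measure Mendel's yellow and green counts fit more closely than the pooled replications, even though the latter involve far more seeds.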

However, this is actually not the worst accusation against Mendel. There is one set of experiments in which Mendel's results are not only unexpectedly close to his claimed expectation but this expectation seems to be wrong. In other words, Mendel is too close to the wrong number!

This argument was first presented by Fisher (1936), although his criticism did not receive much attention until the mid 1960s (the centenary of Mendel's work). For this set of experiments Mendel needed to grow the seeds of the offspring of his cross-fertilizations (F1 in the picture above) in order to test his predictions about heredity. Technically, he had a prediction about how many homozygotes and how many heterozygotes there will be among the F1 offspring — he expected twice as many heterozygotes as homozygotes (this is explained in Wikipedia).

The problem he had was that he could identify the homozygotes correctly but not necessarily the heterozygotes. The latter will produce two types of plants if we grow their seeds (hence their name, heterozygote) while the former will produce only one type of plant (see the next picture). What Mendel had to do was grow the seeds from each F1 offspring and see how many types of plant he got — one or two? If he got two then the original offspring was a heterozygote and if he got only one then it was a homozygote. So, the key to his problem is that he had to find the second type of plant from among his seeds, which he thought would occur only 1/4 of the time (see the picture below). There was therefore a chance that he would miss the second type of plant when checking any one heterozygote, and he would thus wrongly conclude that it was a homozygote.

An example of Mendel's experiments, in which he started by cross-fertilizing different parents (P) to produce the F1 offspring, and then self-fertilized these (to form the F2) to find out how many plant types they produced.

The problem is this: how many seeds (from each F1 plant) should Mendel check in order to decide how many types of plant he gets (in the F2)?

This is equivalent to asking: how many times should we roll a die to decide whether it has a 6 on it or not? If we roll it, say, 20 times and do not observe a 6, are we justified in concluding that there isn't one on the die? Or do we need to roll it 30 times? Or 100 times?

We can evaluate this situation using modern statistical theory; which is fortunate, because this is precisely the sort of thing we need to be able to do when designing scientific experiments. In Mendel's case of looking for a plant that has a 1/4 chance of being observed, the theory indicates what is in this table for a group of 400 plants (which is what Mendel had):

The theory thus indicates that if we wish to make no mistakes (ie. less than 1 offspring plant scored wrongly) then we need a sample size of 22 seeds from each plant. Alternatively, if we will accept a mistake rate of 1% then we could get away with growing only 16 seeds (but we will wrongly record c. 4 plants out of the 400 as being homozygotes not heterozygotes).
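The calculation behind figures like these can be sketched in a few lines. It assumes that each seed grown from a heterozygote independently has a 1/4 chance of revealing the second plant type, so the chance of missing it in every one of n seeds is (3/4) to the power n; the same function handles the die analogy, with a 1/6 chance per roll. The function names are illustrative only.

```python
def miss_probability(p_show: float, n_checked: int) -> float:
    """Chance of never seeing an outcome that has probability p_show
    on each of n_checked independent trials."""
    return (1 - p_show) ** n_checked


def expected_misscored(n_plants: int, p_show: float, n_checked: int) -> float:
    """Expected number of heterozygotes wrongly scored as homozygotes."""
    return n_plants * miss_probability(p_show, n_checked)


# Peas: 400 heterozygous plants, second plant type expected in 1/4 of seeds.
for seeds in (10, 16, 22):
    rate = miss_probability(0.25, seeds)
    missed = expected_misscored(400, 0.25, seeds)
    print(f"{seeds} seeds: error rate {rate:.2%}, about {missed:.1f} of 400 mis-scored")

# Die: chance of seeing no 6 at all in 20, 30, or 100 rolls.
for rolls in (20, 30, 100):
    print(f"{rolls} rolls: {miss_probability(1 / 6, rolls):.2e}")
```

With 10 seeds per plant the per-plant error rate comes out at a little over 5%, and with 16 seeds at about 1%, in line with the figures quoted in the text.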

Unfortunately, Mendel says he scored 10 seeds per plant. (He probably had an average of about 30 seeds per plant that he could have used.) His error rate is thus at least 5%, as shown in the above table. The consequence for his experiments is shown in the next table, which indicates the number of plants that Mendel recorded for each of his experiments. (Note, he had 6 experiments with 100 F1 plants, for each of which he checked 10 seeds.)

Note that the theoretical expectation is that Mendel will record c. 23 offspring incorrectly, by random chance, thus recording only 377 heterozygotes instead of 400. However, Mendel recorded 399, which is very close to his own expectation of 400. (Note that it is clear Mendel expected 400 heterozygotes, because he repeated experiment 5 when he observed what he considered to be an unexpectedly small number of heterozygotes.) This number of 399 is just within the 95% confidence interval for the theoretical expectation, which takes into account how much random variation we can expect (see Wikipedia). Similarly, a 1-tailed t-test of the statistical hypothesis that the discrepancy between the observed and theoretical results of the six experiments is zero yields p=0.054.
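One way to see why 399 sits just inside a 95% interval is sketched below. The post does not say how its interval was calculated, so this is an assumed reconstruction: each of the 600 plants is treated as independently being recorded as a heterozygote with probability (2/3) x (1 - 0.75^10), and a normal approximation to the binomial is used for the interval.

```python
from math import sqrt


def recorded_heterozygote_interval(n_plants: int = 600, seeds_checked: int = 10, z: float = 1.96):
    """Expected number of plants recorded as heterozygotes, with an
    approximate 95% interval (normal approximation to the binomial)."""
    p_true_heterozygote = 2 / 3                 # Mendelian expectation among the plants tested
    p_detected = 1 - 0.75 ** seeds_checked      # chance the second plant type shows up
    p_recorded = p_true_heterozygote * p_detected
    mean = n_plants * p_recorded
    sd = sqrt(n_plants * p_recorded * (1 - p_recorded))
    return mean, mean - z * sd, mean + z * sd


mean, low, high = recorded_heterozygote_interval()
print(f"expected {mean:.1f}, approximate 95% interval {low:.1f} to {high:.1f}")
```

Under these assumptions the expected count is about 377, and the upper end of the interval falls just above Mendel's reported 399, which is consistent with the description of his result as being only just within the interval.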

So, Mendel's results deviate strongly from the theoretical expectation, but not statistically "too far", at least by the modern convention of p<0.05. Nevertheless, one of Fisher's daughters, Joan Box (1978), remembered her father regarding this deviation as both "abominable" and "shocking", although in his published paper he merely described it as "a serious and almost inexplicable discrepancy".

Fisher (1936) himself considered several possible explanations for this unexpected result, but found none of them satisfactory. Novitski (2004a) and Novitski (2004b) have provided an alternative explanation, but Hartl and Fairbanks (2007) found fault with this. In turn, Hartl and Fairbanks provided yet another alternative, but it does not seem very convincing, either. It seems endless, doesn't it? Other aspects of the so-called "Mendel-Fisher controversy" are discussed in the book by Franklin et al. (2008), who ambitiously claim to be ending the controversy.

Sociologically, it is difficult to know what to make of this situation, especially given Mayr’s (1982) comment that: “the internal evidence as well as everything we know about Mendel’s painstaking and conscientious procedure make it quite evident that no deliberate falsification is involved.”

Moore (1993) emphasizes that Mendel’s publication is, in fact, the unchanged text of two lectures given to the Natural History Society of Brno (in February and March 1865), rather than being a formal presentation of experimental results. The lectures were designed to arouse his audience’s interest in his new method of experimenting with hybrids, as well as presenting his theoretical explanation of the results. (Apparently, in the first lecture he described his observations and experimental results, while in the second he offered his explanation for them.) Consequently, only some of the data are presented, and he could therefore be expected to report as examples those results that fitted the theory most closely (as Mendel himself put it, the experiments he discussed “lead most easily and surely to the goal”). For example, we know from his correspondence that in parts of his experiments with smaller sample sizes Mendel got ratios varying a long way from the predictions.

Possible alternative explanations include: (i) elimination of the more deviant results, due to a belief that the experiments had failed because of interference from foreign pollen; (ii) repetition of particular experiments until the results approached the predicted ratio, without realizing the bias thus introduced; and (iii) a lack of understanding of the importance of the random variation indicated in Mendel’s heredity theory itself, either by Mendel himself or by over-zealous assistants in the monastery garden where he worked. Mayr (1982) refutes this latter one for Mendel himself, pointing out that he had a good training in and understanding of statistical fluctuations (unlike many of his contemporaries).

Rather oddly, Kohn (1986) suggests that Mendel should be forgiven for having overly precise data whatever the explanation, since the unexpected results are at least in the direction of our current genetic theory, rather than being fraudulently incorrect.


Joan Box (1978) R.A. Fisher: the Life of a Scientist. Wiley, New York.

A.W.F. Edwards (1986) Are Mendel’s results really too close? Biological Reviews 61: 295–312.

Daniel J. Fairbanks, Bryce Rytting (2001) Mendelian controversies: a botanical and historical review. American Journal of Botany 88: 737–752.

R.A. Fisher (1936) Has Mendel's work been rediscovered? Annals of Science 1: 115–137.

Allan Franklin, A.W.F. Edwards, Daniel J. Fairbanks, Daniel L. Hartl, Teddy Seidenfeld (2008) Ending the Mendel-Fisher Controversy. University of Pittsburgh Press.

Daniel L. Hartl, Daniel J. Fairbanks (2007) Mud sticks: on the alleged falsification of Mendel’s data. Genetics 175: 975–979.

Alexander Kohn (1986) False Prophets: Fraud and Error in Science and Medicine. Basil Blackwell, Oxford.

Ernst Mayr (1982) The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press, Cambridge MA.

Gregor Mendel (1866) Versuche über Pflanzen-Hybriden. Verhandlungen des Naturforschenden Vereines in Brünn 4: 3–47.

John A. Moore (1993) Science as a Way of Knowing: the Foundations of Modern Biology. Harvard University Press, Cambridge MA.

E. Novitski (2004a) On Fisher's criticism of Mendel's results with the garden pea. Genetics 166: 1133–1136.

Charles E. Novitski (1995) A closer look at some of Mendel’s results. Journal of Heredity 86: 61–66.

Charles E. Novitski (2004b) Revision of Fisher’s analysis of Mendel’s garden pea experiments. Genetics 166: 1139–1140.

Edmund W. Sinnott, L.C. Dunn (1925) Principles of Genetics. McGraw-Hill, New York.

W.F.R. Weldon (1902) Mendel's laws of alternative inheritance. Biometrika 1: 228–254.



Gregor Mendel and the art of mis-communication

Aug 02 2012 Published by under Uncategorized

In this blog post I will expand the realm of my comments beyond phylogenetics, and consider verbal mis-communication in another part of biology.

Johann Mendel (he assumed the name Gregor upon becoming a religious novice) is famous within biology for two things: allegedly fudging his data, and being ignored until long after he was dead. I will discuss both of these things here, but it is the latter one that I particularly want to talk about today.

Much has been written about Mendel's work being ignored, with all sorts of reasons being proposed. Here, I want to point out something about Mendel's paper that is relevant to my theme of mis-communication in science, which I think contributed to its fate.

Many of you might also know that Mendel has been accused of adjusting some of his data to produce a "too perfect" result. I will return to this point in the next post.

Gregor Mendel may be one of the most famous biologists to have never made a direct contribution to biology. I say this because his published work in the field of genetics (of which he is considered to be the founder) was ignored until it was independently replicated 35 years later, separately by Carl Correns, Hugo de Vries and Erich von Tschermak (although much doubt has been cast on the role of the latter two people). So, if Mendel had never existed then the field of genetics today might be no different to what it is.

He published only one major scientific paper about what we now call Mendelian genetics. (Most of his research work was on bee keeping and meteorology!) This paper reported the results of a series of plant cross-fertilization experiments involving varieties of garden peas (Pisum sativum). This was published in 1866 in Proceedings of the Natural History Society of Brno, where Mendel also published his meteorological work. It lay dormant there until 16 years after his death. This was in spite of the journal’s availability in the libraries of at least 115 institutions throughout Europe (although such journals were apparently not widely read at the time) and the availability and distribution of a series of 40 reprints.

We know that Mendel did much more work than this between 1856 (when he started the crossing experiments) and 1871 (he was appointed abbot of his monastery in 1868, and thus administrative duties came to occupy most of his time). Although his voluminous notes and manuscripts were burned (either late in his life or after his death), his correspondence with the botanist Carl Nägeli has survived (see Stern and Sherwood 1966 for an English translation). Mendel confirmed his pea results using at least twelve other plant species, although these results were never published, as well as trying inter-species crosses in beans (Phaseolus vulgaris), which are briefly mentioned in his published paper. He had considerable difficulties when examining species of hawkweeds (Hieracium), some of the results of which were reported in his only other (short) publication on hybrids (in 1870). Mendel made no other known attempts to publicize his work, either in writing or at scientific meetings.

Mendel’s work on cross-fertilization is known to have been cited in the published writings of other biologists only about a dozen times before 1900, at which time his main conclusions were independently reached by at least two other botanists working on different species. Hugo de Vries published his work through both the Academy of Sciences in Paris and the German Botanical Society, while Carl Correns published his through the German Botanical Society. These works became generally known almost immediately. However, both authors acknowledged that they had encountered Mendel’s publications after reaching their own conclusions (although some doubt has been cast on de Vries' claim), and so Mendel’s work then became more widely known, even if it didn’t have any direct effect at the time of its original publication.

The obvious question is this: why was the published work ignored? There are many possible answers. MacRoberts (1985) has suggested that Mendel's work was not so much neglected as unknown, while Weinstein (1977) has noted that reports of its existence were widely available, even if the paper was not consulted. Moreover, its conclusions were important to some people, notably Charles Darwin. Darwin tried unsuccessfully to explain inheritance through his theory of pangenesis, and Mendel's ideas (if Darwin had known of them and accepted them) would have saved him the trouble.

The answer I discuss here is: Mendel mis-communicated with his readers by presenting his work in what was an unusual way for his time. My thesis is this: Mendel was obviously a Galilean while presenting himself as being a Baconian. To explain this statement, I need to give you a bit of background.

Historically, there are four types of experiments that can be recognized (their names have been suggested by Medawar 1979):

Aristotelian — these are contrived experiments intended to demonstrate the truth of an idea. The experiment is forced to fit the preconceived proposition, which usually comes from a particular philosophical view of the universe, and the data are adjusted to fit the theory if necessary (rather than the other way around). This type of experimental procedure was the most common one for most of the recorded history of science.

Baconian — these are natural experiments, where things are simply observed as they really are. The idea is to avoid prejudice and preconception and thus to see reality for what it actually is. Since we might spend a whole lifetime without ever witnessing the particular combination of events that would reveal the truth in any one instance, we are allowed to stretch our experience by manipulating the world in arbitrary ways, to see what happens. Most developments in biology have initially been via this natural-history type of experiment.

Galilean — these are experiments by ordeal, where the world is manipulated in such a way as to discriminate between competing views about how it functions. The idea is to explicitly recognize several alternative hypotheses about how the world operates, and then to deduce the consequences if each of these hypotheses were true. Since these hypotheses won’t all predict the same consequences, it should be possible to manipulate the world in such a way that some of their predictions can be refuted. This is the modern type of experiment.

Kantian — these are thought experiments, which try to overcome the fact that the way we experience the world is patterned by our own sensory perceptions. The idea is to create alternative theoretical universes that do not fit our ordinary perceptions, and then to test whether they are consistent with the world as we actually experience it. Most developments in physics in the past century have initially been via these thought experiments, as are most modern computer simulations.

In the 1800s the most common version of science was the Baconian one, and this is how Mendel presented his work. He wrote: "The object of the experiment was to observe these variations in the case of each pair of differentiating characters, and to deduce the law according to which they appear in successive generations." In other words, he says he did the experiments first, and only later worked out an explanation for the results. Observation comes first and understanding second.

However, Fisher (1936) pointed out that this claim is not compatible with what Mendel actually presented in his paper. Fisher carefully reconstructed Mendel's work from the descriptions given in the paper, and concluded that there can "be no doubt whatever that his report is to be taken entirely literally, and that his experiments were carried out in just the way and much in the order that they are recounted." However, this conclusion led Fisher to a contradiction, because the experiments actually make no sense unless: "Mendel had a good understanding of the factorial system, and the frequency ratios which constitute his laws of inheritance, before he carried out the experiments reported in his paper." In other words, understanding must have come first and observation second, with the observations intended to test whether the understanding was correct.

Fisher reached this conclusion based on two things: (i) Mendel left out information that seems to be essential in order to understand why he did things the way he did; and (ii) Mendel failed to present or analyze a lot of the data that he claimed to have, which he would have done if he was searching for an understanding of the data. That is, Mendel had already worked out his ideas, and did precisely what was needed to confirm them, neither more nor less. As Fisher noted, he could have done more with the data he had, but apparently saw no reason to do so.

The same garden today.

This has several consequences for Mendel's paper, all of which involve mis-communication, which is the idea that I am presenting in this blog post. Mendel mis-communicated because he failed to realize how other people would see his work. He failed to communicate in their "language", and say things in a way that allowed them to immediately understand just how different his work actually was from what had been done before.

First, the substance of Mendel's paper is a hypothesis test and yet it is presented as a search for information. The content does not match the packaging. The packaging comes from the 19th century while the content comes from the 20th century. As such, the scientists of neither century were likely to make head or tail of it. One has to understand Mendelian genetics in order to appreciate the content, and yet the paper presents itself as being a search for Mendelian genetics!

Once you understand Mendelian genetics, it is easy to see why Mendel designed his experiments as he did, but not necessarily otherwise. This is how Correns and de Vries came across his work — they had already worked out Mendelian inheritance before reading Mendel's paper and thus understood it. For a modern audience, Mendel would need to write his paper with the theory first, and then everything would follow logically. As Fisher noted: "his experimental programme becomes intelligible as a carefully planned demonstration of his conclusions." In modern parlance, "conclusions" would be better expressed as "predictions", as far as experiments are concerned.

There is nothing especially unusual about Mendel's ideas. However, it did take a conceptual leap to put all of the components together. One has to accept that: inheritance is particulate; genetic material is structural and comes in pairs; and each parent contributes equally. This leads to segregation and independent assortment, which are the key components of Mendelian genetics (see Wikipedia). However, instead of presenting this idea and then testing it, Mendel presented it as a deduction from his results. As such, it is not very convincing.

Second, Mendel leaves out information that would be considered essential for a hypothesis test. His choices of what to explain and what to leave unexplained form a characteristic pattern throughout his paper; and they are rather idiosyncratic by modern standards.

For example, why did he choose Pisum sativum as his test species? This is important, because this is still considered to be one of the best species to work with for Mendelian genetics. There are many varieties of it available (Mendel used 22), so that it is possible to pick plants that differ only in one, two or three characters (as needed); and these characters are each controlled by only one gene, so that they segregate and assort independently. In addition, the plants are easy to cultivate and they grow quickly. So, how did Mendel realize that he had stumbled upon an ideal species for study? One obvious suggestion is that he had his idea first, and then looked for the right species to experiment upon.

The types of things that Mendel explained in detail, compared to those he left unexplained, are noteworthy. As another example, he gave a number of gardening details that are of practical but not theoretical importance, and yet he did not explain how or why he chose precisely the seven characters that he did (out of the 15 possibilities he listed in the paper), which surely is of great scientific importance (since each just happens to be controlled by one gene).

Third, Mendel called his work "hybridization" when it is really about heredity. This follows his predecessors' terminology, but it creates several sources of confusion. First, Mendel was doing something completely new but he failed to point this out, by trying to phrase things as though they came from the past. Second, he confused the difference between inter- and intra-species variation. He said that this is merely a matter of degree, and the distinction does not matter; but it actually matters vitally for his work. In modern terms, Mendel was doing "experimental breeding" within a single species, where the individuals are very similar to each other and inter-breed naturally. Other people were doing "hybridization of species", which differ in a large number of factors and do not normally inter-breed. Mendelian genetics is relatively easy to demonstrate within a species but not between species, and so Mendel's work was actually a significant advance over his predecessors.

Fourth, Mendel failed to distinguish alternative hypotheses that would explain his data, as would be required by a true Galilean experiment. In particular, blending inheritance could also produce his results, and this was the prevailing idea during his lifetime. Mendel contented himself with noting that, although hybrids tend to look intermediate between their parents overall, each individual character on its own is the same as one parent or the other. This is insufficient grounds for dismissing blending. Mendelian genetics predicts specific ratios of characters in the offspring of the crosses, such as 2:1, 3:1, 9:3:3:1 etc, depending on the circumstances. Blending inheritance does not predict any particular ratios at all, and so it does not exclude any either. So, how does observing 2:1 or 3:1 support Mendel versus blending? These ratios are compatible with both hypotheses! Mendel failed to address this issue.
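For readers who want to see where ratios such as 3:1 and 9:3:3:1 come from, here is a minimal sketch that simply enumerates the possible offspring of a cross. It assumes paired factors, an equal contribution from each parent, independent assortment and complete dominance of one form over the other. The function names and the allele letters are purely illustrative and do not come from Mendel's paper.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes from a genotype given as a list of allele pairs, e.g. ["Aa", "Bb"]."""
    # One allele is drawn from each pair; pairs are assumed to assort independently.
    return list(product(*genotype))


def phenotype_ratio(parent1, parent2):
    """Count offspring phenotypes, assuming the uppercase allele is fully dominant."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # A character shows the dominant form if either inherited allele is uppercase.
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(g1, g2)
        )
        counts[phenotype] += 1
    return counts


# Hybrid x hybrid for one character, then for two characters at once.
mono = phenotype_ratio(["Aa"], ["Aa"])
di = phenotype_ratio(["Aa", "Bb"], ["Aa", "Bb"])
print(sorted(mono.values(), reverse=True))  # [3, 1]
print(sorted(di.values(), reverse=True))    # [9, 3, 3, 1]
```

Note that the ratios appear only because the assumptions are stated up front, which is the order of reasoning that Fisher attributed to Mendel.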

Mendel's argument here is that the proper way to arrive at a law governing hybrids is to investigate the behavior of specific characters of the hybrids, rather than considering the form of the plant as a whole. It was this decision to look at single characters of plant crosses that distinguishes Mendel's experiments from those of his predecessors. Blending inheritance looks at all characters simultaneously, rather than one character at a time. However, Mendel merely stated his new approach as an unquestioned assumption at the start of his paper. He couldn't really expect to get away with this! If you are going to dismiss as irrelevant the work of all of your predecessors, you had better have a very strong argument. Otherwise, they have no good reason to accept your conclusions over their own.

Fifth, Mendel failed to cite important predecessors. Most importantly, he never referred to Carl Nägeli, and Nägeli never cited Mendel in his own work. However, Mendel did recognize Nägeli as an important scientist in his field, since they corresponded by mail. It is therefore ironic that the only clues we have to Mendel's thoughts about his work are from the letters that he wrote to Nägeli, as all of Mendel's own papers (including Nägeli's replies to Mendel) were destroyed.

Finally, Mendel did not present himself as much concerned about demonstrating the precision or consistency of his results. He presented data that confirm the ratios that he was expecting and then stopped. Moreover, his arguments were entirely statistical. He never observed exactly the ratios that he claimed (eg. a ratio of 2.9:1 was called 3:1), which we interpret as due to random variation. However, this only works if you are already expecting 3:1. If you have no prior expectation, then how can you know that the difference between 2.9 and 3.0 is random variation rather than vitally important? Mendel, however, never addressed this question.
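The point about needing a prior expectation can be illustrated with a goodness-of-fit calculation. This is a rough sketch only: the chi-square test is a later tool that Mendel did not have, and the counts used here (290 and 100, giving a ratio of 2.9:1) are hypothetical numbers chosen for illustration, not Mendel's data.

```python
def chi_square_3_to_1(dominant, recessive):
    """Chi-square statistic for observed counts against an expected 3:1 ratio."""
    total = dominant + recessive
    expected = (0.75 * total, 0.25 * total)
    observed = (dominant, recessive)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Standard 5% critical value for a chi-square distribution with 1 degree of freedom.
CRITICAL_5_PERCENT_DF1 = 3.841

# Hypothetical counts with an observed ratio of 2.9:1.
stat = chi_square_3_to_1(290, 100)
consistent = stat < CRITICAL_5_PERCENT_DF1
print(round(stat, 4), consistent)  # 0.0855 True
```

The calculation cannot even be started without the expected 3:1 ratio as an input, which is the sense in which the comparison presupposes the theory.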

Also, he repeatedly referred to data that he had but did not present (Fairbanks and Rytting 2001 list much of the work Mendel did to collect the unpublished data), and these data could easily have been used to further investigate Mendelian inheritance. So why did he not analyze them? If he was searching for patterns in his data, as he claimed, this seems remiss. He didn't even bother to test his two basic ideas: that both parents contribute equally to inheritance, and that characters are inherited independently. And yet he had data that would test these! Mendel treated these as unquestioned assumptions, rather than as deductions from his work. This is not the behavior of a Baconian, as Mendel claimed to be.

So, Mendel's work was done as though it was part of the early 20th century but was presented as though it was part of the early 19th century. It was thus a strange hybrid that didn't fit easily with the practitioners of either group. If Mendel was writing for a modern audience then he needed to explain more of his background knowledge; and if he was writing for an earlier audience then he should have done more with his data. He was ahead of his time while trying to be a part of his time, and it didn't really work out for him.


R.A. Fisher (1936) Has Mendel's work been rediscovered? Annals of Science 1: 115-137.

Michael H. MacRoberts (1985) Was Mendel's paper on Pisum neglected or unknown? Annals of Science 42: 339-345.

Peter B. Medawar (1979) Advice to a Young Scientist. Harper & Row, New York.

Gregor Mendel (1866) Versuche über Pflanzen-Hybriden. Verhandlungen des Naturforschenden Vereines in Brünn 4: 3–47.

Gregor Mendel (1870) Ueber einige aus künstlicher Befruchtung gewonnenen Hieracium-Bastarde. Verhandlungen des Naturforschenden Vereines in Brünn 8: 26–31.

Curt Stern, Eva R. Sherwood (eds) (1966) The Origin of Genetics: a Mendel Source Book. W.H. Freeman, San Francisco.

Alexander Weinstein (1977) How unknown was Mendel's paper? Journal of the History of Biology 10: 341-364.



Why does a cow have four legs?

Aug 01 2012 Published by under Uncategorized

I think that I have talked enough in these posts about variational and transformational evolution, and so today I will write something about how phylogenies affect the way biologists work as scientists. Why is the concept of evolutionary history so important in biology?

In one sense, science is about problem solving. We set out to find the answer to a particular question, and in order to do so we have to solve the puzzle posed by that question. So, a relevant point is this: Is problem solving different in the biological sciences compared to the physical sciences (eg. physics and chemistry)? My answer is "yes and no".

To me, there is a way in which they are all the same, and this involves performing some sort of experiment to test any ideas we have that might solve the puzzle. So, as I noted in an earlier post, we all attempt to explain natural phenomena in terms of other natural phenomena.

However, in another way biology is different from physics. First, in biology we have to deal with the fact that organisms respond to their environment, which other physical objects do not do. If I drop a stone then it falls downward, and it will do so no matter where I am on earth. However, if I release a migrating bird it will adjust its flight depending on where it is released. This complicates the study of biology. Second, organisms pass information between generations, via their genes. There is no equivalent concept of inherited information in physics.

The consequence of these differences is that in biology unique historical events can have effects that last for millions of years. Something that happened to one of my ancestors thousands or millions of years ago can still affect me now, because the information about that event has been passed down to me in the genes that I have inherited from that ancestor. There may be no other evidence of that past event except in my genes. No physicist has to deal with this concept — in physics, those past events that have an effect now do so by leaving observable traces in the environment. The laws of physics that operated back then are still operating now, and we can therefore study them now.

This ultimately means that cause and effect can be separated in biology by a great deal of time. The explanation for the natural phenomena that I see now may be a long time in the past. For example, the cause may no longer be important in the modern world but it was in the past. How do I know about its importance back then? The cause may no longer be apparent in the modern world, so maybe I don't even know about its existence. Under these circumstances, even imagining the cause may be difficult. Perhaps most importantly, the cause may be an historical “accident” — a one-off event that has not happened since. We believe that modern biology is actually the result of a whole series of historically unique accidents!

How do I study natural phenomena under these circumstances? This is where a phylogeny comes into play. We use phylogenies as the framework for studying biodiversity, because they take into account the historical component of biological studies. In biology we use them to describe the natural phenomena, to explain them, and to predict them.

Consider the question posed in the title: Why does a cow have four legs? This is a question about explanation, not description or prediction. So, merely describing the legs of a cow is insufficient to answer it, and predicting how many legs the next cow will have is also irrelevant (although it may be interesting in its own right!).

A physicist might attempt to answer this question in terms of balance and stability, particularly while the cow is moving. The idea is that four legs are stable while allowing the animal to graze, walk or run. This tries to explain the presence of four legs in terms of what we know about the number of legs on other objects in the modern world, and how stable they are while moving.

For example, we know that a 3-legged object can keep its balance while stationary, but three legs is very awkward for moving. (It is usually suggested that a 3-legged object would have to rotate itself to walk, alternately standing on each of its legs.)

Alternatively, we know that two legs is okay for walking and running, since we do that ourselves; and if you are Swedish then you are well aware that at least one cow walks on two legs! We also know that insects have six legs and spiders have eight legs, so these are okay for balanced movement as well.

This is Mamma Mu (on the left) and Kråkan (on the right).

However, a biologist would not approach the question in this way. A biologist knows that a cow has four legs because it inherited that characteristic from its parents, both of whom also had four legs. Furthermore, those parents inherited the characteristic from their parents, and so on. Therefore, to a biologist the explanation for four legs may have little to do with cows in the modern world. The cows have a set of genes inherited from their ancestors, and it is those genes that cause them to have four legs (rather than some other number). There may be no particular relevance to having four legs (as opposed to two or six) in the modern world — modern organisms have characteristics because they inherited them, and not necessarily because they need them.

This starts a search backwards in time, through a series of ancestors in the phylogeny, searching for the one who first acquired four legs. The answer to the question about cows' legs then becomes a question about why that ancestor had four legs when its ancestors did not.

Thus, all cows have four legs, and so we conclude that the common ancestor of their species, Bos taurus, also had four legs. Indeed, all cow-like organisms have four legs, and so the common ancestor of the genus Bos had four legs. Furthermore, all bovine organisms (cattle, bison, buffalo, yaks, etc) have four legs, and so the common ancestor of the subfamily Bovinae had four legs. Continuing, we work our way backwards through the common ancestors of (respectively) the Ruminantia, Cetartiodactyla, Laurasiatheria and Mammalia, all four of which we conclude had four legs.

Eventually we come to the common ancestor of the superclass Tetrapoda, which also had four legs. (After all, that is what the name says — tetra = four, pod = limb). Not all descendants of this ancestor still retain four legs in the modern world, of course. Primates have modified the front pair of legs into arms, birds and bats have modified them into wings, whales (and dolphins and porpoises) and seals (and sea lions and walruses) have modified their front legs into flippers and greatly reduced their hind legs, manatees and dugongs have front flippers but no hind legs at all, and snakes have lost all four legs almost entirely. Still, the common ancestor of all tetrapods had four legs.

The part of the phylogeny involving the origin of tetrapods is shown in the next diagram. There are four groups of organisms involved.

The part of the phylogeny involving the origin of tetrapods.

The main point about this phylogeny, for our question, is that modern lungfishes and modern tetrapods all have four limbs, while modern coelacanths and other fish do not. The number of limbs is:

Tetrapod = 4 legs + 1 tail
Lungfish = 4 lobe-fins + unpaired tailfin
Coelacanth = 7 lobe-fins + paired tailfin
Ray-finned fish (most fish) = 7 ray-fins + paired tailfin.

Each fin on a fish is designed to perform a specific function, as shown in the next picture. (NB. Some fish also have a small adipose fin behind the dorsal fin, for stability.) Note that pectoral and pelvic fins come in pairs, one on each side of the body. The picture provides the physical explanation for why fish have so many fins, in terms of balance and stability while moving in water.

So, modern lungfish all have fleshy, paired pectoral and pelvic fins and a single unpaired caudal fin. The other fins of most fishes are absent from lungfish. Modern tetrapods have muscular, paired pectoral and pelvic limbs and a single tail. So, lungfish and tetrapods have the same number of fins/limbs in the same places. We conclude that this arrangement has been inherited from their common ancestor, which is therefore the one we are looking for to answer our question. So, what did the common ancestor of these four-limbed lungfish and tetrapods look like?

We are sometimes incorrectly told that: "Scientists have long known that ancient lungfish species are the ancestors of the tetrapods." This idea is largely discredited today. Lungfish are the closest extant relatives of tetrapods, not their ancestors. That is, they are not primitive — they are well adapted to their natural environment. However, lungfish are one of many relict species of fish that share many ancestral characters.

So, did the common ancestor look something like this?


Or more like this?


Probably neither, because we think that both of these fossil species were descendants of the common ancestor in question. But it was presumably more similar to these than to either modern lungfish or modern tetrapods.

So, the initial question about cows' legs becomes: why did the common ancestor of the tetrapods and lungfish reduce their number of fins from seven to four (and no less)? The answer I was given, by the fish biologist W.J.R. (Pim) Lanzing, when I was a student, was that this is the minimum number of fins that can maintain stability and movement in an aquatic environment.
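The reasoning that places the change from seven fins to four in the common ancestor of lungfish and tetrapods can be sketched as a small parsimony reconstruction. This is only an illustration: it uses Fitch's parsimony method, which is my choice of technique rather than anything specified above, and the tree shape is an assumption (lungfish and tetrapods as closest relatives, as stated above, with coelacanths next and ray-finned fish outermost). The fin counts are the ones listed earlier.

```python
def bottom_up(node, states, sets):
    """Fitch pass 1: collect the candidate states for each node, leaves first."""
    if isinstance(node, str):
        sets[node] = {states[node]}
        return sets[node]
    left, right = node
    a = bottom_up(left, states, sets)
    b = bottom_up(right, states, sets)
    # Keep the shared states if there are any, otherwise keep all of them.
    sets[node] = (a & b) or (a | b)
    return sets[node]


def top_down(node, parent_state, sets, assigned):
    """Fitch pass 2: fix one state per node, preferring the parent's state."""
    options = sets[node]
    # At the root there is no parent, so an arbitrary candidate (the smallest) is taken.
    state = parent_state if parent_state in options else min(options)
    assigned[node] = state
    if not isinstance(node, str):
        for child in node:
            top_down(child, state, sets, assigned)


# Assumed tree shape, written as nested pairs.
sister = ("lungfish", "tetrapod")
lobe_finned = ("coelacanth", sister)
root = ("ray-finned fish", lobe_finned)

# Number of fins or limbs, not counting the tail.
fin_counts = {"ray-finned fish": 7, "coelacanth": 7, "lungfish": 4, "tetrapod": 4}

sets, assigned = {}, {}
bottom_up(root, fin_counts, sets)
top_down(root, None, sets, assigned)

print(assigned[root], assigned[lobe_finned], assigned[sister])  # 7 7 4
```

Under these assumptions a single change, on the branch leading to the common ancestor of lungfish and tetrapods, is enough to explain the counts seen in all four modern groups.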

Most of you know roughly how fish move, and how agile they are in water, so I don't need to describe it. However, locomotion of coelacanths is unique to their kind. They have high maneuverability and can orient their bodies in almost any direction in the water. They have apparently been seen doing headstands and swimming belly up. Lungfish, on the other hand, are essentially sedentary. They are reputed to be sluggish and inactive, but still capable of rapid escape movements using their tail. They can use their paired fins to "walk" underwater, with alternating movements — first one fin moves forward, then the other. They can also use the fins simultaneously to move forward with a lunging action. So, apparently one is not agile in water when one only has four fins, but one can still move around efficiently.

The first tetrapods are thought to have evolved in coastal and brackish marine environments, and in shallow and swampy freshwater habitats. They used their modified, limb-like fins to get around in the water, as do modern lungfish. That is, the origin of legs wasn’t a transformation that happened on land. Limbs were an aquatic innovation that just so happened to be advantageous when tetrapods began to venture out of the water.

So, why does a cow have four legs? Our phylogeny reveals that this is because one of its ancestors (c. 360 million years ago) was aquatic, and four fins is the minimum in liquid. Most descendants have never changed this number, but they did use it to leave the water and live on land. The first part of this answer is a biological one (about ancestry), although the second part is a physical one (about stability and movement).

This is not the sort of answer that can be determined by an experiment. So, scientific problem-solving in this situation is quite different, making biology distinct from physics.

It is sometimes said that physics lies at the heart of science because all explanations ultimately involve physics. That is, the physics explanation lies beyond the biological one — the biological explanation is a proximal one while the physics explanation is the ultimate explanation. This may be so to a physicist, but I do not see why it should be so to anyone else. The biological explanation (in terms of ancestry) is equal to the physical one (in terms of aquatic balance), rather than inferior to it. Moreover, there are explanations beyond the ones that the physicists consider, although that takes us outside science.

In finishing, you might like to try this physics question, instead: Why are tables generally made with four legs and scientific instruments with three?
