Mud sticks, especially if you are Gregor Mendel

Published Aug 03 2012 under Uncategorized

As noted in my last post, Gregor Mendel is famous for two things: being ignored until long after he was dead, and allegedly fudging his data. In this post I will cover the second topic, having already covered the first one. Basically, Mendel has been accused of "adjusting" some of his data to produce a too perfect result. Here, I will try to explain the situation.

Mendel studied pea plants that vary in characteristics such as their height, the color and shape of their seeds, etc (see the picture). By counting the proportions of these characters in several generations of plants, he concluded that these features must derive from paired copies of what we now call genes. He conducted a series of experiments, seven of which are of relevance to the discussion here.

The pea characters that Mendel used. Shown are the characters in the parents (P) and in the offspring when different parents are cross-fertilized (F1).

Mendelian genetics describes and explains the way in which genes are passed from parents to offspring. It makes specific predictions about the genetic makeup of the offspring compared to the parents. However, in biology, there is usually a great deal of random variation around any one set of predictions, so that the predictions apply only "on average". This means that any one sample of offspring may or may not be close to the average. As the sample size gets larger, the observed result should get closer and closer to the prediction. That is, the effect of random variation gets smaller as the sample size gets larger.
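To make the sample-size point concrete, here is a minimal Python sketch. It assumes each offspring is an independent draw with a 3/4 chance of showing the dominant character. That binomial model is an illustration added here, and the function name is hypothetical. It computes the expected scatter (standard deviation) of the observed proportion, which shrinks with the square root of the sample size.

```python
import math

def proportion_sd(sample_size, p_dominant=0.75):
    """Standard deviation of the observed dominant proportion (binomial model, an assumption)."""
    return math.sqrt(p_dominant * (1 - p_dominant) / sample_size)

# A hundredfold increase in sample size cuts the expected scatter tenfold.
for n in (100, 1000, 10000):
    print(f"n = {n:>6}: expected scatter around 0.75 is about +/- {proportion_sd(n):.4f}")
```

So a sample of 100 plants can easily land a few percentage points away from 75%, while a sample of 10,000 should usually land within about half a percentage point.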

However, this is apparently not what Gregor Mendel himself observed when working with his peas (Mendel 1866). The results he reported from his experiments (they are all summarized by Fairbanks and Rytting 2001) seem much closer to the predicted values than would be expected given his sample sizes (i.e. number of pea plants). This was emphasized by Fisher (1936) based on his reconstruction of Mendel's experiments, but it had first been noted by Weldon (1902), who said: "if the experiments were repeated a hundred times, we should expect to get a worse result about 95 times".

To evaluate this claim, a scientist can do one or more of three things. First, they could undertake a mathematical analysis of the situation, which is what Weldon did. Edwards (1986) has made the most thorough evaluation of all of Mendel's data based on statistical theory, and concluded that there really are unusually close fits of the data to the predictions. Second, a scientist could bypass the statistics, and conduct computer simulations of the actual experiments. Novitski (1995) had a go at this for some of Mendel's experiments, and also concluded that there is unusual closeness of the results to the theory.

Third, one could compare Mendel's results to the results achieved by other people who tried to repeat his experiments. As a specific example, Mendel crossed pea plants having seeds that were either yellow or green in color (see the picture above), and reported 6,022 yellow seeds and 2,001 green seeds among the offspring, for a ratio of 3.009:1 when the theory predicts 3:1. This seems remarkably close. For comparison, Sinnott and Dunn (1925) described attempts by six other plant breeders to repeat Mendel’s experiments between 1900 and 1909, reporting a combined total of 134,707 yellow seeds and 44,692 green seeds in the offspring. This is a ratio of 3.014:1, which is slightly worse than Mendel achieved. So, Mendel’s results are the closer of the two to the theoretical expectation, in spite of the fact that the second set of data is based on a sample size that is 22 times larger than his. Almost all of Mendel’s results are like this: consistently closer to the theoretical expectations than his sample sizes warrant (e.g. in another experiment he got 5,474 round seeds and 1,850 wrinkled seeds = 2.96:1).
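The arithmetic in this comparison can be checked with a short Python sketch. The seed counts are the ones quoted above. The chi-square statistic is my own choice of a measure that takes sample size into account (it is not necessarily the one used by the authors cited), and the function names are hypothetical.

```python
def ratio(dominant, recessive):
    """Observed dominant:recessive ratio, expressed as x:1."""
    return dominant / recessive

def chi_square_3_to_1(dominant, recessive):
    """Chi-square statistic for the fit of two counts to a 3:1 expectation."""
    total = dominant + recessive
    expected_dominant = 0.75 * total
    expected_recessive = 0.25 * total
    return ((dominant - expected_dominant) ** 2 / expected_dominant
            + (recessive - expected_recessive) ** 2 / expected_recessive)

datasets = {
    "Mendel (yellow vs green)": (6022, 2001),
    "Six later breeders, combined": (134707, 44692),
}

for label, (yellow, green) in datasets.items():
    print(f"{label}: ratio {ratio(yellow, green):.3f}:1, "
          f"chi-square {chi_square_3_to_1(yellow, green):.3f}")
```

A smaller chi-square means a closer fit relative to the sample size. On this measure Mendel's single experiment fits the 3:1 expectation much more tightly than the far larger combined data set, although one close fit on its own proves nothing; it is the consistency across experiments that raised eyebrows.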

However, this is actually not the worst accusation against Mendel. There is one set of experiments in which Mendel's results are not only unexpectedly close to his claimed expectation but this expectation seems to be wrong. In other words, Mendel is too close to the wrong number!

This argument was first presented by Fisher (1936), although his criticism did not receive much attention until the mid 1960s (the centenary of Mendel's work). For this set of experiments Mendel needed to grow the seeds of the offspring of his cross-fertilizations (F1 in the picture above) in order to test his predictions about heredity. Technically, he had a prediction about how many homozygotes and how many heterozygotes there would be among the F1 offspring: he expected twice as many heterozygotes as homozygotes (this is explained in Wikipedia).

The problem he had was that he could identify the homozygotes correctly but not necessarily the heterozygotes. The latter will produce two types of plants if we grow their seeds (hence their name, heterozygote) while the former will produce only one type of plant (see the next picture). What Mendel had to do was grow the seeds from each F1 offspring and see how many types of plant he got — one or two? If he got two then the original offspring was a heterozygote and if he got only one then it was a homozygote. So, the key to his problem is that he had to find the second type of plant from among his seeds, which he thought would occur only 1/4 of the time (see the picture below). There was therefore a chance that he would miss the second type of plant when checking any one heterozygote, and he would thus wrongly conclude that it was a homozygote.

An example of Mendel's experiments, in which he started by cross-fertilizing different parents (P) to produce the F1 offspring, and then self-fertilized these (to form the F2) to find out how many plant types they produced.

The problem is this: how many seeds (from each F1 plant) should Mendel check in order to decide how many types of plant he gets (in the F2)?

This is equivalent to asking: how many times should we roll a die to decide whether it has a 6 on it or not? If we roll it, say, 20 times and do not observe a 6, are we justified in concluding that there isn't one on the die? Or do we need to roll it 30 times? Or 100 times?
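As a worked version of the analogy, assume a fair six-sided die with exactly one face showing a 6, and independent rolls (these assumptions are mine, for illustration). The chance of rolling n times without ever seeing the 6 is then:

```latex
P(\text{no 6 in } n \text{ rolls}) = \left(\tfrac{5}{6}\right)^{n}, \qquad
\left(\tfrac{5}{6}\right)^{20} \approx 0.026, \quad
\left(\tfrac{5}{6}\right)^{30} \approx 0.0042, \quad
\left(\tfrac{5}{6}\right)^{100} \approx 1.2 \times 10^{-8}
```

So after 20 rolls we would still miss an existing 6 about one time in 40. More rolls make a mistake rarer, but never impossible.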

We can evaluate this situation using modern statistical theory, which is fortunate, because this is precisely the sort of thing we need to be able to do when designing scientific experiments. In Mendel's case of looking for a plant that has a 1/4 chance of being observed, the theory indicates what is in this table for a group of 400 plants (which is what Mendel had):

The theory thus indicates that if we wish to make no mistakes (i.e. less than 1 offspring plant scored wrongly) then we need a sample size of 22 seeds from each plant. Alternatively, if we will accept a mistake rate of 1% then we could get away with growing only 16 seeds (but we will wrongly record c. 4 plants out of the 400 as homozygotes rather than heterozygotes).
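The figures quoted here follow from a simple calculation, sketched below in Python. It assumes that each seed from a heterozygote independently has a 1/4 chance of showing the second plant type, so a heterozygote is mis-scored only if all the seeds checked miss it. The function names are hypothetical.

```python
def miss_probability(seeds_checked, p_second_type=0.25):
    """Chance that none of the checked seeds shows the second plant type."""
    return (1 - p_second_type) ** seeds_checked

def expected_misscored(n_plants, seeds_checked, p_second_type=0.25):
    """Expected number of heterozygotes wrongly recorded as homozygotes."""
    return n_plants * miss_probability(seeds_checked, p_second_type)

# Sample sizes discussed in the text, for a group of 400 heterozygous plants.
for seeds in (10, 16, 22):
    print(f"{seeds:>2} seeds per plant: miss rate {miss_probability(seeds):.2%}, "
          f"about {expected_misscored(400, seeds):.1f} of 400 plants mis-scored")
```

With 10 seeds per plant the miss rate is about 5.6%, or roughly 23 plants out of 400; with 16 seeds it is about 1%, or roughly 4 plants; with 22 seeds the expected number of mis-scored plants drops below 1.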

Unfortunately, Mendel says he scored 10 seeds per plant. (He probably had an average of about 30 seeds per plant that he could have used.) His error rate is thus at least 5%, as shown in the above table. The consequence for his experiments is shown in the next table, which indicates the number of plants that Mendel recorded for each of his experiments. (Note, he had 6 experiments with 100 F1 plants, for each of which he checked 10 seeds.)

Note that the theoretical expectation is that Mendel will record c. 23 offspring incorrectly, by random chance, thus recording only 377 heterozygotes instead of 400. However, Mendel recorded 399, which is very close to his own expectation of 400. (Note that it is clear Mendel expected 400 heterozygotes, because he repeated experiment 5 when he observed what he considered to be an unexpectedly small number of heterozygotes.) This number of 399 is just within the 95% confidence interval for the theoretical expectation, which takes into account how much random variation we can expect (see Wikipedia). Similarly, a 1-tailed t-test of the statistical hypothesis that the discrepancy between the observed and theoretical results of the six experiments is zero yields p=0.054.

So, Mendel's results deviate strongly from the theoretical expectation, but not statistically "too far", at least by the modern convention of p<0.05. Nevertheless, one of Fisher's daughters, Joan Box (1978), remembered her father regarding this deviation as both "abominable" and "shocking", although in his published paper he merely described it as "a serious and almost inexplicable discrepancy".

Fisher (1936) himself considered several possible explanations for this unexpected result, but found none of them satisfactory. Novitski (2004a) and Novitski (2004b) have provided an alternative explanation, but Hartl and Fairbanks (2007) found fault with this. In turn, Hartl and Fairbanks provided yet another alternative, but it does not seem very convincing, either. It seems endless, doesn't it? Other aspects of the so-called "Mendel-Fisher controversy" are discussed in the book by Franklin et al. (2008), who ambitiously claim to be ending the controversy.

Sociologically, it is difficult to know what to make of this situation, especially given Mayr’s (1982) comment that: “the internal evidence as well as everything we know about Mendel’s painstaking and conscientious procedure make it quite evident that no deliberate falsification is involved.”

Moore (1993) emphasizes that Mendel’s publication is, in fact, the unchanged text of two lectures given to the Natural History Society of Brno (in February and March 1865), rather than being a formal presentation of experimental results. The lectures were designed to arouse his audience’s interest in his new method of experimenting with hybrids, as well as presenting his theoretical explanation of the results. (Apparently, in the first lecture he described his observations and experimental results, while in the second he offered his explanation for them.) Consequently, only some of the data are presented, and he could therefore be expected to report as examples those results that fitted the theory most closely (as Mendel himself put it, the experiments he discussed “lead most easily and surely to the goal”). For example, we know from his correspondence that in parts of his experiments with smaller sample sizes Mendel got ratios varying a long way from the predictions.

Possible alternative explanations include: (i) elimination of the more deviant results, due to a belief that the experiments had failed because of interference from foreign pollen; (ii) repetition of particular experiments until the results approached the predicted ratio, without realizing the bias thus introduced; and (iii) a lack of understanding of the importance of the random variation indicated in Mendel’s heredity theory itself, either by Mendel himself or by over-zealous assistants in the monastery garden where he worked. Mayr (1982) refutes the last of these for Mendel himself, pointing out that he had a good training in and understanding of statistical fluctuations (unlike many of his contemporaries).

Rather oddly, Kohn (1986) suggests that Mendel should be forgiven for having overly precise data whatever the explanation, since the unexpected results are at least in the direction of our current genetic theory, rather than being fraudulently incorrect.


Joan Box (1978) R.A. Fisher: the Life of a Scientist. Wiley, New York.

A.W.F. Edwards (1986) Are Mendel’s results really too close? Biological Reviews 61: 295–312.

Daniel J. Fairbanks, Bryce Rytting (2001) Mendelian controversies: a botanical and historical review. American Journal of Botany 88: 737–752.

R.A. Fisher (1936) Has Mendel's work been rediscovered? Annals of Science 1: 115–137.

Allan Franklin, A.W.F. Edwards, Daniel J. Fairbanks, Daniel L. Hartl, Teddy Seidenfeld (2008) Ending the Mendel-Fisher Controversy. University of Pittsburgh Press.

Daniel L. Hartl, Daniel J. Fairbanks (2007) Mud sticks: on the alleged falsification of Mendel’s data. Genetics 175: 975–979.

Alexander Kohn (1986) False Prophets: Fraud and Error in Science and Medicine. Basil Blackwell, Oxford.

Ernst Mayr (1982) The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press, Cambridge MA.

Gregor Mendel (1866) Versuche über Pflanzen-Hybriden. Verhandlungen des Naturforschenden Vereines in Brünn 4: 3–47.

John A. Moore (1993) Science as a Way of Knowing: the Foundations of Modern Biology. Harvard University Press, Cambridge MA.

E. Novitski (2004a) On Fisher's criticism of Mendel's results with the garden pea. Genetics 166: 1133–1136.

Charles E. Novitski (1995) A closer look at some of Mendel’s results. Journal of Heredity 86: 61–66.

Charles E. Novitski (2004b) Revision of Fisher’s analysis of Mendel’s garden pea experiments. Genetics 166: 1139–1140.

Edmund W. Sinnott, L.C. Dunn (1925) Principles of Genetics. McGraw-Hill, New York.

W.F.R. Weldon (1902) Mendel's laws of alternative inheritance. Biometrika 1: 228–254.


8 responses so far

  • Sam says:

    I have to wonder, while this is an interesting analysis for statistical students to learn how to detect experimental bias, do we care if Mendel fudged his results?

    Right? We aren't talking about leading genetic theory. If we discover Mendel was dishonest, the current genetic theories aren't going to change (we've gone far beyond Mendel in the last hundred and fifty years).

    • Sam, I suspect that people should care even if they don't. The extreme version of fudging is fraud, and there is no boundary between them. The book by Kohn that I cite has an interesting coverage of this topic. If people are being fraudulent to appear to be better than they are, then Mendel would become a role model, which would be a pity, both for him and us. So, I think that story is educational as well as interesting. /David

  • Jim Thomerson says:

    Even back in the '90s there was some movement to realign the teaching of introductory genetics by starting out with DNA, in contrast to the more usual historical story technique. I read on another blog, somewhere, about an attempt to design an introductory course based on current genetics. They were surprised to find that their course was not able to keep up. I suspect there are already a fair number of molecular geneticists, who have come into their speciality from other than a standard (?) biology curriculum, who have never heard of Gregor Mendel. With the rapid expansion of biological knowledge, I suspect that the historical story method of teaching will become rarer and rarer. I used to argue, with some seriousness, with my cutting edge colleagues, that an hypothesis should have survived for at least five years before being incorporated into an undergraduate course.

    • Jim, I, also, have noted a lack of interest in "history" in the allegedly cutting edge disciplines. My own experience is that this means that they re-invent the wheel repeatedly, often badly. In teaching, history is often seen as getting in the way, although one's own personal/professional history is a good source of examples for lectures, which is often appreciated by the students. One thing to be said for science blogs is that it is quite common for them to cover all the things that are left out of the textbooks, such as historical figures and incidents. This is all to the good, as far as I can see. /David

  • Isabel says:

    So Mendel was way ahead of his time, both in coming up with the idea of how genetics works even before he did the experiments, and then in his experimental approach (to a degree that he didn't even comprehend himself if I understood the previous post correctly). So where did all this come from? Are there any clues to how he came up with all these ideas? Also, how he came to decide to break "blending" into discrete traits, and how he figured out which traits to choose for the peas?

    • Isabel, Yes, you have it right — I think that Mendel was ahead of his time in several ways. As for where this came from, that is the "Mendel enigma". We know so little about him, because his papers were all burnt, at his own request. This makes him a rich source of speculation, and some very interesting attempts to reconstruct what he might have done and meant. The break with blending inheritance is the most obvious of his advances, and maybe this simply came from looking at characters singly, which does make it obvious in many cases that the characters stay the same but they end up in different combinations in the offspring. Possibly the most intriguing thing is why he studied heredity in the first place, when meteorology was the bulk of his science, where (as far as I know) he did nothing new at all. /David

  • Jim Thomerson says:

    I think I got this idea from the Mendel Museum (under construction when I visited in 1991). Mendel had some background in physics, where the idea is that interactions of a few simple things produce complex results, and that the simple things are unchanged by the interactions. So, perhaps, he expected inheritance to be the result of interaction of a couple of simple factors, unchanged by the interaction. Makes a good story, anyway.

    • Jim, It certainly makes a good story, but it also makes a lot of sense, at least to me. I moved from physics into biology very early on, but my approach to biology still reflects a lot of that physics background. From that point of view, particularly given the philosophy of the middle 1800s, Mendel's approach becomes quite comprehensible. It shows what a bit of lateral thinking does: biologists sometimes get too caught up in natural history to see the science! I must visit that museum one day. /David