As noted in my last post, Gregor Mendel is famous for two things: being ignored until long after he was dead, and allegedly fudging his data. Having already covered the first topic, in this post I will cover the second. In short, Mendel has been accused of "adjusting" some of his data to produce a too-perfect fit to his theory. Here, I will try to explain the situation.
Mendel studied pea plants that vary in characteristics such as their height, the color and shape of their seeds, etc (see the picture). By counting the proportions of these characters in several generations of plants, he concluded that these features must derive from paired copies of what we now call genes. He conducted a series of experiments, seven of which are of relevance to the discussion here.
Mendelian genetics describes and explains the way in which genes are passed from parents to offspring. It makes specific predictions about the genetic makeup of the offspring compared to the parents. However, in biology there is usually a great deal of random variation around any one set of predictions, so that the predictions apply only "on average". This means that any one sample of offspring may or may not be close to the average. As the sample size gets larger, we should get closer and closer to the prediction. That is, the effect of random variation gets smaller as the sample size gets larger.
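This shrinking effect of random variation is easy to see with a quick simulation (a sketch of my own, not any published analysis): draw offspring that each have a 3/4 chance of showing the dominant character, and watch the observed ratio settle toward 3:1 as the sample grows.

```python
import random

random.seed(1)

def observed_ratio(n_offspring: int) -> float:
    """Simulate one experiment: each offspring shows the dominant
    character with probability 3/4; return the dominant:recessive ratio."""
    dominant = sum(random.random() < 0.75 for _ in range(n_offspring))
    return dominant / (n_offspring - dominant)

# Small samples scatter widely around 3:1; large samples hug it closely.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:6d}: observed ratio {observed_ratio(n):.3f} : 1")
```

With 100 offspring the ratio can easily come out anywhere from about 2:1 to 4:1; with 100,000 it rarely strays more than a few hundredths from 3:1.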
However, this is apparently not what Gregor Mendel himself observed when working with his peas (Mendel 1866). The results he reported from his experiments (they are all summarized by Fairbanks and Rytting 2001) seem much closer to the predicted values than would be expected given his sample sizes (i.e. number of pea plants). This was emphasized by Fisher (1936) based on his reconstruction of Mendel's experiments, but it had first been noted by Weldon (1902), who said: "if the experiments were repeated a hundred times, we should expect to get a worse result about 95 times".
To evaluate this claim, a scientist can do one or more of three things. First, they could undertake a mathematical analysis of the situation, which is what Weldon did. Edwards (1986) has made the most thorough evaluation of all of Mendel's data based on statistical theory, and concluded that there really are unusually close fits of the data to the predictions. Second, a scientist could bypass the statistics, and conduct computer simulations of the actual experiments. Novitski (1995) had a go at this for some of Mendel's experiments, and also concluded that there is unusual closeness of the results to the theory.
Third, one could compare Mendel's results to the results achieved by other people who tried to repeat his experiments. As a specific example, Mendel crossed pea plants having seeds that were either yellow or green in color (see the picture above), and reported 6,022 yellow seeds and 2,001 green seeds among the offspring, for a ratio of 3.009:1 when the theory predicts 3:1. This seems remarkably close. For comparison, Sinnott and Dunn (1925) described attempts by six other plant breeders to repeat Mendel’s experiments between 1900 and 1909, reporting a combined total of 134,707 yellow seeds and 44,692 green seeds in the offspring. This is a ratio of 3.014:1, which is slightly further from 3:1 than Mendel's own result. So, Mendel’s results are the closer of the two to the theoretical expectation, in spite of the fact that the second set of data is based on a sample size that is 22 times larger than his. Almost all of Mendel’s results are like this — consistently closer to the theoretical expectations than his sample sizes warrant (e.g. in another experiment he got 5,474 round seeds and 1,850 wrinkled seeds = 2.96:1).
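One way to put a number on "remarkably close" is a chi-square goodness-of-fit statistic against the 3:1 expectation; smaller values mean a closer fit. This is my own back-of-envelope calculation, not one taken from the papers cited, but it makes the comparison concrete:

```python
def chi_square_3to1(dominant: int, recessive: int) -> float:
    """Chi-square goodness-of-fit statistic against a 3:1 ratio."""
    total = dominant + recessive
    exp_dom, exp_rec = total * 3 / 4, total / 4
    return ((dominant - exp_dom) ** 2 / exp_dom
            + (recessive - exp_rec) ** 2 / exp_rec)

mendel = chi_square_3to1(6022, 2001)      # Mendel's yellow vs green seeds
repeats = chi_square_3to1(134707, 44692)  # the six later breeders combined

print(f"Mendel:  chi-square = {mendel:.3f}")   # about 0.015
print(f"Repeats: chi-square = {repeats:.3f}")  # about 0.740
```

The statistic takes sample size into account, which is why the replicators' seemingly similar ratio scores so much worse. A chi-square of about 0.015 (with one degree of freedom) means that roughly 90% of honest repetitions would be expected to fit the 3:1 prediction less well than Mendel did, which is essentially Weldon's point.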
However, this is actually not the worst accusation against Mendel. There is one set of experiments in which Mendel's results are not only unexpectedly close to his claimed expectation but this expectation seems to be wrong. In other words, Mendel is too close to the wrong number!
This argument was first presented by Fisher (1936), although his criticism did not receive much attention until the mid-1960s (the centenary of Mendel's work). For this set of experiments Mendel needed to grow the seeds of the offspring of his cross-fertilizations (F1 in the picture above) in order to test his predictions about heredity. Technically, he had a prediction about how many homozygotes and how many heterozygotes there would be among the F1 offspring — he expected twice as many heterozygotes as homozygotes (this is explained in Wikipedia).
The problem he had was that he could identify the homozygotes correctly but not necessarily the heterozygotes. The latter will produce two types of plants if we grow their seeds (hence their name, heterozygote) while the former will produce only one type of plant (see the next picture). What Mendel had to do was grow the seeds from each F1 offspring and see how many types of plant he got — one or two? If he got two then the original offspring was a heterozygote and if he got only one then it was a homozygote. So, the key to his problem is that he had to find the second type of plant from among his seeds, which he thought would occur only 1/4 of the time (see the picture below). There was therefore a chance that he would miss the second type of plant when checking any one heterozygote, and he would thus wrongly conclude that it was a homozygote.
The problem is this: how many seeds (from each F1 plant) should Mendel check in order to decide how many types of plant he gets (in the F2)?
This is equivalent to asking: how many times should we roll a die to decide whether or not it has a 6 on it? If we roll it, say, 20 times and do not observe a 6, are we justified in concluding that there isn't one on the die? Or do we need to roll it 30 times? Or 100 times?
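For the die analogy the arithmetic is simple (a quick illustration of my own): if the die does carry a 6, the chance of missing it in every one of n rolls is (5/6)^n.

```python
# Probability of never rolling a 6 in n rolls of a fair die that has one.
for n in (20, 30, 100):
    p_miss = (5 / 6) ** n
    print(f"{n:3d} rolls: probability of never seeing a 6 = {p_miss:.4%}")
```

So 20 rolls without a 6 still leaves about a 2.6% chance of wrongly concluding that the die has none; at 30 rolls the risk drops to about 0.4%, and at 100 rolls it is negligible.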
We can evaluate this situation using modern statistical theory, which is fortunate, because this is precisely the sort of thing we need to be able to do when designing scientific experiments. In Mendel's case of looking for a plant that has a 1/4 chance of being observed, the theory indicates what is in this table for a group of 400 plants (which is what Mendel had):
The theory thus indicates that if we wish to make no mistakes (i.e. less than 1 offspring plant scored wrongly) then we need a sample size of 22 seeds from each plant. Alternatively, if we will accept a mistake rate of 1% then we could get away with growing only 16 seeds (but we will wrongly record c. 4 plants out of the 400 as being homozygotes rather than heterozygotes).
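The numbers in the table follow directly from the binomial logic: a heterozygote fails to reveal itself only if all n of its seeds happen to show the dominant type, which has probability (3/4)^n. A quick check (my own calculation of the same quantities):

```python
def miss_rate(n_seeds: int) -> float:
    """Chance that a heterozygote shows no recessive-type plant
    among n seeds grown from it, so it gets scored as a homozygote."""
    return 0.75 ** n_seeds

for n in (10, 16, 22):
    p = miss_rate(n)
    print(f"{n:2d} seeds: miss rate {p:.2%}, "
          f"expected wrongly-scored plants out of 400: {400 * p:.1f}")
```

This reproduces the figures above: 22 seeds per plant brings the expected number of mistakes among 400 heterozygotes below one, 16 seeds gives about four, and 10 seeds (Mendel's choice, as it turns out) gives about 22.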
Unfortunately, Mendel says he scored 10 seeds per plant. (He probably had an average of about 30 seeds per plant that he could have used.) His error rate is thus at least 5%, as shown in the above table. The consequence for his experiments is shown in the next table, which indicates the number of plants that Mendel recorded for each of his experiments. (Note that he had 6 experiments with 100 F1 plants, for each of which he checked 10 seeds.)
Note that the theoretical expectation is that Mendel would record c. 23 offspring incorrectly, by random chance, thus recording only 377 heterozygotes instead of 400. However, Mendel recorded 399, which is very close to his own expectation of 400. (Note that it is clear Mendel expected 400 heterozygotes, because he repeated experiment 5 when he observed what he considered to be an unexpectedly small number of heterozygotes.) This number of 399 is just within the 95% confidence interval for the theoretical expectation, which takes into account how much random variation we can expect (see Wikipedia). Similarly, a 1-tailed t-test of the statistical hypothesis that the discrepancy between the observed and theoretical results of the six experiments is zero yields p = 0.054.
So, Mendel's results deviate strongly from the theoretical expectation, but not statistically "too far", at least by the modern convention of p<0.05. Nevertheless, one of Fisher's daughters, Joan Box (1978), remembered her father regarding this deviation as both "abominable" and "shocking", although in his published paper he merely described it as "a serious and almost inexplicable discrepancy".
Fisher (1936) himself considered several possible explanations for this unexpected result, but found none of them satisfactory. Novitski (2004a) and Novitski (2004b) have provided an alternative explanation, but Hartl and Fairbanks (2007) found fault with this. In turn, Hartl and Fairbanks provided yet another alternative, but it does not seem very convincing, either. It seems endless, doesn't it? Other aspects of the so-called "Mendel-Fisher controversy" are discussed in the book by Franklin et al. (2008), who ambitiously claim to be ending the controversy.
Sociologically, it is difficult to know what to make of this situation, especially given Mayr’s (1982) comment that: “the internal evidence as well as everything we know about Mendel’s painstaking and conscientious procedure make it quite evident that no deliberate falsification is involved.”
Moore (1993) emphasizes that Mendel’s publication is, in fact, the unchanged text of two lectures given to the Natural History Society of Brno (in February and March 1865), rather than being a formal presentation of experimental results. The lectures were designed to arouse his audience’s interest in his new method of experimenting with hybrids, as well as to present his theoretical explanation of the results. (Apparently, in the first lecture he described his observations and experimental results, while in the second he offered his explanation for them.) Consequently, only some of the data are presented, and he could therefore be expected to report as examples those results that fitted the theory most closely (as Mendel himself put it, the experiments he discussed “lead most easily and surely to the goal”). For example, we know from his correspondence that in parts of his experiments with smaller sample sizes Mendel got ratios varying a long way from the predictions.
Possible alternative explanations include: (i) elimination of the more deviant results, due to a belief that the experiments had failed because of interference from foreign pollen; (ii) repetition of particular experiments until the results approached the predicted ratio, without realizing the bias thus introduced; and (iii) a lack of understanding of the importance of the random variation indicated in Mendel’s heredity theory itself, either by Mendel himself or by over-zealous assistants in the monastery garden where he worked. Mayr (1982) rejects this last possibility for Mendel himself, pointing out that he had a good training in and understanding of statistical fluctuations (unlike many of his contemporaries).
Rather oddly, Kohn (1986) suggests that Mendel should be forgiven for having overly precise data whatever the explanation, since the unexpected results are at least in the direction of our current genetic theory, rather than being fraudulently incorrect.
Joan Box (1978) R.A. Fisher: the Life of a Scientist. Wiley, New York.
A.W.F. Edwards (1986) Are Mendel’s results really too close? Biological Reviews 61: 295–312.
Daniel J. Fairbanks, Bryce Rytting (2001) Mendelian controversies: a botanical and historical review. American Journal of Botany 88: 737–752.
R.A. Fisher (1936) Has Mendel's work been rediscovered? Annals of Science 1: 115–137.
Allan Franklin, A.W.F. Edwards, Daniel J. Fairbanks, Daniel L. Hartl, Teddy Seidenfeld (2008) Ending the Mendel-Fisher Controversy. University of Pittsburgh Press.
Daniel L. Hartl, Daniel J. Fairbanks (2007) Mud sticks: on the alleged falsification of Mendel’s data. Genetics 175: 975–979.
Alexander Kohn (1986) False Prophets: Fraud and Error in Science and Medicine. Basil Blackwell, Oxford.
Ernst Mayr (1982) The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press, Cambridge MA.
Gregor Mendel (1866) Versuche über Pflanzen-Hybriden. Verhandlungen des Naturforschenden Vereines im Brünn 4: 3–47.
John A. Moore (1993) Science as a Way of Knowing: the Foundations of Modern Biology. Harvard University Press, Cambridge MA.
Charles E. Novitski (1995) A closer look at some of Mendel’s results. Journal of Heredity 86: 61–66.
E. Novitski (2004a) On Fisher's criticism of Mendel's results with the garden pea. Genetics 166: 1133–1136.
Charles E. Novitski (2004b) Revision of Fisher’s analysis of Mendel’s garden pea experiments. Genetics 166: 1139–1140.
Edmund W. Sinnott, L.C. Dunn (1925) Principles of Genetics. McGraw-Hill, New York.
W.F.R. Weldon (1902) Mendel's laws of alternative inheritance. Biometrika 1: 228–254.