More on Beer-Drinking Scientists: A Response to Dr. Grim

In my last two posts on this blog, I commented on a recent scientific publication:

Tomáš Grim, “A possible role of social activity to explain differences in publication output among ecologists”, Oikos, OnlineEarly Articles, 8-Feb-2008.

The article found a correlation between the weekly drinking habits of Czech avian ecologists and their scientific publication output. I was pleased to see that Dr. Grim, the author of this study, commented on both of my posts. Here, I will address most of his comments and give a further critique of his paper.

I complained in my first post that correlation does not imply causation. Dr. Grim correctly pointed out that “‘causes’ or ‘causation’ does not appear in my paper. I am not that stupid.” He then suggested that I “read more carefully, please.” But I have read his paper very carefully. And while Dr. Grim is most definitely smart and refrained from using the term ‘causation’, his theory of causation is evident throughout the paper. He states in his paper that “human cognitive performance during and after drinking is decreased”, that it “significantly decrease[s] cooperativeness”, and that the “effects of alcohol use are well known to decrease mental and working performance in general”. Dr. Grim offers no other theory to explain the observed correlation (even though there are many), and in fact his idea of causation is evident from the title of his paper. His recent comments on this blog also confirm his adherence to one causation theory, adding that hangovers impede productivity (even though his survey provided no evidence of drunkenness or hangovers on the part of the participants). Thus, my complaint that Dr. Grim’s observed correlation is insufficient evidence to support his theory of causation is entirely relevant, despite Dr. Grim’s protests.

Dr. Grim criticized my speculations about the nature of the raw data for his study, saying that I “just do not know” and that my speculations were wrong. I accept these criticisms. A bigger complaint, however, is that I must speculate at all. Dr. Grim does not provide any of the raw data for his study, instead showing only transformed data (without publishing transformation parameters). He provides virtually no descriptions of the data either – no means or ranges of either publication rates or drinking habits. As a result, it is impossible to determine the importance of the resulting correlation: does drinking at twice the mean rate correlate with a 5% decrease in scientific output, or a 50% decrease? In other words, how big is the effect? To my mind, that would seem important – and in fact the whole point. Unfortunately, readers are given no choice but to speculate (and possibly to hear Dr. Grim tell them how much they don’t know). Dr. Grim’s defense to this criticism – that a promise of confidentiality requires that he keep the data secret – simply points out a basic flaw in the design of his study. His choice to survey only his fellow Czech avian ecologists may have made for good barroom banter among his buddies, but did nothing to benefit science.

I thank Dr. Grim for clearing up my misunderstanding about the number of participants in the 2006 study (34 rather than 16). Figure 1 in his paper labels 18 data points as “included in the first survey 2002” and as “past”, while 16 data points are labeled “included in 2006” and “present”. From such labeling it is easy to misinterpret the relationship between the data points in the two surveys. In fact, I’m still confused. Are the 34 data points in Figure 1 all from the 2006 survey? What do “past” with n = 18 and “present” with n = 16 refer to? Are the 34 total data points for 34 different people, but some surveyed in 2002 and others in 2006? I’m trying, but I just can’t “[s]ee the bloody Fig. 1 properly, man!” Perhaps a more complete explanation would have made the paper clearer.

I described the correlation coefficients for his 2002 and 2006 survey results as “barely large enough to claim with 95% confidence that the results are statistically different from an R^2 of 0 (no correlation)”, though my confusion as to the number of data points in the 2006 study would seem to invalidate my analysis for that correlation coefficient. In somewhat colorful language, Dr. Grim called into question my statistical abilities. His advice to me: “PLEASE, read some statistical textbook first”. While I certainly don’t consider myself a leading expert in statistical analysis, I have read a statistical textbook or two over the years. In fact, I read quite a few in 2006 when I taught a graduate-level statistics course at the University of Notre Dame. I recall teaching my students about the Student’s t-test to compare the means of two normally distributed populations each having the same estimated variance, and about Fisher’s z-transform to transform a distribution of correlation coefficients into something approximately normal, so that the Student’s t-test can be used. For a sample size of 18, the R^2 must be greater than 0.25 to be different from 0 with 95% confidence. The 2002 study showed values of R^2 just barely larger than this, as I had stated previously.
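
As a point of reference, here is a minimal sketch of how such a cutoff can be computed using Fisher’s z-transform. This is my own illustration in Python with NumPy and SciPy (none of which appear in Dr. Grim’s paper), and the exact cutoff shifts a bit with the sample size and the test used – it comes out near 0.22–0.25 for samples of 16–18 points.

    import numpy as np
    from scipy import stats

    def critical_r2(n, alpha=0.05):
        """Smallest R^2 distinguishable from zero at a two-sided alpha,
        using Fisher's z-transform of the correlation coefficient."""
        z_crit = stats.norm.ppf(1 - alpha / 2) / np.sqrt(n - 3)
        return np.tanh(z_crit) ** 2

    for n in (16, 18, 34):
        print(f"n = {n:2d}: critical R^2 ~ {critical_r2(n):.2f}")
    # Prints roughly 0.25 (n = 16), 0.22 (n = 18), and 0.11 (n = 34).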

I also recall teaching my students that statistics are often the last refuge of the mediocre scientist. When systematic error (bias in the data) is greater than random error, the use of statistics is a waste of time. Worse, in that situation the use of statistical analysis can give a false sense that one’s conclusions are scientifically rigorous. So while an R^2 of 0.5 and a sample size of 34 may be statistically significant in the absence of bias in the data, one must do more than enter numbers into JMP if one wishes to understand whether the correlation is scientifically significant. So let’s look at Dr. Grim’s study a little more closely to see whether it is likely to be free of significant bias.
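
To make that concrete, the same kind of back-of-the-envelope check shows that the size of the correlation itself is not the issue. (Again, this is a sketch of my own in Python with SciPy, not anything taken from the paper.)

    import math
    from scipy import stats

    n, r2 = 34, 0.5
    r = math.sqrt(r2)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r2)   # t-statistic for a Pearson correlation
    p = 2 * stats.t.sf(t, df=n - 2)                # two-sided p-value
    print(f"t = {t:.2f}, p = {p:.1e}")             # p comes out far below 0.05

Absent bias, an R^2 of 0.5 with 34 data points is comfortably significant; the real question is whether the data deserve that much trust in the first place.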

For a study such as Dr. Grim’s, there are a number of opportunities to introduce bias into the data. Here I’ll address the three most important:

1. Survey bias. As anyone who has seen political parties trot out competing polls showing significantly different results knows, it is easy to bias survey results by the way surveys are written or administered. Does Dr. Grim have experience in the proper design and administration of human behavior surveys? Was his survey vetted by other scientists with such experience? Considering the fact that many (if not most) of the survey participants were likely friends or colleagues of Dr. Grim, and that the subject matter of the survey (social drinking) is a sensitive topic on both personal and professional levels, the possibility for bias in the responses is extreme. It doesn’t take an expert in human behavioral research to see that his survey does not pass the smell test. (As an aside, a typical ethical requirement for human behavioral studies would prohibit the use of subjects with any significant personal or professional ties to the researchers, especially if those ties are not disclosed.)
  2. Sampling bias. When one cannot measure an entire population, one must sample that population and draw inferences about the population from the sample. If the sample is not representative of the population, sampling bias can invalidate any conclusions drawn from the study. So what was the population under study by Dr. Grim? In his comments on this blog, Dr. Grim claims that the population under study was Czech avian ecologists (of which, according to Dr. Grim, there are 38). If this were the case, then Dr. Grim’s study would be interesting to, oh, about 38 people. Hardly the kind of thing that gets coverage in the New York Times. In fact, Dr. Grim’s paper implies a much greater population is at play. The title of the paper claims application to all ecologists, and the text of the paper contains more than a dozen references to science, scientists, and scientific productivity unqualified by the Czech avian ecologist subgroup. The abstract of the paper discusses science and scientists in general, without reference to a narrower population. His hypothesis under test is clearly stated in the abstract: “I predicted negative correlations between beer consumption and several measures of scientific performance.” The penultimate paragraph of the paper is filled with wide-ranging conjectures about the implications of this study for the lives and careers of scientists (including the bizarre speculation that social drinking may impact the “biological success” of scientists as well). Dr. Grim’s statement that the population under study is limited to Czech researchers studying avian evolutionary biology and behavioral ecology is simply disingenuous. His sample was limited to Czech avian ecologists, and is thus horrendously biased compared to the population of all ecologists or all scientists. Simply put, if his population is all Czech avian ecologists, then his study is essentially worthless to the greater population of scientists, and if his population is the greater population of scientists, then his study is essentially worthless due to extreme sampling bias. Either way, the conclusion is the same.
  3. Confounding factors. When factors outside of the control of the researcher have significant influence on the resulting correlation, even a statistically justified correlation may lead to no insight as to cause and effect (and in fact may mislead). Did Dr. Grim’s study identify all of the significant confounding factors? This is always one of the most difficult questions facing the researcher of human behavior. (The small simulation sketch after this list shows how a single confounder can do all the work.)
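
Here is the simulation sketch promised above: a toy example of my own (Python with NumPy; the variables and numbers are entirely made up, not Dr. Grim’s data) in which a single hidden factor drives both drinking and publication output. A strong negative beer-versus-papers correlation appears even though, by construction, drinking has no effect on output at all.

    import numpy as np

    rng = np.random.default_rng(42)
    n = 34  # same sample size as the 2006 survey

    # A hypothetical latent confounder (e.g. career stage or time pressure).
    latent = rng.normal(size=n)

    # Beer consumption rises with the latent factor; publication count falls
    # with it. Beer has NO direct effect on publications in this toy model.
    beer = 5 + 2.0 * latent + rng.normal(scale=1.0, size=n)
    papers = 10 - 1.5 * latent + rng.normal(scale=1.0, size=n)

    r = np.corrcoef(beer, papers)[0, 1]
    print(f"beer vs. papers: r = {r:.2f}, R^2 = {r**2:.2f}")
    # A sizeable negative correlation shows up with no causal link whatsoever.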

There are other criticisms of this study as well, such as the lack of independence of the data points (thus invalidating essentially all of Dr. Grim’s statistical analysis), and of course the use of a very small sample size to draw sweeping conclusions about human behavior. Dr. Grim’s defense that a “sample of that magnitude (34) is quite good for an ecological study” is of little importance since this was not an ecological study but a human behavior study, involving extremely complex human behavior at that. His final fallback, that “[e]ven just stating the logic of a hypothesis without any empirical data is worth publishing”, is more than a stretch. I think a strategy of trying to publish hypotheses without any empirical data would have a bigger impact on a scientist’s journal publication rate than beer drinking ever could.

But all of these criticisms boil down to one simple fact: Dr. Grim surveyed a small number of fellow Czech ornithologists in what was probably great fun, but definitely bad science. Dr. Grim himself described his paper as “half-joke-half-study”. Now, I’m the first person to appreciate a good joke, especially at the expense of us scientists. However, Dr. Grim failed to mention the “half-joke” part of the equation in his published paper, preferring instead to pass off his bit of fun as real science. As such, publication of his paper in a scientific journal borders on the unethical. Fortunately for Dr. Grim, he chose to publish his paper in an ecology journal, rather than one specializing in human behavioral studies, where it is likely that his reviewers would have been less kind. Unfortunately for Dr. Grim, he must now own up to his joke or risk digging an even deeper hole for himself.

7 thoughts on “More on Beer-Drinking Scientists: A Response to Dr. Grim”

  1. In spite of the fact that Chris taught a course in statistics (I have had some poor stats instructors), I can personally vouch for his abilities in the field.

    It seems to me that the two of you are critiquing past each other. It is not yet a full dialog, merely a sequence of monologues.

    I suggest that you, Chris, meet face to face with Dr. Grim. (BTW, I will be in Prague in early August.)

  2. I know Chris, and his knowledge, skills, and abilities, quite well. Advice from Dr. Grim to Chris Mack of “PLEASE, read some statistical textbook first” is embarrassing to Dr. Grim. Chris is one of the most well-respected, if not THE most well-respected, individual(s) in his field – and while this is not specifically statistics, suffice it to say Chris is knowledgeable in this field as well. And to alleviate any need for folks to inquire as to how I know of Chris’ knowledge, skills, and abilities, we have been professional colleagues for years. Let’s be thankful we have such a knowledgeable person and great mind amongst us.

    Later,

    Pat Mercado

  3. Chris is taking the "half science" part of the study very seriously, and apparently disregarding the "half joke" part. I enjoyed the discussion when it was about the science half of a joking study, using it to illustrate some common stats errors/limitations. Now it just sounds mean and petty, and there’s no fun in that.

    I hope the parties can return to the original light-hearted discussion.

  4. I suggest that a critical element of the discussion has actually been missing. I think you should both continue the discussion while drinking beer together, and that sooner or later you will find that you agree on almost everything, and the fun will return.

  5. You may be interested in these (both free online):
    Sheil D., Wunder S., Jansen P., Bongers F. & Dudley R. 2008: Hope for Bohemian ecologists – comments on “A possible role of social activity to explain differences in publication output among ecologists?” by Tomáš Grim, Oikos 2008. Web Ecology 8: 103–105.

    Moya-Larano J. 2008: A break to moderate drinkers. Web Ecology 8: 106–107.

