So I take it you aren’t happy with ENCODE…

Mike is very busy being an awesome scientist. So, I have the duty of reacting to the latest “ENCODE takedown” published by Graur et al in Genome Biology and Evolution: “On the immortality of television sets: ‘function’ in the human genome according to the evolution-free gospel of ENCODE”. The title kind of tells you that the ENCODE consortium has a snowball’s chance in Hell of coming out of this one looking good – not that the paper was written by unbiased critics.

ENCODE made a big mistake when they equated “biochemical activity in some assay” with “biological function”. Similarly, Graur et al make a big mistake in equating conservation with function. The fact that less than 10% of the human genome shows evidence of purifying selection (selection that keeps a DNA sequence the same) is not definitive proof that 80% of the genome is not functional, and saying so does not require throwing out evolutionary theory. While ENCODE’s 80% functional sequence number is ridiculous, genome sequences can have biological function with no conservation for a variety of reasons. Ones that leap to mind include function with fitness effects that fall below the threshold for efficient selection in humans, novel functional variation, and the fact that function and conservation are not binary characters, but exist on a spectrum.

In essence, there is plenty of gray area between their definitions of “causal role” function and “selected effect” function. “Selected effect” is indeed a conservative way to assign “function”. It is so conservative that it can easily label sequences with physiological consequences (not necessarily fitness consequences) as non-functional. This is the inverse extreme of ENCODE, which makes the error of applying the “causal role” function definition so liberally that their conclusions could only be treated seriously thanks to the combined gravitas of the ENCODE consortium members.

In the body of the paper, Graur et al are actually relatively balanced about these points and address most of the issues I raised in the two prior paragraphs. They are correct about many of the mistakes made in the ENCODE analysis regarding the amount of the genome that is “functional”. You might not notice their good points due to the hostile presentation. If you already agree with the authors, you might enjoy the spilt bile, but you already agree and hardly need to be educated on the failings of ENCODE’s initial analysis, now do you?

This paper is less a reflection of the mistakes made by the ENCODE project and more a reflection of the deep rifts in the research community, especially amongst genomics researchers. Emphasis on big projects (e.g., human genome, ENCODE, brain mapping) or big prizes to already established researchers leaves small, independent researchers pursuing creative questions feeling left out in the cold. Frankly, in genomics, you cannot compete with the resources these consortia and genome sequence centers can bring to bear on a project.

I also think that, while the fields of marketing and public relations are overly reliant on pseudoscience, Graur et al are misrepresenting those fields by implying that the ENCODE media blitz was designed using their best practices.

The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.

Indeed, the biting critique of ENCODE’s failures of presentation is a bit ironic given Graur et al’s counterproductive rhetoric.

Author: Josh Witten

22 thoughts on “So I take it you aren’t happy with ENCODE…”

  1. While I’m no fan of using controversy-type rhetoric, it is certain that we wouldn’t be discussing the ENCODE claims if Graur et al. had written a “civilised” critique.

    1. We can be certain of no such thing. Indeed, the validity of the particular ENCODE claim was discussed avidly from the moment ENCODE published. These discussions occurred at multiple levels of civility & with lots of publicity.

      Even if true, the story isn’t about ENCODE’s junk DNA claim, it’s about scientists squabbling, which is not likely to get us more big science or little science.

  2. Very nice article, Josh, you are able to step out and beyond. I feel that the ENCODE authors are well aware of the oversimplifications in the ‘big finding’ conclusions, but this is what the Nature letter format demands: have a linear story line with a single ‘revolutionary’ finding. And we all know neither life, nor world, nor our genome is like that…

    1. Thinking that the tone of the critique is inappropriate is not to condone the mistakes in presentation by the ENCODE consortium or endorse the publication/funding system that incentivized their approach.

      The point of course is not that we should dismiss the critique because it is uncivil. Its lack of civility is unhelpful. The presentation in this critique and surrounding media, such as questioning whether the ENCODE consortium members are “real scientists”, conflates a problem with a specific, oversimplified claim with the quality of the data they produced. The junk DNA claim is junk. The data set is not.

    2. “But this is what the Nature letter demands” is no argument. If you do not have a message that fits into the Nature letter requirements, do not send it there.

      Martijn Huynen

  3. I was rather surprised that the referees and editors of Gen Biol Evol permitted the use of the kind of language used by Graur et al in their otherwise interesting and valuable critique of some claims from the ENCODE team.

    To some extent the latter were already something of a straw man in that the absurdity of the 80% ‘functionality’ figure had already been pointed out by many other researchers.

    While the overly hostile tone, bordering in places on ad hominem attacks (not to mention several uncorrected spelling mistakes), in the Graur et al paper may have grabbed media attention, the continued use of this practice can only debase scientific discourse. Maybe I am old fashioned, but if I want to read rhetorical, opinionated articles I look at blogs, and if I want to read about measured, evidence-based research I look at scientific journals.

    Overall I welcome the Graur et al paper and agree with most of its conclusions but the tone of their argument still leaves a nasty taste and may lead some to sympathise with those being attacked in such a forthright and perhaps inappropriate manner.

    On the broader topic of big versus little science, we should recognise that none of us can progress without access to reliable data – and in genomics this is now mostly (but not always) provided by large projects that lodge data in public databases. Personally, as a ‘small scientist’, I could not function without access to the public databases and projects like ENCODE do provide a valuable service to the community. However, I also agree that in many cases these consortia (that’s a plural word folks) do not necessarily have the broad range of biological skills required to interpret their data or to move them up the DIKW hierarchy in a convincing manner. As a biologist I work with mathematicians and informatics specialists in a small international team – along with hundreds of other ‘small science’ groups around the world. Just as I could not do without my non-biological colleagues (and vice versa) the ‘big science’ consortia cannot realistically progress beyond mere data generation without the remainder of the scientific community – most of which is still thankfully made up of small science groups.

    Finally, I’d like to thank all concerned in this controversy as it has provided some great material for our next research seminar and for some of my undergraduate lectures on genomics!

    1. Great points, Denis. I’m with you in not supporting the dichotomy between big/small science. There is a place & need for both. I think, moving forward, we need to figure out more clever ways to integrate the ideas and creativity of small, independent research groups with the technical capacity for data production of Big Science projects. When the first ENCODE splash was being made, I wrote a post on what I think could be a productive route, based on the approach taken (or my perception of the approach taken) by the astronomy community:

  4. I am much more concerned and offended by misleading hype science (a la ENCODE, Darwinius, XMRV, and arsenic life) than I am about a lack of decorum in the published literature, and I think that anyone engaging in the former deserves to get their knuckles rapped. Hard. It is fundamentally antithetical to good science and good scholarship to misrepresent science to the public. This (rightly) erodes the trustworthiness of science and the scientific community and undermines the human endeavor of science itself. It is a big problem when most papers that get media play are wrong, and I don’t think a tolerance for misleading hype is going to do anything but make that worse.

    That said, decorum is good and we should, generally, keep it up. Generally.

    1. MattK, the issue with Graur et al’s rhetoric is not an attempt to protect decorum or to avoid chastising the ENCODE consortium (or anyone else) for their mistakes. The point is that the rhetoric makes a relatively accurate critique LESS effective. For all the attention this has garnered, the attention is focused on scientists having a spat, not on the science. This does nothing to make scientists look less biased and more trustworthy.

  5. “Similarly, Graur et al make a big mistake in equating conservation with function.”

    No, they don’t. That’s because they don’t do this at all. They admit on page 9: “Estimates of functionality based on conservation are likely to be, well, conservative.”

    “While ENCODE’s 80% functional sequence number is ridiculous, genome sequences can have biological function with no conservation for a variety of reasons.”

    They acknowledge a large chunk of this on page 32. For example, they refer to sequences that serve a function in a non-sequence dependent way as “indifferent DNA”.

    “I also think that, while the fields of marketing and public relations are overly reliant on pseudoscience, Graur et al are misrepresenting those fields by implying that the ENCODE media blitz was designed using their best practices.”

    This isn’t obvious to me. The ENCODE researchers chose the 80% figure and chose to trumpet the “junk DNA is dead” line.

    1. Taylor, you will notice that I actually raised your points about the complexity of Graur et al’s argument in the fourth paragraph, substantially less buried than the relevant sections in Graur et al. The arguments presented in the most obvious materials & in interactions with the media (albeit with claims of being misquoted) are of the unsubtle variety. It is the presentation, not the merits, that makes me worry.

      You will also notice that I make a point of saying I think that Graur et al. are technically correct about this specific claim and you will find that I’m a big fan of junk DNA.

      As to your last point, my comment was responding to Graur et al. insulting professional marketers and publicists by implying the ENCODE media launch was a textbook PR campaign. It is apparently surprising to scientists that the practitioners of jobs we generally do not respect are not themselves self-loathing.

      1. Ah, yes––then it appears I misinterpreted your line about marketing and public relations. My bad.

  6. A bit late, but it’s worth noting that Graur’s example of ‘adding 300 grams body weight’ as causal role function for the heart is a bad analogy for DNA. Graur is quoted making an even worse analogy to chewing gum in this Popular Science piece:

    The point is, while it’s true that these are “functions” in the sense that they’re doing something, the thing they are doing is not necessarily meaningful.

    Here is how Graur explained it on the phone: “Have you ever stepped on a piece of chewing gum? It binds to the sole of your shoe. But this is not the function of chewing gum, to bind to the shoe on a hot day.”

    But when it comes to DNA binding transcription factors, the non-selected, causal role function of some DNA segments is identical to selected function of other DNA segments. A more accurate chewing gum analogy would be this: there are substances that are deliberately designed to stick to your shoe, and there are some that just happen to stick to your shoe, and you need to figure out which is which.

  7. I would like to add that even if a transcription factor is bound to a stretch of DNA and it doesn’t lead to transcription, that particular protein is still occupied and cannot do its work elsewhere. So that stretch might still affect transcription, if only slightly.

    1. Technically true. The effect will depend on the number of copies of the transcription factor. Alternatively, feedback mechanisms in transcriptional regulation may accommodate such effects, thus making occupation at a site (or not, if the site is lost) only transiently relevant to transcription. It does not, however, seem that ENCODE was thinking of anything so nuanced and complex.
