The authors, John Mattick and Marcel Dinger of the University of New South Wales, advance various claims to dispute the idea that most of the genome is non-functional, but here I’ll just focus on one:
We also show that polyploidy accounts for the higher than expected genome sizes in some eukaryotes, compounded by variable levels of repetitive sequences of unknown significance.
Uh, yeah. That’s the resolution to the C-value paradox, and it’s one reason why people argue that repetitive sequences, i.e. transposable elements, are largely non-functional, contra claims based on ENCODE data: their numbers vary greatly between species with similar biology. As Doolittle writes:
A balance between organism-level selection on nuclear structure and cell size, cell division times and developmental rate, selfish genome-level selection favoring replicative expansion, and (as discussed below) supraorganismal (clade-level) selective processes—as well as drift—must all be taken into account.
Reading into the paper, how is it possible that the following claims by Mattick and Dinger don’t contradict each other?
1. Claims that most of the genome is non-functional rest on the “questionable proposition” that transposable elements are largely non-functional:
…the substantive scientific argument of Graur et al. is based primarily on the apparent lack of sequence conservation of the vast majority (~90%) of the human genome, suggesting that this indicates lack of selective constraint (and therefore function). The fundamental flaw, however, in this argument is that conservation is relative, and its estimation in the human genome is largely based on the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty (Brosius 1999), are largely non-functional.
2. The C-value paradox (or the Onion Test) is not an argument against function in most of the human genome because transposable elements (i.e. repetitive sequences) don’t add genetic complexity:
…more explicitly discussed by Doolittle (Doolittle 2013), is the so-called ‘C-value enigma’, which refers to the fact that some organisms (like some amoebae, onions, some arthropods, and amphibians) have much more DNA per cell than humans, but cannot possibly be more developmentally or cognitively complex, implying that eukaryotic genomes can and do carry varying amounts of unnecessary baggage. That may be so, but the extent of such baggage in humans is unknown. However, where data is available, these upward exceptions appear to be due to polyploidy and/or varying transposon loads (of uncertain biological relevance), rather than an absolute increase in genetic complexity (Taft et al. 2007).
Finally, whenever you read that developmental complexity correlates with genome size, run for the hills:
Moreover, there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known…
The definitions of upward and downward exceptions seem a bit circular in this piece, and anyway, Ford Doolittle provides an example of a downward exception (the pufferfish, with its 400 Mb genome). On the more general point of the relationship between developmental complexity and genome size, I’ll refer you to Ryan Gregory’s discussion of the issue (be sure to follow the links therein). And where in this paper is any discussion of the appropriate null hypothesis?
UPDATE: For another exceptionally small genome, go read about the carnivorous bladderwort.
UPDATE 2: Larry Moran has a much more detailed dissection of this paper over at Sandwalk – don’t miss it.