An ode to junk

It is unfortunate that the ENCODE publicity decided to declare “junk DNA” dead, again. It’s not a totally unique position. Creationists and John Mattick have argued for ages that there is no useless DNA.

The demise of “junk DNA” is a fait accompli, given the way “functional” is defined. It is not a definition of “functional” most of us would recognize. Ewan Birney, who should know, explains that when ENCODE says “functional” they mean “not biochemically inert in at least one of our many assays*”. As Mike has noted from his own research experience, many totally random DNA sequences synthesized in a tube are “not biochemically inert”, yet they are not biologically “functional”.

The fact is, if you only think of “junk DNA” as a problem, you aren’t seeing the forest for the trees – and you certainly are lacking a touch of poetry in your bleak soul.

Random Genome, Naked Genome

On Saturday, my former Center for Genome Sciences colleague Sean Eddy brought up the idea of a Random Genome Project: let’s create a random genome to serve as a null model of genome function. With this random genome, we can determine how much supposedly functional biochemical activity we should expect to see just by chance. Among other things, we might also use a random genome to explore how new functions evolve by “repurposing” (Eddy’s great term) non-functional DNA. In the comments to that post, you can read some discussion of how you might go about making a random genome.

An easier task would be to implement the random genome computationally, an idea I’ve been exploring recently, using a genome-wide binding model along the lines of the one by Wasson and Hartemink.

Why do this? Because we could explore two kinds of null models – the random genome described by Sean Eddy, and the naked genome.
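
To make the random-genome null concrete, here is a minimal sketch of the simplest possible version. It is not the Wasson and Hartemink binding model, just an i.i.d. random sequence scanned for exact matches to a short site. The function names, the GC fraction of 0.41, and the 7-mer used as a site are all assumptions chosen for illustration.

```python
import random


def random_genome(length, gc_content=0.41, seed=0):
    """Draw an i.i.d. random sequence with the given GC fraction.

    This is a deliberately crude null: no repeats, no dinucleotide
    structure, no chromatin. gc_content=0.41 is an assumed default.
    """
    rng = random.Random(seed)
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    return "".join(rng.choices("ACGT", weights=[at, gc, gc, at], k=length))


def count_sites(genome, motif):
    """Count exact, possibly overlapping, forward-strand matches of motif."""
    count = 0
    start = genome.find(motif)
    while start != -1:
        count += 1
        start = genome.find(motif, start + 1)
    return count


def expected_sites(length, motif, gc_content=0.41):
    """Expected number of exact matches under the same i.i.d. model."""
    freq = {
        "A": (1.0 - gc_content) / 2.0,
        "T": (1.0 - gc_content) / 2.0,
        "C": gc_content / 2.0,
        "G": gc_content / 2.0,
    }
    prob = 1.0
    for base in motif:
        prob *= freq[base]
    return (length - len(motif) + 1) * prob


if __name__ == "__main__":
    genome = random_genome(1_000_000)
    motif = "TGACTCA"  # illustrative 7-mer standing in for a binding site
    print("observed:", count_sites(genome, motif))
    print("expected:", round(expected_sites(len(genome), motif), 1))
```

Even this toy version makes the point of a null model: a short site turns up many times in a megabase of pure noise, so the mere presence of a match says little about function.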

Quote of the day: Doolittle and Sapienza on selfish DNA

From “Selfish genes, the phenotype paradigm, and genome evolution,” W. Ford Doolittle & Carmen Sapienza, Nature 284:601-3 (1980), here is one of the original definitions of selfish DNA:

Non-phenotypic selection

What we propose here is that there are classes of DNA for which a ‘different kind of explanation’ may well be required. Natural selection does not operate on DNA only through organismal phenotype. Cells themselves are environments in which DNA sequences can replicate, mutate, and so evolve. Although DNA sequences which contribute to organismal phenotypic fitness or evolutionary adaptability indirectly increase their own chances of preservation, and may be maintained by classical phenotypic selection, the only selection pressure which DNAs experience directly is the pressure to survive within cells. If there are ways in which mutation can increase the probability of survival within these cells without effect on the organismal phenotype, then sequences whose only ‘function’ is self-preservation will inevitably arise and be maintained by what we call ‘non-phenotypic selection’. Furthermore, if it can be shown that a given gene (region of DNA) or class of genes (regions) has evolved a strategy which increases its probability of survival within cells, then no additional (phenotypic) explanation for its origin or continued existence is required.

The truly provocative and disturbing stuff in ENCODE

… at least from my perspective. I’ll now stop ranting about the hype and media coverage of ENCODE, and extend my compliments to the consortium for an amazingly well-coordinated effort to achieve an impressive level of consistency and quality for such a large consortium. Whatever else you might want to say about the idea of ENCODE, you cannot say that ENCODE was poorly executed.

It’s time to get into the interesting stuff – what’s actually in the papers. Among the results I’ve been most eagerly waiting to see in print are the DNase hypersensitivity results now published in Thurman et al. (Nature 489, 75–82 (06 September 2012) doi:10.1038/nature11232).

Why is this interesting? Because it raises provocative and possibly disturbing questions regarding how transcription factors navigate and read out information from the genome.

Polling junk DNA

I missed this poll by Chris Gunter yesterday, asking “If you are a non-genomicist, can you tell us if you thought/were taught much of the genome was ‘junk’?”

Well, I’m 1) a day late and 2) not a non-genomicist, but I’ll reply anyway, because we need a little history review.

In my Eukaryotic Genomes course in grad school (in the year the draft Human Genome sequence came out), I was taught by Tom Eickbush, not so much about ‘junk DNA’, but about ‘selfish DNA’. The point is largely the same regardless of what we call it. Among the first papers we read in Eickbush’s class were the classic Doolittle and Sapienza and Orgel and Crick papers on selfish DNA.

The key argument of these papers was this: parasitic DNA that can replicate itself within the genome requires no explanation for its existence other than its ability to replicate, period. It does not need to be functional, from the perspective of the organism. It may acquire a useful function. But in general, absent evidence of such a useful function, we don’t need to ask the question, ‘what is the function of this DNA?’ There’s no mystery why it’s there – because it can replicate.