Lab notebooks are one of the less glamorous parts of being a scientist. You must meticulously record what you do each day so that some day in the future, someone could read it and replicate that day’s work. Or when you realize you discovered something you would like to patent, you must prove that you indeed thought of it on a particular day.
Investigative science writing like this isn’t unique — but it’s a lot more rare than it should be…it’s expensive and time consuming. And more and more often, it’s becoming an unavailable option to news organizations looking to cut costs…In late March, I issued a broad-based call for what I called “nightmare documents,” the sorts of opaque public records that can be a real pain for journalists trying to use them in their reporting…Impossible-to-analyze databases. Government records hidden behind clunky Web interfaces. Unsearchable public reports digitized on ancient scanners.
I’ve encountered the same problem, not as a journalist but as a researcher: datasets that are “shared” or “publicly available” yet almost unusable because of poor formatting and annotation. Although many journals require datasets to be made available, the requirements for useful formatting and annotation, even at public data repositories, are usually laughable. And most busy researchers can only be bothered to meet those minimal standards (e.g., “Do you think that is good enough for them to let us publish? ’Cause I got a grant due.”).
I am happy to say that Open Data advocates are well aware of this issue and are taking concrete steps to address it.
*We say nice things about people who want to interview us, and by “us” I mean “me.” Mike says positively horrid things about everyone he talks to.