Online Research: Narrowing the Possibilities?

I want to take a momentary detour from interstellar topics to talk about how we go about doing research, astronomical and otherwise. Some years back I debated the then-new trend of online peer review with an opponent who argued for the virtues of traditional print journals and their methods. At the time, what would become the arXiv pre-print site was just beginning to grow, and the benefits of having a wide audience able to examine a scientific paper before it achieved print seemed manifest. Much good research, I reasoned, would become available for scrutiny, some of it unable to get past academic referees at a specific journal but now able to be included in a broadened scientific discussion.

Even so, certain trends did worry me, some of them now manifest again in a presidential report recently cited by James Evans, a University of Chicago sociologist. The report makes a jaw-dropping claim: “All citizens anywhere anytime can use any Internet-connected digital device to search all of human knowledge.” The sheer naiveté of this claim boggles the mind, the idea that the Internet, whose holdings are top-heavy with the most recent work and all but empty of the great bulk of earlier studies other than in the form of bibliographical references, is a complete library.

Evans agrees. We have no reason to doubt (and surveys of library practice confirm) that the use of print is waning because of the manifest advantages of searching online, not to mention exotica like citing going forward, meaning an earlier paper’s references can now be buttressed with links to subsequent research that refers back to that paper, thus deepening the perspective. Interested in learning more, Evans has published the results of his survey of a database of 34 million articles, with reference to their availability and the uses to which they are being put. This is from an essay he did on the Britannica Blog about his work, and now the implications of Web availability take a darker turn:

“…as more journals and articles came online, the actual number of them cited in research decreased, and those that were cited tended to be of more recent vintage. This proved true for virtually all fields of science. (Note that this is not a historical trend… there are more authors and universities citing more and older articles every year, but when journals go online, references become more shallow and narrow than they would have been had they not gone online).”

And was my idea of spreading the availability of good material outside the primary journals accurate? Apparently not. For Evans also learned that researcher attention has now shifted to the most prestigious journals. The result turns out to be counter-intuitively hostile to good research: With online searching more efficient and aided by hyperlinking, what we’re actually seeing is a narrowing of the range of findings and ideas being studied by scholars. And get this:

Ironically, my research suggests that one of the chief values of print library research is its poor indexing. Poor indexing—indexing by titles and authors, primarily within journals—likely had the unintended consequence of actually helping the integration of science and scholarship. By drawing researchers into a wider array of articles, print browsing and perusal may have facilitated broader comparisons and scholarship.

And, of course, we can relate this to the non-academic experience of the average Internet user, who may find that while access to a wide range of ideas is available, the actual practice is to look at the first page of search results and little else. With Google’s PageRank algorithm making the call, people wind up experiencing largely the same small set of high-profile sites, to the detriment of serendipity, that wonderful process by which we blunder into a concept that cross-pollinates into a startling new insight.

Long live the computerized database and the pre-print server concept, but can’t we work on richer indexing methods and interface possibilities to keep the research environment as fertile as possible? Evans is exploring this in his work, and speculating that advances in natural language processing may help us sharpen up the relevance of our search techniques. Beyond that, of course, we have to expand our databases themselves to include the vast storehouse of papers that have accumulated over the course of scientific investigation, many of which, when coupled with recent findings, may offer insights that would otherwise be lost. This is a future priority for the Tau Zero Foundation.

The paper is Evans, “Electronic Publication and the Narrowing of Science and Scholarship,” Science Vol. 321 No. 5887 (18 July 2008), pp. 395-399 (abstract).

Addendum: Author Nicholas Carr also looks at this issue in his Rough Type blog, from which this:

When the efficiency ethic moves from the realm of goods production to the realm of intellectual exploration, as it is doing with the Net, we shouldn’t be surprised to find a narrowing rather than a broadening of the field of study. Search engines, after all, are popularity engines that concentrate attention rather than expanding it, and, as Evans notes, efficiency amplifies our native laziness.

  • Adam Crowl August 14, 2008, 6:27

    Hi Paul

    One thing which annoys me about Net research is the sheer inexcusable ignorance of another person’s viewpoint – there’s so much online that a decent search will reasonably inform anyone on both sides of an issue (or all sides when there’s more than one.) Take arguments over evolution versus creationism versus Intelligent Design ad nauseam. There’s simply no excuse for creationists to keep dragging out hackneyed “lies oft repeated” as supposed evidence. Nor is it sufficient for evolutionists to claim that creationists are all alike, or that they all believe the same thing. They aren’t and they don’t. There’s a dazzling array of opinions, of lesser to greater degrees of rationality, on both sides.

    As for historical depth, there’s papers (or at least abstracts) available in astrophysics going back to very early last century – original papers by Eddington and Chandrasekhar, for example. Or Kuiper’s discovery paper of methane on Titan (and hints on Triton), or his discovery of Nereid. Edgeworth, Opik and Oort’s papers on the “Edgeworth-Kuiper Belt” or the “Opik-Oort Cloud” and so forth. Always pays to go back to the originals. But how many people even realise that these people wrote on many other topics, that Oort studied intergalactic astronomy, or that Opik was based in Ireland for years, not his native Estonia? One of the most galling examples of temporal shallowness is that people let that idiot Hoagland claim he first theorised Europa had a frozen ocean. John Lewis had made such predictions in the early 1970s, revising them as solid-state creep physics became better known. Now we still don’t know if it’s convecting ice or an actual ocean – it’s still too hard to know until we look.

    Shallowness of perspective is a grade school habit that needs to be extirpated on graduation…

  • Adam Crowl August 14, 2008, 6:30

    …and if you caught my goof ;-)

  • Zen Blade August 15, 2008, 12:34

    As I am finishing up my dissertation, I have been aware of this fact. However, I am not certain what the best course of action is.

    The fact is science is not simply expanding, not simply becoming more complex, but it is also becoming more intricate and time-consuming in order to publish and to make additional discoveries. The question becomes, in a large sense, how does a scientist decide what techniques, what assays, what literature to prioritize and learn when there are constantly newer techniques, assays, and literature being presented—and the rate of expansion seems to only be increasing.

    For students in their mid-20s (in the biological sciences) it is almost unrealistic to expect them to head to the library to research their projects. Yes, the student may GO TO the library (and use their computer/internet resources), but many journals have articles online for any issue 10-20 years old (or even older for some). If you can find 5 articles, read the abstracts, and figure out the main experiments of those 5 articles in 15 minutes using the internet, why would you spend 1-2 hours attempting to do the same thing with print sources… The only reason would be if you knew (before starting the search) that you were not going to find what you were looking for online…

    Every hour or two you aren’t wasting your time searching for articles is an extra hour or two that can be used writing grants (which take forever), planning experiments, conducting research, teaching/training people, attending seminars, etc. I should also add that the number of computer programs/algorithms out there that assist a person in researching only adds to the enticement to stay virtual, or to dismiss that which is not.

    The same phenomenon applies to textbooks. In the past I regularly used my biochemistry books to double check a number of basic facts (characteristics of amino acids, nucleic acids, elements, etc.), and I still have those pages bookmarked. However, my first action now is always to head to Wikipedia. I type in what I need, and the information is concisely presented, confirming what I thought… or, if something seems weird, I can quickly double check somewhere else–perhaps a book, but they are just as likely to be wrong. This works well for basic, concrete facts. What this doesn’t work well for is research or larger ideas/concepts. Wikipedia commonly cites only the most recent scholarly works (even if the works are wrong or incomplete).

    Anyways, I just wanted to add my personal experience which would tend to say that efficiency is good… but I acknowledge that my knowledge of papers from the 1950s, 1960s, and 1970s is not particularly good. In the drive for today’s science, we forget yesterday’s.

    -Zen Blade

  • Administrator August 18, 2008, 16:44

    Excellent comments, Zen Blade. We can hope, of course, that reports like the one discussed in the article will spur further efforts to digitize older materials and make them available, broadening the context for everyone. It is concerning, though, that the focus seems to be shifting, as per the paper, to the more prestigious journals while increasingly neglecting the smaller ones. I would have expected a much different result in the Internet context.

