I sometimes rely on nudges from my software to remind me of directions I’ve been meaning to take in a Centauri Dreams article. Seeing that Caleb Scharf has a new book out (The Ascent of Information), I was setting about ordering it when I noticed how many notes I had on my hard disk related to Scharf’s work, a reminder of how provocative I find his writings. That took me back to a 2018 article called The Selfish Dataome, and also to the recent article The Origin of Technosignatures, which appeared a few days ago in Scientific American.
Scharf (Columbia University) has the habit of asking questions no one else seems to have thought of. So let’s kick this around a bit. The notion of a ‘dataome’ is about external things that a species generates. Scharf defines it as:
a deeper way to quantify intelligent life, based on the external information that a species generates, utilizes, propagates and encodes in what we call technology—everything from cave paintings and books to flash drives and cloud servers and the structures sustaining them.
Here we go beyond biology to ask why technology comes to exist in the first place. But this gets into some deep philosophizing that is beyond my pay grade, so I’ll pause to look at the numbers in our current dataome, which are staggering. They inspire in me that punchy effect I can feel when contemplating galaxies full of stars and planets. In 2018, according to Scharf, we generated 2.5 quintillion bytes of data a day or (I like this) 2.5 billion billion bytes for every planetary rotation. Much of that data hangs around in our daily lives.
Think of YouTube’s holdings, for example, or the GIFs you occasionally get from your friends, the scientific papers we keep talking about in these pages, the emails that pester your bulging mailbox, the albums of photos of the kids in the family room, the collection of black and white movies on ancient VCR tapes (well, that’s my collection, but I assume you have something similar). This is all folded into a dataome, which to Scharf is analogous to a genome, and one that may, as per Richard Dawkins, somehow perpetuate itself. What compels us, in other words, to keep all these things?
To address the question in his older article, Scharf in 2018 looked at the writings of William Shakespeare. You’d think these would be easy to define: the gorgeous sonnets, the 37 plays, the 835,997 words comprising the complete works (with a small handful whose authorship is disputed). But the question is how all this has propagated over the centuries. By Scharf’s estimates, two to four billion physical copies of the works have been produced, meaning hundreds of billions of sheets of paper covered by more than a quadrillion letters. All of this involves energy expenditure, even in reading.
Thus Scharf on energy use:
Across time these billions of volumes have been physically lifted and transported, dropped and picked up, held by hand, or hoisted onto bookshelves. Each individual motion has involved a small expenditure of energy, maybe a few Joules. But that has added up across the centuries. It’s possible that altogether the simple act of human arms raising and lowering copies of Shakespeare’s writings has expended well over 4 trillion Joules of energy. That’s equivalent to combusting several hundred thousand kilograms of coal.
And that’s just for the physical handling of the actual Shakespearean canon. Add this:
Additional energy has been utilized every time a human has read some of those 835,997 words and had their neurons fire. Or spoken them to a rapt audience, or spent tens of millions of dollars to make a film of them, or turned on a TV to watch one of the plays performed, or driven to a Shakespeare festival. Or for that matter bought a tacky bust of “the immortal bard” and hauled it onto a mantelpiece. Add in the energy expenditure of the manufacture of paper, books, and their transport and the numbers only grow and grow.
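Scharf’s 4 trillion Joule figure for lifting and lowering is easy to sanity-check. What follows is only a rough sketch in Python, not his calculation: the copy counts, the total and "a few Joules" come from the passages above, while the 3 Joules per motion and the coal energy density of 24 megajoules per kilogram are my own assumptions.

```python
# Back-of-envelope check of the "4 trillion Joules" lifting estimate.
# From the article: 2-4 billion copies, "a few Joules" per motion, 4e12 J total.
# Assumed here (not from the article): 3 J per motion, coal at 24 MJ/kg.

TOTAL_ENERGY_J = 4e12        # "well over 4 trillion Joules"
JOULES_PER_MOTION = 3.0      # assumption standing in for "a few Joules"
COAL_J_PER_KG = 24e6         # assumption: rough energy density of coal


def motions_per_copy(copies: float) -> float:
    """Number of lifts per copy needed to reach the total energy."""
    return TOTAL_ENERGY_J / (copies * JOULES_PER_MOTION)


def coal_equivalent_kg(energy_j: float) -> float:
    """Mass of coal whose combustion would release the given energy."""
    return energy_j / COAL_J_PER_KG


for copies in (2e9, 4e9):
    print(f"{copies:.0e} copies -> {motions_per_copy(copies):.0f} motions per copy")

print(f"coal equivalent: {coal_equivalent_kg(TOTAL_ENERGY_J):,.0f} kg")
```

Under these assumptions each copy needs to be picked up or put down a few hundred times over its lifetime, which seems plausible for a book that gets read, and the coal equivalent lands in the hundreds of thousands of kilograms, the same order of magnitude Scharf gives.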
I have a lot of Shakespeare in the house myself. In addition to the various printed editions of his works I’ve accumulated since grad school, I also keep the Oxford and the recent Modern Library editions on my ebook readers (a Kindle Oasis and a Kobo Aura One). I like to think I’m saving a few trees: Scharf points out that, according to US paper production statistics from 2006, about 28,000 Joules of energy went into each gram of final material. US paper production ran to 99.5 million tons of pulp and paper that year.
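To get a feel for what those two paper figures imply together, here is a rough sketch, again in Python. The 28,000 Joules per gram and the 99.5 million tons are the numbers quoted above. Everything else is my assumption: that the per-gram figure can be applied to the whole tonnage, that "tons" means either metric tonnes or US short tons (I show both), and that a printed book weighs about 500 grams.

```python
# Rough energy cost of a year of US paper production, using the 2006 figures.
# From the article: 28,000 J per gram, 99.5 million tons of pulp and paper.
# Assumed here: the ton definition (both shown) and a 500 g book.

JOULES_PER_GRAM = 28_000
PULP_AND_PAPER_TONS = 99.5e6
GRAMS_PER_METRIC_TONNE = 1_000_000.0
GRAMS_PER_SHORT_TON = 907_184.74
BOOK_MASS_G = 500.0          # assumption: a mid-sized printed volume


def production_energy_j(tons: float, grams_per_ton: float) -> float:
    """Total energy if every gram produced cost JOULES_PER_GRAM."""
    return tons * grams_per_ton * JOULES_PER_GRAM


def book_energy_j(mass_g: float) -> float:
    """Energy embodied in the paper of a single book of the given mass."""
    return mass_g * JOULES_PER_GRAM


metric = production_energy_j(PULP_AND_PAPER_TONS, GRAMS_PER_METRIC_TONNE)
short = production_energy_j(PULP_AND_PAPER_TONS, GRAMS_PER_SHORT_TON)
print(f"metric tonnes: {metric:.2e} J")
print(f"short tons:    {short:.2e} J")
print(f"one {BOOK_MASS_G:.0f} g book: {book_energy_j(BOOK_MASS_G) / 1e6:.0f} MJ")
```

On either reading of "tons" the yearly total comes out at a few times 10^18 Joules, and the paper in a single hypothetical 500 gram volume accounts for about 14 megajoules.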
Here again the question of why we keep things: I read a lot of library books downloaded onto my e-readers. When I talked this over with a bookish friend, he told me that wouldn’t work for him. He had to have a physical object on his shelf that he owned. Why?
If you think of this in terms of symbiosis, we are creating a burden of energy use to feed our dataome that continues to grow, and it’s a reasonable question to ask whether we are drawing the kind of benefit from it that we might. What are all those Facebook posts worth? But unlike our situation with the ‘selfish gene,’ this human-dataome symbiosis is something we can manage, even if we haven’t really examined its evolution or analyzed its function in the overall growth of the species. I assume this is what Scharf will be doing in his new book, which I will be discussing here later.
In the more recent article, though, Scharf questions whether these concepts have value in how we deal with technosignatures and the ongoing expansion of SETI toward artifacts and technologies. I’ve often thought in terms of the Drake Equation that the L factor — the longevity of a technological civilization — is embedded in the question of whether technology actually offers an evolutionary advantage. In the short term, the answer seems obvious, but not if the inevitable outcome of burgeoning high tech is putting tools of species destruction in the hands of an ever larger number of people.
Scharf argues that a search for technosignatures can be considered more broadly a search for extraterrestrial dataomes, for the former grow out of the latter. He suggests that we consider something like a Dyson sphere as a consequence of a process that is itself Darwinian:
…the arrival of a dataome on a world represents an origin event. Just as the origin of biological life is, we presume, represented by the successful encoding of self-propagating, evolving information in a substrate of organic molecules. A dataome is the successful encoding of self-propagating, evolving information into a different substrate, and with a seemingly different spatial and temporal distribution— routing much of its function through a biological system like us. And like other major origin events it involves the wholesale restructuring of the planetary environment, from the utilization of energy to fundamental chemical changes in atmospheres or oceans.
This plays into our plans to examine planetary atmospheres for environmental factors that could be the consequences of these kinds of energy transformations. Thus it would behoove us to consider the relationship between the dataome we move in and the biological life (ourselves) that interacts with it, questioning in what ways the interests of the two are aligned and where they may be coming out of joint. (Sorry, I had to get Hamlet in there, what with all this talk about Shakespeare: “The time is out of joint. O cursed spite, that ever I was born to set it right!” The Bard is timeless.)
I’m not sure how we examine a balance like this in Darwinian terms, but we wrestle daily with consequences like social networks changing discourse and affecting public policy, or the widespread propagation of cultural memes via cable and streaming TV. Scharf sees carbon emissions as one consequence of the dataome’s insatiable demand for energy, so industrial pollution is an inevitable offshoot. I think we need to ask whether the idea of a dataome can offer us anything predictive about what another species might do as it encodes and propagates its own information.
Scharf asks the question in these essays but it’s clear we are only at the beginning of what may be a long conversation. I’m having trouble seeing how parsing the growth of data this way gives us tools beyond the factors we’re already using to search for technosignatures, but the key may be in the idea that a dataome resembles a living rather than an inert system. If genes can be selfish, can data be the same? Just how much control do we have over a dataome when it reaches planetary dimensions?
As we ourselves don’t know the outcomes of such growth, its manifestations in a technosignature will be hard to imagine. Let’s see where Scharf goes with this next.