writing • AI x biology series
part 1: the digitisation thesis
why biology resisted. why that is ending.

the digitisation thesis

every major information system that got digitised followed the same pattern.

the moment the underlying information became digital, the cost of copying it dropped to near zero. intermediaries collapsed. value concentrated at whoever controlled the data infrastructure. music. publishing. finance. maps. each took less than a decade once the digitisation event happened.

biology resisted. for a specific reason.

biological information is not text or numbers. it is dynamic, context-dependent, multi-modal, and physically instantiated in living systems. you cannot copy a cell. you cannot download a disease mechanism. the search space was irreducibly physical, which meant the economics of every other information industry simply did not apply.

then AlphaFold happened.

most people read AlphaFold as an AI story. it is not. it is a digitisation story. for the first time, a critical layer of biological information, protein structure, became fully digital. sequences in, structures out. no wet lab required. fifty years of structural biology compressed into months, not because the model was extraordinary, though it was, but because the search space stopped being physical.

the unlock was never the model. it was the search space going digital.

what is happening now is that digitisation is reaching every level of the biological hierarchy at once. the genome. the cell. the clinical signal. the trial itself. not sequentially. simultaneously. and that simultaneity is what makes this moment structurally different from every previous wave of technology in biology.

this is not primarily an AI story. AI is the tool. the search space going digital is the event.

the scale of what is becoming legible is worth stating plainly. the Alliance for Genomic Discovery, now ten members including Regeneron Genetics Center, has expanded its core dataset to 312,000 whole genomes paired with deep clinical data, and just announced 50,000 additional genomes with paired proteomic data. that is one consortium. the UK Biobank, All of Us, Biobank Japan, and Singapore's PRECISE are each generating comparable volumes simultaneously. the data being generated today dwarfs anything biology has ever had access to, and almost none of it was connectable to AI systems five years ago.

when music digitised, the labels did not lose because their artists got worse. they lost because the infrastructure of distribution became irrelevant overnight. the value did not disappear. it moved. to whoever owned the new infrastructure layer.

biology is a larger information system than music by several orders of magnitude. the data it generates, genomic, proteomic, cellular, clinical, exceeds in volume anything that has digitised before. and almost none of it is connected, searchable, or legible to the systems that could extract value from it.

that is changing. not in one place. across the entire stack at once.

the genome is now a $100 read. the regulatory DNA that controls gene expression, the 98% of the genome that AlphaFold never touched, is becoming interpretable for the first time. the cell is being modelled, not just observed. the clinical trial is being redesigned from a sequential batch process into a continuous adaptive system.

each of these is a digitisation event in its own right. that they are happening simultaneously is not a coincidence. it is what happens when compute, sequencing costs, and biological data generation all cross critical thresholds at the same time.

the question this series is built around is simple: when the largest information system that was never digitised finally digitises, who captures the value, where does it concentrate, and what does it mean for how drugs get discovered, validated, and brought to patients?

the answer is not obvious. and it is not what most of the current narrative suggests.

the next three parts examine what the history of pharma productivity tells us about whether this wave is actually different, what is real versus positioning in the current moment, and where we think value accrues, including the opportunity every Western player is systematically missing.
