Building the Google of Blood, One Tube at a Time


The first shipment arrives at 4 A.M. The boxes are opened by laser—in case a hand should slip and plunge a knife into the tightly packed dry ice. Here, suspended in the thousands of test tubes that arrive each day, is an endless stream of blood. If one of the tubes is yours, you have probably run out of options—except for the last gamble of a clinical trial.
The Covance laboratory, in the leafy Chapel Glen neighborhood of Indianapolis, is the center of a global network of labs that, collectively, is estimated to process nearly fifty per cent of the blood from the world’s clinical trials of experimental treatments. Last year, six million individual blood samples passed through Indianapolis alone. That’s almost a billion data points collated, catalogued, and searched—the foundation of a kind of Google of blood, which could deliver personalized medical evaluations at the prick of a finger.
To avoid the possibility that an errant tug could spray a lab technician, a machine removes the test tube caps—a hundred and twenty thousand every week. Each sample is bar-coded and randomized, photographed and categorized, and then shunted through a miniature suspended railway on one of fifteen hundred little carriages, each tagged with a radio-frequency I.D. to track its journey.
Six Siemens C.B.C. (complete blood count) machines, installed in 2010 at a cost of two hundred and fifty thousand dollars each, start the process. They enumerate and categorize the white and red blood cells in each tube, indications of bacterial and viral infection, and other, more exotic and sinister flora of the human circulatory system—Döhle bodies, faggot cells, Auer rods, and May-Hegglin anomalies. (Unusual configurations are detected digitally and then inspected manually under a microscope.) Afterward, the blood’s analytical journey continues: liver function, kidney function, and electrolyte measurement; lymphocyte identification; and a pass through machines called Modulars. It is a Fantasia of clinical analysis.
“It was jaw-dropping,” Dimitris Agrafiotis, Covance’s chief data officer, said, recalling the first time he visited the Indianapolis lab. “I really wasn’t prepared to see anything on this scale. It’s basically a factory of clinical laboratory testing, a factory working to perfection.” Agrafiotis was used to thinking in very big terms. At Johnson & Johnson, he had developed computer techniques to build and mine huge virtual libraries of molecular data in the hunt for new compounds. In doing so, he was an exponent of what has come to be called “big data” analysis, long before it became the philosopher’s stone of the business world.
A decade ago, combinatorial chemistry and high-throughput screening fuelled a pharmacological arms race for new precision molecular weapons, but that has not yet produced an array of smarter drugs, Agrafiotis said. Nor has the sequencing of the human genome lifted the fog from the battlefield of disease, as had been promised. Newer technologies of measurement—microarrays and next-generation sequencing—may be capable of generating staggering amounts of data, but figuring out how to analyze their results and model them dynamically is more difficult.
Last March, the Global Language Monitor announced that “big data” had retained the top spot on its ranking of the “tech buzzwords everyone uses but doesn’t quite understand.” Has the term been oversold? “I think our ability to collect vast amounts of quantitative physiological data on individuals over time marks the beginning of a transformation in medicine and health,” Kevin Hall, who runs a biological-modelling lab at the National Institute of Diabetes and Digestive and Kidney Diseases, said in an e-mail. “Unfortunately, the next step is much more difficult: making sense of the data.” Tim Church, professor of preventive medicine at the Pennington Biomedical Research Center, at Louisiana State University, agrees. “I believe it is going to be a much slower process than people think,” he said by e-mail. “Devices that generate data every second create very large data sets which can be quite difficult to clean, manage, and utilize. That is where we are now. We’ve got lots of data, but we’re not sure how to handle it.”
Agrafiotis, a forty-nine-year-old quantum chemist who arrived at Covance a year ago, is one of the emerging architects of this data-analytics world. He believes that there is a proportional relationship between how well an algorithm is designed, how well it performs, and how well the data it returns are visualized. He showed me a series of visualizations that his team has created about the researchers who conduct the clinical trials, using just the data collected by the lab. Who is good at recruiting patients quickly? Who is good at retaining patients long enough for a clinical trial to be effective? What are the best sites—down to the level of cities and individual hospitals—around the world for recruiting patients for clinical trials based on particular diseases? Who are the researchers who produce the cleanest data? Without an extensive database of clinical trials, these questions are difficult to ask, let alone answer.
But the Covance lab was able to decipher the answers with a series of deceptively simple visualizations. They identified super-researchers—those who are able to recruit and keep lots of patients and produce lots of clean data—and they identified who was subpar. “We now have the most reliable and precise way of capturing investigator performance imaginable,” Agrafiotis said. “The entire process allows us to select which sites are likely to recruit patients faster, keep them longer, and perform well, meaning they won’t have quality-control issues.” The best researchers can get a trial up and running in considerably less time than the industry average—in this case, thirty weeks instead of fifty-five, Agrafiotis said, pointing to a slide on his computer screen summarizing thousands of clinical trials. The pharmaceutical industry, which is increasingly beset by drug-development costs, could save hundreds of millions of dollars with such improvements in performance. “We are going to industrialize analytics, make them available on a massive scale, comprehensively, throughout the world—and at the speed of light,” Agrafiotis said. “That’s the job. It won’t happen in a month. But imagine the world!”
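The article does not describe Covance’s actual systems, but the kind of site ranking Agrafiotis describes can be sketched in a few lines. The sketch below is purely illustrative, assuming a made-up table of trial-site records: the column names (investigator, site_city, weeks_to_first_patient, patients_retained, queries_per_record) and the scoring weights are invented for the example and are not drawn from the lab’s data or methods.

```python
# Illustrative sketch only: rank hypothetical trial sites by recruitment speed,
# patient retention, and data cleanliness. All fields and weights are invented.
import pandas as pd

records = pd.DataFrame({
    "investigator": ["Dr. A", "Dr. B", "Dr. C", "Dr. D"],
    "site_city": ["Indianapolis", "Geneva", "Singapore", "Lyon"],
    "weeks_to_first_patient": [12, 30, 18, 45],   # how fast the site starts enrolling
    "patients_enrolled": [60, 25, 48, 10],
    "patients_retained": [55, 20, 45, 6],
    "queries_per_record": [0.4, 1.8, 0.6, 2.5],   # rough proxy for data cleanliness
})

records["retention_rate"] = records["patients_retained"] / records["patients_enrolled"]

def normalize(series, higher_is_better=True):
    # Scale each metric to the 0-1 range so the three can be combined.
    scaled = (series - series.min()) / (series.max() - series.min())
    return scaled if higher_is_better else 1 - scaled

# Weighted composite score: faster startup, better retention, fewer data queries.
records["score"] = (
    0.4 * normalize(records["weeks_to_first_patient"], higher_is_better=False)
    + 0.3 * normalize(records["retention_rate"])
    + 0.3 * normalize(records["queries_per_record"], higher_is_better=False)
)

print(records.sort_values("score", ascending=False)[["investigator", "site_city", "score"]])
```

A real system of the kind described would work over thousands of trials rather than four rows, but the principle is the same: reduce each investigator to a handful of comparable metrics, then rank.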
This is Big Pharma talking, to be sure, but the labs are built, the data are already there, and the need to make drug development less expensive and more personalized and effective is acute. To speed up research on three costly diseases, the National Institutes of Health have just launched a data-sharing program with the pharmaceutical and biotech industries.
It is hard, when you see humanity reduced to an endless line of test tubes, not to imagine what these people felt when the needle drew their blood. The laboratories of the world contain a vast aggregation of human suffering. The disembodiment of that suffering is humbling; so, too, is its digitization.
Photograph by Jason Butcher/Cultura/Getty.
