Microsoft Wants to Diagnose Disease By Building Massive Database of the Human Immune System

Imagine a spreadsheet of every meal you’ve ever eaten, every hand you’ve ever shaken, every bit of dust that’s ever gotten in your eye, and multiply it by about a million. That gives a sense of the size of the data problem that is your immune system.
Through a new AI project, Microsoft hopes to solve this data problem and make diagnosing nearly any disease as simple as a single blood test.
You see, stored within your immune system is a record of virtually every threat to your health that you’ve ever encountered. When an invader shows up (be it the flu, cancer, or something weird you picked up while showering without flip-flops at the gym), your body identifies it and launches a targeted attack. This works with the help of special cells called T-cells, which each carry a corresponding surface protein called a T-cell receptor with a genetic code designed to target a specific disease, signalled by what’s called an antigen.
So if the immune system’s T-cells together carry genetic markers of every pathogen the body has encountered, then decrypting those markers could theoretically give you a log of every threat you have ever faced. That’s what Microsoft is hoping. In a new research effort with the Seattle biotech firm Adaptive, the company is working to decode the human immune system so that it can diagnose disease.
“Your immune system should know what you have before your doctor does,” said Adaptive CEO Chad Robins at the annual JP Morgan Healthcare Conference in San Francisco on Wednesday.
The idea is, in essence, to make a map of the human body’s immune responses: its T-cell receptor sequences and the codes of the antigens they have fought against. Eventually, the idea is to use that map to diagnose practically any disease from a sample of blood.
Remember the massive spreadsheet we imagined earlier? That spreadsheet is the reason this problem calls for artificial intelligence.
“We’re searching for patterns in a giant space,” Peter Lee, vice president of AI Research for Microsoft, told Gizmodo. “In machine-learning, a problem this big is exotic.”
Your body is constantly coming into contact with foreign invaders and having immune reactions to them. Because a single T-cell receptor can bind to several different antigens, the presence of one T-cell receptor could indicate a host of different diseases. That’s a lot of complicated data to crunch. The information is all there, but right now, we just can’t read it.
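To make that ambiguity concrete, here is a toy sketch in Python. It is not Microsoft's or Adaptive's method. The receptor names, antigen labels, and the `candidate_scores` helper are all invented for illustration. It only shows why one receptor on its own is ambiguous, and why seeing several receptors together starts to narrow things down.

```python
from collections import defaultdict

# Hypothetical toy data: these receptor IDs and antigen labels are made up.
# Each receptor is assumed to bind more than one antigen.
RECEPTOR_TO_ANTIGENS = {
    "TCR-A": {"flu", "lyme"},
    "TCR-B": {"lyme"},
    "TCR-C": {"flu", "cold"},
}


def candidate_scores(observed_receptors):
    """Count how many observed receptors are consistent with each antigen."""
    scores = defaultdict(int)
    for receptor in observed_receptors:
        # Unknown receptors contribute nothing.
        for antigen in RECEPTOR_TO_ANTIGENS.get(receptor, ()):
            scores[antigen] += 1
    return dict(scores)


# One receptor alone points at two candidates equally.
single = candidate_scores(["TCR-A"])

# A second receptor breaks the tie in favor of one candidate.
combined = candidate_scores(["TCR-A", "TCR-B"])
```

In this made-up example, `single` scores "flu" and "lyme" the same, while `combined` favors "lyme". The real problem involves vastly more receptors and far noisier evidence, which is where the machine learning comes in.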
To start sifting through the trove of immune data, Microsoft and Adaptive need to sequence a whole lot of T-cell receptors, in order to train the machine-learning algorithms. That means that a universal diagnostic tool for disease is a long way off. But in the shorter term, they could come up with new ways to diagnose specific diseases.
“Sequencing your entire adaptive immune system is a huge data problem,” Robins told Gizmodo. “It’s like a giant jigsaw puzzle. It’s mostly a matter of doing the grunt work.”
The project is part of Microsoft’s recent efforts to double down on using AI to attack health care problems.
Robins said their first efforts will focus on hard-to-diagnose autoimmune disorders and infectious diseases, as well as high-risk cancers. Lyme disease, he said at the conference, might be a good early candidate. The first diagnostic tools could be as soon as three years away.
“If we layer on diagnostic after diagnostic, ultimately we can get this to be a screen for the entire system,” Robins said. “We believe in not having this be some mega-moonshot.”
Lee, whose background is in computer science, not biology, told Gizmodo that the problem is similar to that of machine translation.
“We’re not able to read the antigens directly, but we can read the T-cell receptors to get this weird translation that your body has made of the disease-state,” he said. “At a fundamental, algorithmic level, that’s very similar to problems we’ve already solved.”
As is always the case with the human body, though, there are still a fair number of unknowns. For instance, the operating assumption is that the signals for diseases currently invading the body will be stronger, and stand out against signals from things your immune system has fought off in the past. But that’s still mostly just a theory. Matching one of many thousands of T-cell receptors to the correct antigen with enough accuracy to diagnose a disease is likely to be hugely complicated.
But the payoff could be huge. A universal map of the human immune system could enable doctors to diagnose patients earlier on, and more accurately—and without putting them through an endless array of tests. Access to such granular detail about a person’s immune system could also help predict how they might respond to new pathogens and treatments in the future.