Yale and Google Collaborate to Uncover New Cancer Treatment

A collaboration between researchers at the Yale School of Medicine and Google has led to a promising new treatment approach for cancer. An artificial intelligence model developed by the team identified the drug Silmitasertib as a candidate for helping the immune system locate and attack cancerous tumors. The discovery is notable because the hypothesis came from the AI’s analysis rather than from existing scientific literature.

Professor David van Dijk, who leads the research at Yale, expressed surprise at the AI’s suggestion, as no prior research had indicated that Silmitasertib could function in this way. The model, called Cell2Sentence, predicted that the drug could enhance antigen presentation, a crucial process that enables the immune system to recognize and fight cancer cells. To test this hypothesis, the lab ran experiments on human skin and pulmonary cells, and the results supported the AI’s prediction.

In October 2024, the findings were published in a preprint, the result of a collaboration with Google’s AI divisions, Google DeepMind and Google Research. Van Dijk remarked that this work marks a significant advance in using large language models (LLMs) to make biological predictions that can be experimentally verified.

The Technology Behind the Discovery

The research team at Yale has spent several years exploring a novel way to analyze human cells. By studying single-cell RNA sequencing data, which record gene expression within individual cells, they aimed to decode the complexities of human biology at a granular level. Graduate student Syed Rizvi, one of the lead authors of the paper, noted that understanding genetic patterns can significantly aid in distinguishing between healthy and malignant cells.

To tackle the challenges posed by the vast amounts of data in single-cell RNA sequences, the researchers initially turned to natural language processing techniques. By transforming the numerical data into sentence-like formats, they enabled the AI model to discern biological patterns effectively. For instance, a cell’s activity might be written as “Gene A Gene B Gene C” and so on, in a sequence that reflects the activity of around 18,000 genes.
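The transformation described above can be sketched in a few lines. This is a minimal illustration, not the team's actual pipeline: it assumes that a "cell sentence" lists gene names in order of decreasing expression, and the function name, gene names, and expression values are hypothetical.

```python
def cell_to_sentence(expression, top_k=None):
    """Turn a {gene: expression} mapping into a space-separated 'cell sentence'.

    Assumption for this sketch: genes are ordered from most to least expressed,
    and genes with zero expression are left out.
    """
    # Keep only genes that are actually expressed in this cell.
    expressed = [(gene, level) for gene, level in expression.items() if level > 0]
    # Sort by descending expression; break ties alphabetically for determinism.
    expressed.sort(key=lambda pair: (-pair[1], pair[0]))
    genes = [gene for gene, _ in expressed]
    # Optionally truncate to the top_k most expressed genes.
    if top_k is not None:
        genes = genes[:top_k]
    return " ".join(genes)


# Hypothetical expression values for a single cell (illustrative only).
cell = {"GENE_B": 7.5, "GENE_A": 12.0, "GENE_D": 0.0, "GENE_C": 3.0}
print(cell_to_sentence(cell))  # GENE_A GENE_B GENE_C
```

Once each cell is written as a line of text like this, it can be fed to a language model in the same way as an ordinary sentence.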

The early iterations of their model were built on GPT-2, an older AI model with approximately 774 million parameters. In contrast, newer models like GPT-4, released in March 2023, reportedly contain about 1.76 trillion parameters, offering more sophisticated capabilities. Despite the limitations of the earlier model, the team achieved significant milestones, including accurate identification of cell types based on their ‘cell sentences.’

Collaboration with Google Enhances Research

The partnership with Google began in 2024 at an AI workshop at Yale, where the research team met with Google DeepMind scientists. The collaboration gave the Yale researchers access to Google’s extensive computing resources, reportedly the largest of any organization worldwide. This allowed the team to move from their original GPT-2 model to Google’s advanced AI model, Gemma-2, expanding Cell2Sentence to 27 billion parameters and enhancing its analytical capabilities.

The collaboration focused on improving the AI’s performance in predicting how drugs affect human cells, a task referred to as “perturbation response prediction.” The researchers used the model to find drugs that could enhance immune signaling in the presence of disease-related proteins. The identification of Silmitasertib, which has been investigated as an inhibitor of cancer growth, was a key success of this research initiative.
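One way to picture such a screen is to compare a predicted immune-signaling score for each drug with and without the disease-related context, then rank the drugs by the difference. The sketch below assumes those predictions already exist; the function name, the scoring scheme, the drug names, and the numbers are all hypothetical placeholders, not the study's method or results.

```python
def rank_by_context_gain(predicted_scores):
    """Rank drugs by how much more they raise a score when the context is present.

    predicted_scores maps a drug name to a pair:
    (score_without_context, score_with_context).
    """
    gains = {
        drug: with_context - without_context
        for drug, (without_context, with_context) in predicted_scores.items()
    }
    # Largest context-specific gain first.
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Made-up placeholder scores; these are not results from the study.
toy_scores = {
    "drug_x": (0.10, 0.60),
    "drug_y": (0.40, 0.45),
    "drug_z": (0.20, 0.15),
}
ranking = rank_by_context_gain(toy_scores)
print([drug for drug, _ in ranking])  # ['drug_x', 'drug_y', 'drug_z']
```

Ranking by the difference rather than by the raw score favors drugs whose effect depends on the context, which matches the goal of finding drugs that act in the presence of disease-related proteins.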

The significance of this achievement was underscored by Google DeepMind scientist Shekoofeh Azizi, who remarked that the model’s ability to perform complex biological reasoning highlights the potential of large-scale AI systems in drug discovery.

Looking ahead, the researchers aim to use their findings to accelerate drug development. A report from 2024 estimated that the average cost to develop a new drug, including failures, is approximately $500 million. The hope is that AI can streamline the pre-clinical phase, guiding researchers toward the experiments with the highest likelihood of success.

Van Dijk emphasized the importance of continuing to scale AI models to maximize their potential. He envisions a future where AI could simulate the entire human body, enabling rapid testing of various drugs without the ethical concerns associated with human trials. “Imagine a virtual human that can simulate everything biologically about real humans,” he said, “and now you can test all these drugs much faster.”

The collaboration between Yale and Google marks a step forward in cancer treatment research and points to the broader potential of AI in the life sciences. The findings could pave the way for more efficient drug development, ultimately benefiting patients worldwide.