How do you use a virtual cell to do something useful?


Note: I haven’t posted here in a bit, but I have been writing! Over the last month, on my company’s blog (you should follow it if you’re interested in cancer!), I have been putting together a series of case studies on how one can use ‘virtual cells’ (specifically, the ones that we’ve built) for actual, grounded, real, genuine, and overwhelmingly important problems in cancer drug development. This was fun! But, more importantly, it seemed to be helpful for people who have no idea what the point of that research field is.

Here, I present those three articles, compiled into one, for your reading pleasure.

I won’t always cross-post articles, but I am doing it here purely because it’s a set of essays I’ve wanted to write for months and I think everything in it is really cool. If you think this is all a bit much to read, I’d recommend just reading the ‘virtual perturbations’ section; that’s my favorite. If you’d like to chat further about this work, feel free to reach out to me!

Finally: if you’ve already seen these, apologies for the inbox intrusion. Starting next week, Owl Posting will return to weekly essays.

  1. Introduction

  2. Identifying anti-PD-1 responders

  3. Refining clinical trial eligibility to the right subgroups

  4. Virtual perturbations that shift T cell effector state in humans

A lot of people have been very interested in ‘virtual cells’ lately. An exact definition is difficult to find, but one offered by a recent Cell perspective paper is the following:

Our view of [a virtual cell] is a learned simulator of cells and cellular systems under varying conditions and changing contexts, such as differentiation states, perturbations, disease states, stochastic fluctuations, and environmental conditions. In this context, a virtual cell should integrate broad knowledge across cell biology. [Virtual cells] must work across biological scales, over time, and across data modalities and should help reveal the programming language of cellular systems and provide an interface to use it for engineering purposes.

It’s an exciting idea! A computational simulation of a cell should be, theoretically, exceedingly useful for all sorts of clinical and preclinical research, by virtue of being able to eschew expensive wet-lab efforts in favor of cheaper (and potentially more reliable) GPU time. So it is no surprise that a great deal of research is already being actively done in this area. Elliot Hershberg, a venture capitalist at Amplify Partners, recently compiled a small summary of ongoing work in the field.

But as with every promised revolution in the life sciences, the revolution will hesitantly admit some nuances upon questioning.

Of highest concern is the fact that nearly all virtual cell efforts underway are not modeling human biology, but rather cancer cell lines, which—while convenient, well-characterized, and infinitely malleable—are far from the true physiological complexity of healthy or diseased human tissue. Because of this, figuring out how their insights extend into the drug development process is usually another hard problem in and of itself. But, to be clear, this doesn’t mean they aren’t useful: research on cancer cell lines is commonplace at the preclinical stage, which is what nearly all virtual cell models are currently geared towards assisting.

This partially answers why, despite how exciting ‘virtual cells’ seem, there are very few clear-cut examples of how such methods will ultimately be used. That vagueness is partly built into the reality of early-stage biology, so it’ll be years before the ultimate impact of this line of research is felt.

But one area of virtual cells that could have a concrete value-add in the immediate short-term is the deployment of them at the clinical stage of drug development. After all, this is where the real bottlenecks lie: trials are slow, expensive, and fraught with uncertainty, and even small improvements here can ripple into huge downstream gains. Of course, while the opportunity here is massive, the downside of touching this area is that it is hard to do. Very, very hard. As a result, there is almost no virtual cell effort meant to operate at the clinical stages of drug development, even though the translation problem there is, theoretically, ‘easy’.

Other than us. Noetik is building virtual cells with the explicit goal of assisting with clinical-stage problems: identifying responders to drugs and refining patient inclusion criteria for trials. At the same time, we believe that the tools we create in this process will also have powerful applications in pivotal, high-risk areas of preclinical research, such as target selection, while remaining grounded in human-level data. All three will be discussed in this essay series.

How do we do this? Our view is simple: the shortest path to usefulness is not maximal simulation of unrealistic biology, but grounded observation of realistic biology. We built that foundation first. Every datapoint that trains our virtual cell models comes from human tumor resections: 77M cells from ~2,500 patients across a dozen-plus cancers, with paired spatial transcriptomics, spatial proteomics, exomes, and H&Es from each one, all collected in our lab. In total, this is easily one of the largest datasets of its kind. And not a single cell line. We strongly believe this means the path from in-silico workflows to something clearly translatable is far more direct: human to human, rather than detouring through unrealistic animal or cell models.

That difference matters! In cancer, translation is the bottleneck. Drugs fail, not because they don’t work in preclinical settings, but because they don’t work in real human patients.

Using this human-derived tumor data, one of the virtual cell models we’ve created is ‘OCTO-VC’. This model is entirely trained on 1000-plex spatial transcriptomes, and its core task is deliberately prosaic: given the transcriptome of a few neighboring cells, reconstruct the “center cell” transcriptome—over every cell, in every tumor, for every patient. We released a (very long) post late last year discussing it in depth for those who are curious about the machine-learning details, alongside an online demo.
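To make that training objective concrete, here is a minimal sketch of the center-cell reconstruction task. A trivial mean-of-neighbors baseline stands in for the learned model, and the gene count, neighbor count, and data are all illustrative, not OCTO-VC’s actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

n_genes = 1000      # roughly matching the 1000-plex spatial transcriptome
n_neighbors = 6     # illustrative number of cells adjacent to the center cell

# Hypothetical expression counts: rows are neighboring cells, plus the held-out center cell.
neighbors = rng.poisson(2.0, size=(n_neighbors, n_genes)).astype(float)
center_true = rng.poisson(2.0, size=n_genes).astype(float)

def predict_center(neighbor_expr):
    """Trivial baseline for the center-cell task: average the neighbors.
    The learned model replaces this; the point is the contract:
    neighbor transcriptomes in, center-cell transcriptome out."""
    return neighbor_expr.mean(axis=0)

center_pred = predict_center(neighbors)
mse = float(np.mean((center_pred - center_true) ** 2))  # reconstruction loss
print(center_pred.shape, round(mse, 3))
```

The real model, of course, learns this mapping over every cell, in every tumor, for every patient; the baseline just shows what is being predicted from what.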

But what wasn’t discussed in that earlier post is how one can use models like this for clinically meaningful, non-trivial problems.

In this essay series, we hope to do exactly that, by showing three case studies of times where OCTO-VC was directly useful for our therapeutics team.

Therapeutic Context:

One of the most common (and effective) classes of cancer therapy is anti-PD-1 drugs. The underlying biology is straightforward: many tumor cells express PD-L1 on their surface, which binds PD-1 receptors on T cells to dampen T-cell activity. Anti-PD-1 (or anti-PD-L1) antibodies block this inhibitory interaction, allowing T cells to attack the tumor. But not all cancers rely on this pathway. Some tumors have little to no PD-L1 expression, meaning that drugs operating on that mechanism would, in principle, have limited effect. This has led to a common clinical rule of thumb: patients are considered potential candidates for anti-PD-1/PD-L1 therapy if ≥1% of their tumor cells express PD-L1, i.e. are PD-L1+.
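As a minimal illustration, that ≥1% rule of thumb reduces to a tumor proportion score threshold; the helper names below are hypothetical, not a standard API:

```python
def tumor_proportion_score(pd_l1_positive_cells, total_tumor_cells):
    """PD-L1 tumor proportion score (TPS): fraction of tumor cells staining PD-L1+."""
    return pd_l1_positive_cells / total_tumor_cells

def eligible_for_anti_pd1(pd_l1_positive_cells, total_tumor_cells, threshold=0.01):
    """The rule of thumb from the text: candidates have TPS >= 1%."""
    return tumor_proportion_score(pd_l1_positive_cells, total_tumor_cells) >= threshold

print(eligible_for_anti_pd1(25, 1000))  # 2.5% of tumor cells PD-L1+ -> True
print(eligible_for_anti_pd1(5, 1000))   # 0.5% of tumor cells PD-L1+ -> False
```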

But this still isn’t perfect. Even with this inclusion metric, roughly half of patients still do not respond to this therapy, even if they are PD-L1+, and it is unclear why.

Question:

Can OCTO-VC improve how well we can identify responders to anti-PD-1 drugs?

What we found:

Seeing how well OCTO-VC can help us here is quite straightforward: create a high-dimensional embedding of each of the tumor cores we have responder data for, and see if the embeddings of responders differ from those of non-responders. And, most importantly, is the clustering better than a good baseline, i.e. the usual patient inclusion criteria?

You may be instinctively surprised that OCTO-VC’s value here doesn’t come from the usual virtual cell trick of simulating perturbations, but instead from the far simpler act of representation. But this is, in fact, the most reliable way to use models like this; it allows our underlying, extremely rich data to ‘speak for itself’ without needing human intervention.

Using a small cohort of patients—15 responders and 24 non-responders, with both groups meeting the “ideal candidate” criterion above, or PD-L1 tumor proportion score ≥1%—we generated an embedding for each of their tumors using OCTO-VC. The below graph shows the embeddings for all our core samples, reduced to two dimensions via PCA. The ones we have PD-1 responder data for are colored in either green or magenta.

The responders seem to mostly be in the lower right quadrant, so there’s meaningful separation in the entirely unsupervised embedding. And, training a basic model on the PCA reduction allows us to quantify the signal, showing that predictions match up well with the response cluster, and that it is above chance. Here is the associated confusion matrix of the trained model:

Remember, we’re working off a pre-selected patient population here. If “1% of tumor cells expressing PD-L1” were a really good biomarker with no real room left to improve upon, we wouldn’t be able to subdivide the likely-to-respond patient population any further. The fact that we’re able to easily spot the “response cluster” in the embedding space is encouraging to us, and implies that OCTO-VC is capturing response-relevant biology that the 1% rule misses.
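For readers who want the shape of this analysis, here is a dependency-light sketch. Synthetic embeddings stand in for OCTO-VC’s real ones, PCA is done via SVD, and a deliberately simple nearest-centroid classifier stands in for whatever “basic model” one might train; treat every number and modeling choice as illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-core embeddings standing in for OCTO-VC's: 15 responders
# and 24 non-responders, as in the cohort above, in a 64-dim space.
responders = rng.normal(loc=1.0, size=(15, 64))
non_responders = rng.normal(loc=-1.0, size=(24, 64))
X = np.vstack([responders, non_responders])
y = np.array([1] * 15 + [0] * 24)  # 1 = responder

# PCA to two dimensions via SVD on the centered matrix.
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ vt[:2].T

# A deliberately basic classifier: nearest class centroid in PCA space.
centroids = {c: X2[y == c].mean(axis=0) for c in (0, 1)}
pred = np.array([min(centroids, key=lambda c: np.linalg.norm(p - centroids[c])) for p in X2])

# 2x2 confusion matrix: rows are true labels, columns are predictions.
cm = np.zeros((2, 2), dtype=int)
for t, p in zip(y, pred):
    cm[t, p] += 1
print(cm)
```

The point of the sketch is the order of operations (embed, reduce, classify, tabulate), not the particular classifier; any above-chance confusion matrix on held-out cores would carry the same message.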

Future Directions:

The cancer field has cycled through many definitions of the ‘most’ important factor to care about in a tumor. At first, it was histology: lung cancer could be separated into small-cell, squamous, non-small-cell, and so on. Then arrived the genetic era, when EGFR mutations or ALK fusions could, by themselves, dictate treatment. Now we are in the protein marker era, with PD-L1 expression being the most commonly deployed stratifier for checkpoint blockade.

None of these were wrong, but each was only capturing a small fragment of the whole.

Tumors are not uniform entities, but rather shifting ecosystems of cells, pathways, and immune structures. Understanding this complexity, to a large extent, may be beyond human intuition or comprehension. We at Noetik strongly believe that machine intelligence is the only way to grasp the tumor microenvironment in full.

The building of OCTO-VC, and the fact that anti-PD-1 responders are so clearly separated in its embedding space, implies to us that this conviction is directionally accurate. Also, since the underlying data is all sourced from human tumors, we can easily pin down what other biological features the predicted anti-PD-1 responders correlate with, both to reassure ourselves that they make sense and to confirm they can be converted to usable assays. And indeed, both are present: CD8 infiltration, high interferon gamma levels, and antigen presentation markers (to name a few) align with responder status.

But the real significance here is not in what we can do for anti-PD-1 therapies—many people have worked on this exact subject before—but rather, in how easily our methodology can be extended to any arbitrary cancer drug. In other words, if OCTO-VC can isolate a subset of checkpoint responders from within an already enriched PD-L1+ population, then it should also be able to refine other trial cohorts. Our first partnership is with Agenus, a public biotechnology company, to see if our model is capable of accurately distinguishing responders from non-responders in a recent clinical trial that Agenus ran. We’re looking forward to reporting our results!

Of course, one won’t always have response data. We think there are useful ways that OCTO-VC can be used in those situations as well, which is something we’ll discuss next, covering how our model can be used for expanding eligibility in clinical trial design even when lacking access to true response/non-response data.

Therapeutic Context:

In the last section, we talked about how tumor complexity is beyond any human to genuinely grasp, and how we arrived at an understanding of anti–PD-1 response through model embeddings. But when explicit response labels aren’t available, the challenge shifts. We cannot ask whether responders and non-responders separate. Instead, we must reason through proxies: machine-learned patterns that recur across tumors and closely correlate with the suspected mechanism of action (MoA) of the drug being studied.

Some background information: as of today, most patient inclusion criteria in cancer clinical trials are disturbingly coarse and overly reductionistic: the % of PD-L1 expression across a tumor biopsy (like the previous case study), whether the histology says a tumor is “triple-negative,” or whether sequencing shows the presence of a particular mutation. But even when the cancer field explores the value of more complex markers (which oncologists clearly recognize as important!), the published signals are nearly always fragmentary, built on a single local motif, and rarely grasp the full neighborhood or architectural context of a tumor microenvironment.

Why is this? Why aren’t markers more complex? Much of it comes down to the fact that the logistics of designing and validating even mildly complex assays are essentially intractable. Every hypothesis requires years of prospective planning, the right tissue samples, and the ability to multiplex the correct set of markers from the start. If an important signal is missed, the entire study has to be restarted. This is to say nothing of biomarkers that are ML-enabled, operating across dozens or hundreds of axes at once, which is virtually never explored.

As a consequence, trial sponsors are forced into the simplest, most reductive criteria, not because they believe those are the best biology, but because they are the only practical levers available within trial timelines.

Question:

One of the things we’ve been most excited about is using OCTO-VC to take previously impractical hypotheses for drug response prediction, and test them out at scale.

The question here requires some extra context and, because we’re actively exploring it, some obfuscation.

Last year, the FDA halted a late-stage cancer clinical trial run by a large biopharma, not because efficacy wasn’t observed, but because, midway through the trial, efficacy was observed only in ‘Subgroup Z’. As a result, this forced the biopharma to submit a protocol amendment to restrict follow-up trials to only be on the Subgroup Z cohort. This is quite a blow to them, since that cohort is a fair bit smaller!

But as is typical in cancer trials, patients wind up in Subgroup Z due to an extremely coarse biomarker. Theoretically, the drug in this trial acts on a well-trodden target, so there should be a much better way to separate out the ideal patient population. Unfortunately, as we mentioned, doing any sort of large-scale biomarker study would normally require an enormous multi-year program—prospectively designing assays, collecting new tissue samples, and validating them across multiple sites. That’s the standard, slow way.

With OCTO-VC, we can invert the order of operations. Instead of starting with a hypothesis, locking in the markers, and then waiting years for data to trickle back, we start with the existing atlas of human tumors and ideate on new ways to separate out responder/non-responder patients.

So, our question is: can OCTO-VC come up with better stratification criteria for selecting responders?

What we found:

First, a basic sanity check was met: the Subgroup Z cohort is quite distinct in our embedding space; in the graph below, it is the small, bright yellow-green segment on the left.

And we knew that that yellow-green segment was filled with responders to the drug. So an easy question to ask OCTO-VC is: what other, more complex marker overlaps with that segment and has a mechanistic rationale for the overlap? After some iterative searching, our therapeutics team found a strong signal: a particular ‘tumor microenvironment concept’ that seems highly enriched in Subgroup Z, but also extends outside of it. While we won’t expand on what the concept is, we believe that it is unlikely to be noise given how biologically relevant it is to the therapy in question.

Here, that concept is shown in the same embedding plot through color; high meaning ‘highly enriched for that concept’:

Circled is the true-response cohort, which is high in that concept. But notice another large pocket of patients high in that concept, slicing through the region outside of Subgroup Z.
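The expansion logic described above can be sketched with toy numbers: score each patient on the concept, confirm the concept recapitulates Subgroup Z, then flag high-concept patients the coarse criterion excludes. All patient records and the threshold below are invented for illustration:

```python
# Hypothetical per-patient records: a 'concept' score derived from the embedding,
# and whether the coarse biomarker placed the patient in Subgroup Z.
patients = [
    {"id": "p1", "concept": 0.91, "subgroup_z": True},
    {"id": "p2", "concept": 0.88, "subgroup_z": True},
    {"id": "p3", "concept": 0.84, "subgroup_z": False},  # high concept, excluded by the trial
    {"id": "p4", "concept": 0.12, "subgroup_z": False},
    {"id": "p5", "concept": 0.79, "subgroup_z": False},  # high concept, excluded by the trial
]

THRESHOLD = 0.75  # illustrative cutoff for 'highly enriched for the concept'

in_z = [p for p in patients if p["subgroup_z"]]
missed = [p for p in patients if not p["subgroup_z"] and p["concept"] >= THRESHOLD]

# Sanity check: the concept should recapitulate Subgroup Z...
assert all(p["concept"] >= THRESHOLD for p in in_z)
# ...while also flagging eligible-looking patients the coarse criterion excludes.
print([p["id"] for p in missed])  # -> ['p3', 'p5']
```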

In other words, we believe this biopharma has set overly conservative inclusion criteria. By doing so, they not only leave billions of dollars in potential revenue untapped but, more importantly, will leave an immense number of patients without access to a therapy that has a clear mechanistic reason for meaningfully improving, or even saving, their lives.

Future Directions:

One striking aspect of OCTO-VC’s embedding space—something that continues to surprise even us—is how clearly it aligns with therapeutic problems, despite having had absolutely no access to perturbational or labeled data. After all, OCTO-VC could separate higher-order cancer definitions (e.g., Subgroup Z vs. not-Subgroup-Z) directly from human tissue, and, with some human judgement, was able to surface MoA-relevant subtypes within them; ones that would be far too costly to ever pull out in the real world. And this phenomenon of ‘clinically meaningful organization’ seems to recur across the embedding space!

As an example, basic Leiden clustering of an OCTO-VC embedding space (the same one discussed in this section, but different from the previous section’s PD-1 embedding space) demonstrates tissue-level characteristics that align with therapeutic MoAs. Annotations here are provided by humans:
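In practice, Leiden clustering runs on a k-nearest-neighbor graph of the embeddings (e.g. via scanpy’s `sc.tl.leiden`). As a dependency-free stand-in, the sketch below uses plain k-means on synthetic embeddings, purely to show the cluster-then-annotate workflow; it is not the graph-based Leiden algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 2D embeddings drawn from three well-separated groups.
X = np.vstack([rng.normal(center, 0.3, size=(50, 2)) for center in ((0, 0), (5, 0), (0, 5))])

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means, standing in for Leiden (which clusters a k-NN graph of
    the embedding space) to show the cluster-then-annotate step."""
    r = np.random.default_rng(seed)
    centers = points[r.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels

labels = kmeans(X, k=3)
# Each resulting cluster is then annotated by humans with tissue-level
# characteristics, which is the MoA-alignment step described above.
print(np.bincount(labels))
```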

How is this possible? How can the model, without explicit supervision, uncover patterns that map so directly onto biological mechanisms and therapeutic relevance?

One argument is that cancer is a particularly special disease, extremely well-suited to self-supervision tasks. Unlike many other therapeutic areas, oncology has historically been driven by mechanism-based stratification; cancer drugs are often developed and approved not for a broad, undifferentiated population, but for genetically or phenotypically defined subgroups. As a result, the very axes that determine drug response are the same ones that structure our human, tissue-level data. And machine learning is very, very good at dissolving complex, high-dimensional data into those underlying axes.

Of course, turning this work, and others like it, from an analysis into an actual regulatory argument is often another challenge in and of itself for many virtual cell efforts. A model can suggest that patients with a certain microenvironmental signature are likely to respond, but to satisfy regulators, that suggestion has to be translated into a practical assay. But this is the core of what makes “virtual cells” particularly useful when they are derived from human data rather than cancer cell lines: this translation is straightforward.

After all, the signatures that OCTO-VC surfaces always have a direct connection to real human tumors. The signatures are often intricate, something that would require years of effort and millions of dollars to define through traditional approaches, but still can be boiled down to a set of measurable markers, morphologies, or local interactions if needed. As an example, the tumor microenvironment concept we discussed above is something that is very amenable to being turned into an assay.

We strongly believe that this ability to create complex definitions of responder cohorts—given only hours of GPU time—can not only expand patient cohorts (as we’ve discussed here), but also rescue otherwise unpromising drugs and open entirely new therapeutic opportunities that were previously invisible under traditional stratification methods.

Finally, in the next section, we’ll discuss how even though OCTO-VC is most useful for clinical-stage problems, like patient selection, the same human-tissue-grounded programs also help prioritize targets, without changing the core principle: grounding every insight in real human tissue.

Therapeutic Context:

Two particularly common lung cancer mutations you’ll often see discussed are in KRAS and STK11. KRAS is one of the most frequent oncogenic drivers (i.e. it causes the cancer in the first place), whereas STK11 is a tumor suppressor gene whose inactivation disrupts cellular metabolism and immune signaling. And, while KRAS-mutant tumors are quite common, STK11 shows up alone less frequently, more often appearing alongside KRAS.

Tumors with this genetic combination are often referred to, unsurprisingly, as ‘KRAS STK11’. And, when the two mutations do appear together, the combination produces a particularly aggressive biology: tumors that are metabolically rewired, immunologically “cold,” and broadly resistant to both standard chemotherapies and immune checkpoint blockade. As expected, the clinical data consistently show the impact of this on patients: significantly shorter lifespans.

As of today, there are no approved therapies that directly address the KRAS STK11 genotype. Patients are typically treated with the same immunotherapy regimens offered to the broader non-small cell lung cancer population: immune checkpoint blockades. While this often works fine in KRAS patients, the efficacy of this class of drugs is far worse for the KRAS STK11 patients. And, given that the latter group isn’t particularly rare, millions of patients are likely underserved.

Question:

Which therapeutic targets, if drugged, would help cancer patients with KRAS STK11 mutations?

What we found:

Well, perhaps we should first ask a simpler question: what exactly is the fundamental difference between KRAS and KRAS STK11 patients in cell-types most relevant to immunotherapy? KRAS patients, after all, respond well to immunotherapy, so they could be considered a model population for understanding what “good” looks like in terms of immune biology. Afterwards, we can move onto assessing what targets are most relevant to shifting KRAS STK11 tumors to have that particular phenotype.

For both of these, we leaned heavily on OCTO-VC’s ability to simulate cellular states.

First, to assess differences between the two population genotypes, we set up a ‘virtual CD8⁺ T cell simulation’. Here, we asked OCTO-VC to predict the “expected”, or virtual, CD8⁺ T cell in the genetic and microenvironmental context of each patient's tumor. And what we found is that one of the strongest differences in gene expression between KRAS and KRAS STK11 patients was a class of genes called granzymes, specifically GZMA and GZMK, which are known to be a practical readout of CD8⁺ T-cell effector function: the capacity of a T cell to kill cancer via cytotoxic mechanisms.

In the below plot, Gene A is GZMA and Gene B is GZMK. We’ll discuss in the next section why we believe these virtual cell predictions are a much better way to assess patient-level differences compared to the raw transcript values, but for now, we’ll move on.
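A sketch of this comparison step, with synthetic virtual-cell predictions standing in for OCTO-VC’s per-tumor outputs (the gene indices, cohort sizes, and effect sizes are all invented): rank genes by the difference in mean predicted expression between the two genotype cohorts.

```python
import numpy as np

rng = np.random.default_rng(3)

n_genes = 200
genes = [f"gene_{i}" for i in range(n_genes)]
GZMA, GZMK = 10, 11  # invented indices standing in for the two granzymes

# Synthetic virtual-CD8 expression vectors, one per patient, standing in
# for OCTO-VC's predictions in each tumor's context.
kras = rng.normal(0.0, 1.0, size=(40, n_genes))
kras[:, [GZMA, GZMK]] += 3.0                    # effector-high in KRAS-only tumors
kras_stk11 = rng.normal(0.0, 1.0, size=(30, n_genes))

# Rank genes by the difference in mean predicted expression between genotypes.
delta = kras.mean(axis=0) - kras_stk11.mean(axis=0)
top = np.argsort(delta)[::-1][:5]               # largest KRAS-minus-KRAS-STK11 gaps
print([genes[i] for i in top])
```

In real analyses one would also want an effect-size or significance measure rather than a raw mean difference, but the ranking step is the heart of it.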

Step one completed, we’ve identified a therapeutically relevant difference between the two genotypes. Importantly, the marker does meet some sanity checks too. Granzyme expression has shown strong associations with response to PD-1/PD-L1 therapy in human tissue, clearly indicating that it is clinically meaningful for immunotherapy. So, one particular axis of improving the prospects of KRAS STK11 patients could be to simply find some way to increase granzyme levels.

But understanding the best ways to do this has been far from straightforward. Cytokine stimulation or blocking checkpoint molecules like TIGIT have all been shown in preclinical animal models to boost granzyme expression. Yet the current translational record is mixed: interventions that should theoretically raise granzyme levels often fail to yield durable tumor clearance in human clinical trials.

What’s going on here? Are granzymes the wrong lever to pull?

Perhaps, but there’s some reason to believe that some of the previous attempts to increase granzymes (in humans) did not, in fact, actually increase granzymes. After all, the molecular impact of at least one of those attempts seems to rely on entirely different mechanisms of action, ones that, empirically, ended up having no real patient benefit. The fundamental problem here may not be that granzymes aren’t worth modulating in humans, but rather, the targets that modulate them depend on the species. In other words, if you study mice only, you’re going to arrive at the wrong target.

After all, the structures of granzymes substantially differ between humans and mice. Broader than this is the fact that immunity is a very, very species-specific topic. Consider inflammation, a close relative of our subject, and what a 2013 PNAS paper has to say about the role of mouse studies here:

Murine models have been extensively used in recent decades to identify and test drug candidates for subsequent human trials. However, few of these human trials have shown success. The success rate is even worse for those trials in the field of inflammation, a condition present in many human diseases. To date, there have been nearly 150 clinical trials testing candidate agents intended to block the inflammatory response in critically ill patients, and every one of these trials failed.

All this to say: if we want to modulate granzymes, and come to useful conclusions about how to do so, we should work directly with human data. One way to do this (perhaps the only way to do it!) is to rely on OCTO-VC’s ability to perform virtual perturbations in real human data.

With the same computational framework as before—asking the model to predict virtual CD8⁺ T cells in a specific tumor microenvironment—we added one more step: knocking out a single gene across the tumor. From there, we ask how the virtual CD8⁺ T cell’s transcriptome would shift in response to that, comparing it to the baseline expected transcriptome of that cell type. We can do this systematically across thousands of genes to run a virtual screen. The knockout serves as a proxy for a drug, and the predicted impact on the virtual CD8⁺ T cell serves as a proxy for patient response. Of course, this impact is not at all guaranteed to be causal, merely strongly correlated and conditional on the spatial environment, but it can lead to useful hints.
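The screen’s loop can be sketched as follows. A fixed random linear map stands in for OCTO-VC’s actual predictor, and names like `predict_virtual_cd8` are hypothetical; the structure (knock out each gene in the context, re-predict, score by the shift in the readout gene) is what matters:

```python
import numpy as np

rng = np.random.default_rng(4)

n_genes = 500
GZMA = 42  # invented index of the readout gene in this toy setup

# Hypothetical stand-in for OCTO-VC: a fixed random linear map from the tumor
# 'context' vector to a predicted virtual CD8+ T cell transcriptome.
W = rng.normal(0.0, 0.1, size=(n_genes, n_genes))

def predict_virtual_cd8(context):
    return W @ context

context = rng.uniform(0.5, 1.5, size=n_genes)  # baseline tumor context
baseline = predict_virtual_cd8(context)

# Virtual screen: knock out one gene at a time, re-predict the virtual cell,
# and score each knockout by the shift in predicted GZMA expression.
scores = {}
for g in range(n_genes):
    knocked = context.copy()
    knocked[g] = 0.0                           # in-silico knockout as a drug proxy
    scores[g] = float(predict_virtual_cd8(knocked)[GZMA] - baseline[GZMA])

top_hits = sorted(scores, key=scores.get, reverse=True)[:5]
print(top_hits)  # candidate 'Target A'-style hits in this toy setup
```

As the text notes, these scores are correlational and conditional on the spatial environment, so a real screen would treat the ranking as a hypothesis generator, not a causal claim.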

We did exactly this perturbation across our KRAS STK11 patient cohort, searching for targets that consistently increased one of the granzymes, GZMA, expression in CD8⁺ T cells in real tumors. The virtual screen produced a clear signal: the top-scoring hit (Gene 20) was an adhesion protein, which we’ll call Target A.

Target A is particularly intriguing because a study published only a few years ago showed that inhibiting this target (in human tumors co-cultured with T cells) leads to increased T cell expression of a granzyme. One nuance is that that paper studied GZMB, not GZMA, but the two can be quite correlated. But most compelling of all, beyond in-vitro results, is that there are two human cancer trials that have tested drugs meant to inhibit Target A!

How have these trials gone? It’s a mixed bag: patients responded decently in one trial, but not in the other. But both of them are using the exact same inclusion criteria: elevated levels of Target A. We strongly believe that this may have hurt both of the trial readouts.

Remember, inhibiting Target A in KRAS tumors is unlikely to yield immense benefits, since we suspect the primary mechanism of action of Target A is in increasing granzyme activity, and those tumors already harbor abundant granzyme activity. In contrast, KRAS STK11 tumors, which have depressed granzyme levels, stand to gain the most from Target A inhibition. So, by enrolling patients purely on the basis of ‘high Target A expression’, the trials were almost certainly accidentally enriched for KRAS patients—by virtue of KRAS being found in 30% of all cancers, while KRAS STK11 is found in 10% of all cancers—inadvertently selecting the patient demographic least likely to respond to the drug.

Both of the trials, in other words, potentially stacked the deck against themselves. The correct strategy would have been to include KRAS STK11 status in the inclusion criteria, thereby focusing on patients with the greatest mechanistic rationale for benefit. But the trials did not do this and, as a result, the final efficacy readouts of the drug may have been worse than they could have been.

Future Directions:

In one fell swoop, our virtual cell model uncovered not only a therapeutically relevant target, but also inclusion criteria for which patients it is most relevant. Though there are already ongoing trials for this particular target, we strongly believe that the correct inclusion criteria for it are not being used.

Is there a principled way this could’ve been done without OCTO-VC?

For finding the granzyme difference between the two genotypes, theoretically yes, but practically no.

For finding target A, neither practically nor theoretically.

One, on the granzyme difference: though granzymes are known to be markers of CD8⁺ T cell effector function, their modulation in genetic subcontexts like KRAS STK11 has not, as far as we can tell, been systematically mapped. But even if you had collected the same spatial transcriptomics dataset we had, discovering this relationship without OCTO-VC would’ve been challenging. Why? Because raw transcript values are, generally speaking, untrustworthy. To assess the transcriptional differences between two cohorts based on raw genes, you would need CD8⁺ T cells to actually be present in sufficient numbers within the tumor microenvironment, and to ensure that those cells were captured with sufficiently high resolution.

This is rarely the case! In most of our samples, even correctly tagging a cell as being a CD8⁺ T cell is difficult, to say nothing of their transcripts, which are often sparse, heterogeneous, and noisy, making it difficult to detect consistent patterns. Virtual cells, produced by OCTO-VC, solve this bottleneck by being able to reconstruct what a CD8⁺ T cell state would look like in that genetic and microenvironmental context; conditioned on the spatial transcriptomic environments the model has observed across millions of cells.

And two, on finding target A: even if you could extract a clean signal from the raw data, discovering targets that modulate the granzyme phenotype further would be largely intractable. The typical way people would study this further is via animal studies, and, as we’ve mentioned, there is a massive gulf between what mice immune systems tell you and what human immune systems tell you. The only way to reliably explore the area is via screening targets in a human, in vivo context, which necessitates the usage of virtual cell models like OCTO-VC to do it at any reasonable scale.

And though Target A was discovered without OCTO-VC, its discovery relied on cell culture data. The results of this coincidentally translated to humans, but, given how often cancer drugs fail, it’s a very expensive coin flip to make and not something we consider particularly principled.

These results are, to put it lightly, exciting. The history of cancer drug development has shown us time and time again that translation is the bottleneck. The problem has always been that what works in a mouse, or in a dish, rarely works in a patient. That’s it. Fixing this is how we make a dent in stopping the millions of lives that are lost to cancer every year. And we fix it not by predicting the results of a functional assay, or cell-line experiment, or mouse experiment, but by trying to predict what happens when a human being with a real tumor gets treated. That’s the only question that matters. Everything else is a proxy, a bad proxy, one that has led to 90%+ of all cancer drugs failing during clinical trials.

We are not the first ones to claim that predictions like that are possible, but we believe that we are one of the first to show concrete evidence of it actually being done. And remember, the results we have today are the worst ones we’ll ever have. Each day, the practical utility of the model that fueled these results gets better and better, both as its underlying dataset grows richer and our understanding of how to best deploy it is refined.

The trajectory to us feels obvious; in time, models like OCTO-VC will become routine parts of how oncology as a field functions. In such a world, patients don’t waste precious time on ineffective treatments, entirely new targets that once seemed unworkable become viable options, and trial populations are enriched for the responders who stand to benefit the most. We have strong conviction that not only is this world possible, but that it is already beginning to emerge.

If any of this seems interesting, please reach out to [email protected] or me directly to chat further.

And, if you’re curious to understand more ML-specific technical details about how the virtual CD8⁺ T cells actually work, we have an older post that discusses exactly that.

Thank you for reading!
