Could you start by summarizing your career path and what led you to found Vivodyne?
My background is in the overlap of bioengineering and robotics. The biggest driver for founding Vivodyne really emerged from my graduate work, where I started realizing how much of physiology, biology, and the underlying causes of disease remains unknown and difficult to probe.
There’s just so much that is unknown about how biology works. You have all these questions - why do certain things develop in disease? Why do drugs work in some people and not in others? - and the answers just aren’t there. It’s a really frustrating experience when you’re sitting there thinking, “Someone must know this,” and they don’t.
Meanwhile, the need for those answers is extremely pressing in pharma. So, for me, the drive became: how can we create a fundamentally data-tailored approach to getting human data before clinical trials?
We want data at a scale that is much greater than clinical trials can ever give us. That data must be human, robust, and deeply complex. So, the question became: what do we actually have to build to get there?
That line of thinking - how do we answer those questions, instead of just being frustrated by them? - is what spawned Vivodyne.
What technical hurdles are you trying to overcome when you’re solving those problems but also trying to scale?
I would say the most fundamental hurdle is that, in this domain of gathering complex data, there’s seemingly always a trade-off.
The common belief is that you either have throughput or you have complexity. You either have reproducibility or you have realism. You either choose biological depth or breadth. It’s like there are all these sliders, and the assumption is you, almost by definition, have to compromise.
But there’s no law of physics that says that has to be true, right? I think this is the perfect example where you look at it and say, “This is just an engineering problem. If we think about this hard enough, we can come up with a solution.”
So our no-compromises requirements list looks something like this:
First, we need to grow human tissues - with hundreds of thousands of cells each - to really capture the complexity of our biology. We need to have all of the different cell types that are actively involved in the functions of those tissues in the human body. Those tissues need to be lifelike, perfusable, and immunocompetent. They need to have all the features of human tissue.
Secondly, we need to grow these tissues in a very reliable and robust way. If you don’t have reproducibility, the data is basically useless.
Thirdly, we need scale. It can’t be an ‘arts and crafts’ project at the lab bench. I say that because I’ve literally been at the bench thinking, “I’m doing adult ‘arts and crafts’, right now.” That doesn’t scale.
And fourthly, we need to be able to instrument and pull data from these tissues at a depth that actually matches their biological complexity. If you don’t have that, then having all this complexity is kind of pointless, because you can’t mine it.
So, basically, we need to grow human organs at massive scale; they need to be very reproducible; and the whole thing has to be scalable in a way that doesn’t just mean “hire more wet lab scientists and have them pipette.”
We realized that the most robust way to do that is to actually use the cells themselves to self-assemble into those tissues. With bioengineering, you can imagine an empty scaffold and say: “What if I just had hundreds of thousands of little robots that I could put into that structure, and they, somehow, would already know the instructions for how to make the tissue I want? And they know all the complexity of things that we don’t even know yet.”
Cells are those ‘robots.’ They already know how to keep us alive and functional. So, the question at Vivodyne became: how can we induce self-assembly of macro-sized tissues - large, biopsy-sized tissues - in a robust way?
We created an approach that merges some of the self-assembly thinking from organoids, with the large-tissue patterning from organs-on-chips, but on an even larger physical scale. It’s an approach that works across most organs and yields tissues with blood vessel networks that are perfusable and that have actual parenchymal organ function, stromal cells, and immune cells self-assembled into this realistic form in a very reliable way.
We then scale this up through microfluidics, and, around that, we build a fully automated robotics platform. We don’t want scalability to be limited by scientists coming in and pipetting. We grow the tissues to maturity, treat and perturb them, do functional genomics - knockouts, gene activation - in them, and dose them with combinations of different drugs under complex, multi-stage regimens, before pulling all that data out, all robotically.
It has to be completely automated, end-to-end. With studies that are testing on hundreds of thousands of individual tissues, there’s no way even a big team of people could manually keep track of all of that. So, to scale to that degree, it has to be fully automation- and robotics-driven.
We wrap this tissue platform in an automation hardware layer, but the robot doesn’t just move on its own. It needs something that tells it what to do and schedules the millions of actions that it has to perform to grow, then dose, then read out data from those tissues autonomously for experiments that have to run unattended for weeks.
That’s where our software layer comes in. It translates unambiguous experimental intent - what we want to test in a study - into this very long, sequential schedule of actions that the robots have to perform. That has to be generated automatically; otherwise, you’d end up spending more time programming than you would just doing it by hand.
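As a loose illustration of that translation step - purely a hypothetical sketch with invented names, not Vivodyne’s software - you can picture a declarative experiment spec being compiled into a time-ordered list of robot actions, something like this in Python:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch only: expand a declarative experiment spec into a
# time-ordered schedule of robot actions. All names and fields are illustrative.

@dataclass
class DoseStep:
    compound: str
    concentration_um: float
    day: int                      # study day on which this dose is applied

@dataclass
class ExperimentSpec:
    tissue_type: str
    n_replicates: int
    grow_days: int                # days of growth to maturity before dosing
    dose_steps: List[DoseStep]
    readouts: List[str]           # e.g. ["imaging", "single_cell_seq", "proteomics"]

@dataclass
class Action:
    day: int
    replicate: int
    operation: str

def compile_schedule(spec: ExperimentSpec) -> List[Action]:
    """Translate experimental intent into a sequential schedule of actions."""
    actions: List[Action] = []
    last_dose_day = max((s.day for s in spec.dose_steps), default=0)
    for rep in range(spec.n_replicates):
        # Routine perfusion while the tissue grows and matures.
        for day in range(spec.grow_days):
            actions.append(Action(day, rep, "perfuse_media"))
        # Multi-stage dosing regimen.
        for step in spec.dose_steps:
            actions.append(Action(spec.grow_days + step.day, rep,
                                  f"dose:{step.compound}@{step.concentration_um}uM"))
        # Endpoint readouts after the last dose.
        for readout in spec.readouts:
            actions.append(Action(spec.grow_days + last_dose_day + 1, rep,
                                  f"readout:{readout}"))
    # Order everything globally so the hardware executes it sequentially in time.
    return sorted(actions, key=lambda a: (a.day, a.replicate))
```

The real scheduling problem is far richer than this - millions of actions, resource contention, weeks-long unattended runs - but the shape described in the interview is the same: intent in, ordered actions out.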
And then, wrapped around all of that is an AI-in-the-loop experiment design and data analysis layer, which feeds in the fundamental input for what we should test next. It proposes the most informative experimental designs - the ones that yield the greatest marginal benefit, that most unmask any “fog of war” from the data.
So, effectively, this gives us a system in which AI picks the next experiments; automatic scheduling is done by software; automatic execution is done by hardware; human biology sits there and receives the treatment; and, then, data comes back out across 3D single-cell phenomics on whole tissues, single-cell sequencing, and deep proteomics.
Back into that outer AI layer, we integrate all the data we’ve gathered from these experiments. So, it’s this onion: biology; automation hardware; software; AI; and then the human expert layer around it that guides, “We should look at this therapeutic area or that one.”
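To make the “most informative experiment” idea concrete, here is a minimal active-learning sketch in Python - one common way such a selection step can be framed, not necessarily the approach Vivodyne uses, and with entirely hypothetical names and toy models:

```python
import numpy as np

# Hypothetical sketch: choose the next batch of experiments as the candidate
# perturbations where an ensemble of predictive models disagrees the most - a
# standard active-learning heuristic standing in for "greatest marginal benefit".

def select_next_experiments(candidate_features: np.ndarray,
                            models: list,
                            batch_size: int) -> np.ndarray:
    """Return indices of candidates with the highest predictive disagreement."""
    # Each model predicts an outcome for every candidate perturbation.
    predictions = np.stack([m.predict(candidate_features) for m in models])
    # Ensemble spread approximates how much new information each experiment adds.
    disagreement = predictions.std(axis=0)
    return np.argsort(disagreement)[::-1][:batch_size]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    candidates = rng.normal(size=(500, 16))   # features of possible perturbations

    class ToyModel:
        """Stand-in for a trained predictor; anything with .predict() works."""
        def __init__(self, seed: int):
            self._rng = np.random.default_rng(seed)
        def predict(self, x: np.ndarray) -> np.ndarray:
            return x.sum(axis=1) + self._rng.normal(scale=0.5, size=len(x))

    picks = select_next_experiments(candidates, [ToyModel(s) for s in range(5)], 10)
    print("next experiments to run:", picks)
```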
Talking about biotech software, aren’t you going to have to, effectively, clone Benchling to make this work?
So that’s such an interesting thing, because there’s this perfect tradeoff that almost feels like cheating to exploit.
Before Vivodyne, I had spent so much time in front of a microscope that I got to know the software that we were using inside and out - every dial, every menu, everything. I kept wondering why this or that bit of functionality that I needed wasn’t there.
The reason is that the software has to cover all use cases. People imaging zebrafish, C. elegans; people studying cells in a dish at high throughput; people doing very low-throughput large-volume imaging… They have to cover the entire cross-section of users.
It becomes really hard to write software for that very generalized case, because you fall into a mindset of either doing everything to an okay level, or drowning in edge cases.
I think Benchling is a great company, by the way, with great software. But they have the same problem: there are so many different uses of that software, and Benchling has to be good at all of them.
We have a much simpler problem. Each one of our machines is handling human tissues that are approximately the same size, and are subject to a very similar experimental pipeline. We can optimize for a lot of shared things.
Our data model, which is quite expansive, still doesn’t have to cover the long tail of ‘who-knows-what’ applications and edge cases. We don’t need a placeholder for “how fast is my zebrafish fluttering its fins,” right?
All of the data structure design - how we store information, how we access it, how we map it to specific replicates, groups, sections, and studies - can be built much faster, and made much more performant for our specific use case, given the tight focus of our pipeline.
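As a rough illustration of what such a narrowly scoped data model might look like - hypothetical field names, not Vivodyne’s actual schema - every record can hang off one fixed study, group, replicate, measurement hierarchy:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a tightly scoped data model: because every tissue moves
# through a very similar pipeline, everything maps onto one fixed hierarchy.
# Field names are illustrative only.

@dataclass
class Measurement:
    assay: str                          # e.g. "3d_phenomics", "scRNAseq", "proteomics"
    values: Dict[str, float]            # assay-specific features

@dataclass
class Replicate:
    tissue_id: str
    donor_id: str
    measurements: List[Measurement] = field(default_factory=list)

@dataclass
class Group:
    treatment: str                      # drug, dose, or genetic perturbation
    replicates: List[Replicate] = field(default_factory=list)

@dataclass
class Study:
    study_id: str
    tissue_type: str
    groups: List[Group] = field(default_factory=list)

    def all_measurements(self, assay: str) -> List[Measurement]:
        """Pull every measurement of one assay type across the whole study."""
        return [m for g in self.groups for r in g.replicates
                for m in r.measurements if m.assay == assay]
```

There is no slot for how fast a zebrafish flutters its fins, and that is exactly the point.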
It’s like some of the modern gradient descent solvers: they have to be extremely efficient, but because their scope is narrow, you can tractably optimize every operation without boiling the ocean and making the entire operating system around them super-efficient. You just have to make one specific function very fast. And if you do that, you don’t really have edge cases - and if you do, someone else can write that part.
It’s the same thing for our software: we can build an experimental data model that is very expansive and deep within our use case, because we have the accelerant of specificity and focus.
You’ve talked about hardware. Building physical systems is very expensive. What do you think about that side of the problem?
Yeah, hardware is definitely where a lot of cost goes. We do a lot of sequencing and proteomics, too, but a huge portion of our spend is on our automation systems.
We have to deploy experiments at scale and that’s what our automation systems are built around. In each one, we grow tens of thousands of tissues at a time, all independently.
It’s a factory for human data, a human datacenter, and that’s where all of our robotics and hardware costs go - into building something essential to solving the challenge of producing human data before clinical trials.
I imagine the diseases you’re focusing on are less the multi-system ones, like autoimmunity or CNS disorders. How are you thinking about those, and what are you specializing in?
The connection between multiple tissue types is actually not the hard part. At the size of these tissues, we can capture a lot: tumor growth, fibrotic conditions, inflammatory cascades, vasculitis risk across different organs - a lot of the things where the interesting part of the disease lives at the tissue scale.
The really hard cases, for this approach, are when you need the loads or mechanics of a full human body. Think broken bones and back pain.
Joint deterioration is a good example. When someone breaks a femur and starts walking again, their gait is a little distorted. The pressure on one hip goes up, you get uneven wear on that joint, and you start to develop arthritis and knee pain. To model how that arthritis develops at the tissue level, you need the loading, the gait, and all of those mechanics. That’s at a physical scale our tissues don’t encompass.
Another example is heart valve regurgitation. Our tissues aren’t large enough to have a full heart valve flapping inside them.
But if you’re looking at steatotic liver disease, or fibrosis in the lungs or gut, or the growth of solid tumors and their interaction with the immune environment, or delivery of CAR-T cells to those tumors - all of these things happen in the inch-sized domain that our human tissues capture.
My rule of thumb is that, if you have to look through a microscope to understand what’s going on in the disease, it’s right up our alley. If you have to step on a scale, or a doctor has to reach for a stethoscope, that’s where the line falls.
The driving factor in our breadth is really knowing how to handle a lot of engineering complexity at the very beginning of new R&D, so that we don’t have to face the incremental burden of constantly extending and modifying a system for every new application. That would be going back to arts and crafts.
If we take primary cells from patients at different stages of disease, the tissues we grow will natively exhibit the same physical differences; that comes along with the epigenome of those cells. Modeling a new disease, so that it manifests in our tissues, just requires samples from patients who have that disease, and then using those patient-derived cells to reassemble the same diseased tissues.
The breadth of what we can cover is very wide as a product of the fundamental approach itself.
Could you give us an example of the kind of client you might work with; how would they use your platform; and what the output would be?
We work in program-level partnerships with pharma, and a lot of those start with pretty fundamental questions.
They might ask: What drives the progression of this disease in human patients? What are the underlying causes and drivers of disease severity on a biochemical level? What are the signaling pathways within cells, and between cells, that are actually driving the disease?
Most of the time, the development of disease doesn’t come from pathways in a single type of cell; it happens in the nested communication between different cell types. You have these signaling loops that encompass many different cell types, and many such loops feeding back into the progression of severity, either by activating the disease or suppressing the systems that should keep it at bay.
The questions usually start at the target discovery and validation stage: what are the druggable targets that are true disease drivers, and which ones will have disease-modifying effect if we target them?
The way we begin is by doing functional genomics - CRISPR modifications to these tissues at massive scale. You can imagine hundreds of different experiments, where we knock down or increase the expression of target genes across different cell types in specific tissues.
If we think a receptor is implicated - for example, if there’s too much of some cytokine or inflammatory factor that’s activating fibroblasts into this pro-fibrotic phenotype - then we can knock out that receptor in the fibroblasts in the tissue, without touching other cells, or knock it out in only 75% or 25% of them, and see how that modifies disease progression. Then we can upregulate that receptor, or look downstream for compensatory pathways: what goes up if we drug this thing? We can knock those out too, or overexpress them to check.
What you end up with is a very deep causal map: which “dials” the tissue has, which cytokines are signaling through which pathways, and how turning each dial changes the disease. We can then solve for the fewest dials we need to turn to bring the tissue back to healthy function.
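As a toy version of that “fewest dials” search - with invented numbers and simplifying assumptions (additive, linear effects), not the actual causal-mapping method - you could frame it as finding the smallest set of targets that normalizes the disease readouts:

```python
from itertools import combinations
import numpy as np

# Toy sketch of "fewest dials to turn": given an empirically mapped matrix of
# per-target effects on disease readouts, find the smallest combination of
# targets whose combined effect (assumed additive here) brings every readout
# back within a healthy range. Real signaling is nonlinear; this is only a
# simplified illustration.

def fewest_dials(effects: np.ndarray,        # shape (n_targets, n_readouts)
                 disease_state: np.ndarray,  # deviation from healthy, (n_readouts,)
                 tolerance: float,
                 max_set_size: int = 3):
    """Return the smallest target set that normalizes all disease readouts."""
    n_targets = effects.shape[0]
    for k in range(1, max_set_size + 1):
        for targets in combinations(range(n_targets), k):
            corrected = disease_state + effects[list(targets)].sum(axis=0)
            if np.all(np.abs(corrected) <= tolerance):
                return targets
    return None  # no combination of up to max_set_size targets is sufficient

if __name__ == "__main__":
    # Three candidate targets, two readouts (say, a fibrosis marker and an
    # inflammation marker); all numbers are invented for illustration.
    effects = np.array([[-0.8,  0.1],
                        [ 0.0, -0.9],
                        [-0.3, -0.4]])
    disease = np.array([1.0, 1.0])
    print(fewest_dials(effects, disease, tolerance=0.25))   # -> (0, 1)
```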
The really powerful part is that we have the whole, complex tissue right there to experiment on - we don’t have to wait for a clinical trial or study. Those take forever; you can usually test one thing at a time, and they cost hundreds of millions of dollars, even if the thing only sort of works. With our system, we can close the loop in a matter of weeks and look not at one, but at thousands of permutations with each step.
So, it becomes this accelerated vehicle for experimental gradient descent to accurately map the weights of causal signaling networks, and then to figure out, from those mappings, which targets are the most viable to drug.
We identify the targets that are most likely to have strong disease-modifying effects in human tissues from very large swaths of patient demographics - different severities of disease, different genetic backgrounds - to get very high certainty in the disease-modifying effect of that target.
Then the added benefit is that we still have the tissues to test through the rest of the pipeline. So, once we have drug candidates against those validated targets, and we know which targets the drug should be hitting, we can test those compounds themselves and ask: Does the outcome match the predictions we had from the causal mapping? Does the drug that hits that target do what we expected it to do?
Beyond that, we can look across other tissues in the human body. We take diseased tissue and ask if it becomes healthy with treatment, but we also look at healthy tissues - liver, bone marrow, and so on - and test whether the drug induces toxicity in them.
All of this is done without having to rely on some shallow or clinically distant dataset and then trying to project that up into human physiology.
How do customers currently pay to access your platform, and what pricing models are proving most resilient?
It spans a spectrum.
On one end, we have program-sized, conventional partnerships with pharma, where there are upfronts, and we collaborate across this entire preclinical pipeline with milestones and the typical biotech deal terms.
Then, on the other end, we have very high-urgency situations. Something unexpected happens very close to a clinical trial with some drug, and the pharma team is wondering: “Should we even progress this into the clinic? Is it actually going to be efficacious? What do we do with this red flag?”
In those cases, we can come in and generate human evidence around those unexpected outcomes, without starting a drug program from scratch.
So, overall, it spans both conventional program partnerships and more urgent engagements where the scope is, “We have this problem, and we need human evidence fast!”
When you’re evaluating additional automation versus adding capacity to the wet lab, how do you assess ROI?
If you go back to that requirements list from the beginning, one of the key things is generality. You can imagine this dystopia where, for every different type of tissue or every different application, we’d have to rework the whole system.
The worst version of that would be that we design a new platform approach every time. Almost as bad would be rewriting a bunch of software, or saying, “It’s a new study, so we have to rewrite the whole automation plan by hand.”
Our approach is: how can we make one system that can generalize through as many different problems as possible?
The tissues themselves are agnostic to drug modality. The drugs flow through the blood vessels of the tissue. Whether it’s LNPs, antibodies, small molecules, or whichever vehicle, it’s going through those vessels like blood flow in our own bodies. The material inputs that define different tissue types are the patient-derived primary cells that we use to grow them. The robot arm doesn’t care which organ it’s handling.
ROI becomes much easier to assess, because it’s generalizable total capacity. We can allocate that capacity to partnered work, internal R&D, different modalities, or to different tissue types. Everything is driven by that fundamental question: what is total capacity, and how fast can we move through the experimental loop?
Can we, somehow, speed up imaging or different parts of the pipeline? The ROI calculation becomes: what data do we want to collect, and how does that fit into the growing capacity that we have?
We just opened a site in San Francisco to give us very significant capacity headroom. And we are shamelessly stealing scaling principles from data centers.
In the ‘60s and ‘70s, you had these room-sized mainframes that were state-of-the-art, but if you wanted a second one, it was like: does this new room have the same layout that the first one had? Is the same technician with ‘magic hands’ around to wire it? Everything was custom-built.
Where computational infrastructure has gone since then is modularity. You’re buying GPUs, and whether you have 10 racks or 10,000, the primary thing that scales is the interconnect between them. The computational units themselves scale through replication. Copy and paste more racks.
So, we asked: how do we build something like that for growing human tissues?
Our systems are roughly the size of two server racks side by side, around eight and a half feet tall, and they contain an entire biotech lab within them. There’s a fridge and freezer, a robot arm with multiple liquid-handling and gripper tools, a confocal microscope in each one, and the ability to sample and store, to do sequencing, and so on. The entire enclosure is a super high-quality air environment - HEPA filtration, 37 degrees Celsius, 5% CO2.
It’s a fully enclosed automation environment, but replicable, and we just copy-paste these units to build capacity. Each system can grow tens of thousands of tissues at a time, all independently. AI-designed study after study. It’s a factory for new human data.
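A loose way to picture that “copy and paste more racks” principle - with placeholder values and an invented structure, not an actual spec - is to treat each unit as a fixed, replicable spec and let capacity scale purely by replication:

```python
from dataclasses import dataclass

# Illustrative sketch of scaling by replication: one fixed unit spec, with fleet
# capacity as units times per-unit capacity. Values echo those mentioned in the
# interview, but the numbers and structure here are placeholders.

@dataclass(frozen=True)
class TissueSystemSpec:
    height_ft: float = 8.5
    incubation_temp_c: float = 37.0
    co2_percent: float = 5.0
    hepa_filtered: bool = True
    instruments: tuple = ("robot_arm", "liquid_handlers", "gripper_tools",
                          "confocal_microscope", "fridge", "freezer")
    max_parallel_tissues: int = 10_000   # "tens of thousands" per system (placeholder)

def fleet_capacity(n_units: int, spec: TissueSystemSpec = TissueSystemSpec()) -> int:
    """Total independent tissues the fleet can run at once; units scale by copy-paste."""
    return n_units * spec.max_parallel_tissues
```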
So, rather than asking what demand for such-and-such application might be next year, we ask: what capacity do we forecast overall, and how do we scale for that?
We’ve designed everything on the automation side in-house. I think it’s inevitable that this approach to producing human data, preclinically, becomes the fundamental preclinical infrastructure of pharma.
China has had the most clinical trials of any country this year - more than the US for the first time. Do you have any plans for expansion outside of the US?
I think this is another one of those things that goes back to first principles, and to this idea that clinical trials are currently — wrongly — the start of the actual experiment.
What China has been able to do is move very quickly, clinically, and just get into trials as fast as possible. You have state-sponsored programs, investment in biotech, and state control over hospitals that can enroll patients. So patients can be enrolled into a drug trial very early and efficiently, at a level of risk tolerance that just isn’t possible in the US.
In America, you don’t want to say, “Great, we’ve got a little bit of data, let’s bring them in and start testing.” Certainly not that quickly.
That initial difference is very hard ground to make up, if the real experiment continues to be the clinical trial, because, at that point, you have a competitor who can literally start earlier than you and get human data earlier, because they are willing to take different risks.
The only mechanism that I can see for overcoming that and maintaining the ability for American biotech companies to lead is to bring the ability to get human data even further back than that - all the way into the preclinical stage - and to run the whole process with human data from the outset.
I think the most powerful thing we can do for biotech in America is to crank up the dial and say: you don’t just pull the point of human data slightly earlier, you can have clinical-quality data from the very start of a program, without taking the same risks in patients.
That’s the alternate way of accelerating the mechanism, by gathering human data, preclinically, at massive scale, without testing too early on people, and using that to drive the whole pipeline forward.
Finally, what would success look like at Vivodyne? In three years’ time, where do you want the company to be?
If you look at what happens today in pharma, there’s a lot of work done preclinically to try to de-risk a new drug, but at the end of the day, you reach the clinical trial and it’s a fresh new experiment.
All the symptoms of that type of regime show up: roughly 95% failure rates, unexpected side effects or failures late in Phase II or Phase III. You’re experimenting all over again, instead of the trial being a final validation step.
I like to compare it to safety testing in cars. Today, simulations are so accurate that, if the real car behaves a little differently, it’s a great surprise. They’re so predictable that you can test one car, not 10,000 prototypes. And you don’t have to put a person inside one; they have great test dummies.
With clinical trials, the same thing can be true, if we can gather directly translatable data - without such a large inferential distance - at massive scale, before the trial. The trial then becomes a true validation step, instead of the beginning of the experiment.
Success, to me, looks like Vivodyne becoming the modern infrastructure for drug programs that seek to eradicate the extremely burdensome diseases of today, like fibrosis, cancers, inflammatory disease, autoimmunity... I see Vivodyne as the fundamental infrastructure needed to drive successful new therapies for such complex diseases that human trial-and-error cannot solve quickly enough.
Secondly, our ability to work both in collaboration with biotech giants and independently to contribute medicines ourselves is a core part of our value proposition.
Things are not moving fast enough in medicine. I think it’s one thing to jump on a stage and say, “We’re not going fast enough,” but with what we’ve built, we can tangibly and tractably do something about that.
What questions should we ask Andrei next? Let us know in the comments.