Mase-phi HPC

Tracking cancer with blood draws and AI.

Liquid biopsy

Here's the scene: a patient with lung cancer has just had their tumor removed. The operation is successful, and they are following their treatment plan [e.g., immunotherapy, chemotherapy, radiation]. A few months in, things are looking good. Let's fast forward to 1.5 years post-op: the cancer is back, and it is spreading.

This story is far too common for patients who put their all into fighting the nasty beast of cancer. We need to put an end to this. But how? What if we picked a few biomarkers [i.e., mutations we know are associated with the tumor] and tracked them over time? And what if we used blood samples instead of direct biopsies for a cheaper, less invasive solution? This is longitudinal liquid biopsy tracking.

Challenges

The data is extremely noisy.
- Tumors shed genetic material when their cells die and reproduce. The DNA from these cells, known as circulating tumor DNA [ctDNA], floats in the blood and is not preserved well.
How do we know the tumor is going to keep the same biomarkers?
- We don't. Tumors rapidly evolve. We need to account for this.

Approach

First, we take an initial direct biopsy of the tumor during surgery [addressing our noise problem], giving us a clear picture of the tumor. Next, we can use the noisy, liquid biopsy samples taken at multiple time points to update this picture [accounting for evolution].

We'll use 20 biomarkers for tracking, but this number can be scaled up or down. More biomarkers = higher prices, but a more detailed picture of the tumor.

We can think of this like getting your car washed. We'll first take the car through the big car wash, which takes most of the dirt off [initial, direct biopsy]. Next, we'll go in with the smaller tools like vacuums, Windex, leather conditioner, etc. to get rid of the last bit of grime [longitudinal liquid biopsy samples].

We now have our high-level approach, but what does this picture look like? What should we track instead of just biomarkers?

Tumors are made up of many distinct cell populations known as clones. Each clone carries its own set of mutations that makes it unique. When a clone gains a favorable mutation that allows it to thrive, it will dominate. If we start killing off this clone, its surviving cells could mutate into a different clone and gain resistance to the treatment [chemo or immunotherapy]. Harsh environments push evolution along.

Instead of tracking the biomarkers, let's track these clones and their evolution.

Understanding tumor evolution

How do we unravel the clones that make up a tumor? This is where clonal deconvolution comes in. We can convert a list of biomarkers [mutations] and their frequencies into a phylogenetic tree that maps out the different clones and how they evolve. This is the picture we are trying to clear up. Each node of the tree is a clone, defined by the mutations it carries, and the edges correspond to the evolutionary lineage of the clones.

Here's how this works now:

Initial tumor biopsy $\rightarrow$ clonal deconvolution $\rightarrow$ initial tree of clones
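
To make the deconvolution step concrete, here is a toy sketch in Python. It is not the method used in this project: real deconvolution tools fit statistical models and weigh many candidate trees, since frequencies alone rarely pin down a single one. The function names, the `tol` threshold, and the example frequencies are all made up for illustration, and the frequencies are assumed to already be the fraction of tumor cells carrying each mutation.

```python
def cluster_mutations(freqs, tol=0.05):
    """Group mutations with similar frequencies into clones (hypothetical helper)."""
    clusters = []
    # Walk from the most common mutation to the rarest.
    for name, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
        if clusters and abs(clusters[-1]["freq"] - f) <= tol:
            c = clusters[-1]
            c["mutations"].append(name)
            # Running mean of the frequencies in this cluster.
            c["freq"] += (f - c["freq"]) / len(c["mutations"])
        else:
            clusters.append({"mutations": [name], "freq": f})
    return clusters


def build_tree(clusters):
    """Give every clone a parent so that children never outweigh their parent."""
    parent = {0: None}                # clone 0 is the root of the tree
    room = {0: clusters[0]["freq"]}   # frequency not yet claimed by children
    for i in range(1, len(clusters)):
        f = clusters[i]["freq"]
        # Try the oldest clones first. The previous clone always fits,
        # because clusters are sorted by decreasing frequency.
        for p in range(i):
            if room[p] >= f:
                parent[i] = p
                room[p] -= f
                break
        room[i] = f
    return parent


# Made-up frequencies for six mutations.
freqs = {"m1": 1.0, "m2": 0.98, "m3": 0.6, "m4": 0.58, "m5": 0.35, "m6": 0.1}
clones = cluster_mutations(freqs)
tree = build_tree(clones)
for i, clone in enumerate(clones):
    print(i, clone["mutations"], "parent:", tree[i])
```

Trying the oldest clones first is an arbitrary tie-breaker. A different rule would give a different, equally consistent tree, which is exactly why the initial tree benefits from clean, high resolution data.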

Now how do we use the liquid biopsy [blood] samples?

Tracking tumor evolution

We now have our initial tree from our high resolution, direct biopsy data. How do we update this picture now?

This is where the liquid biopsy samples come in. Even though they are noisy, we can still use them to update our initial, high resolution tree.
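
One simple way to picture this update is a Bayesian one: treat the tissue estimate as a prior and let each blood sample nudge it. The sketch below uses a Beta-binomial update as a stand-in, not as the pipeline's actual model. The function name, `prior_strength`, and the read counts are assumptions for illustration, and the toy ignores the fact that blood and tissue frequencies are not on the same scale.

```python
def update_clone_frequency(prior_mean, prior_strength, alt_reads, total_reads):
    """Blend a tissue-based estimate with read counts from one blood sample.

    prior_mean:     clone frequency from the tissue biopsy (0 to 1)
    prior_strength: how many reads' worth of trust we place in the prior
    alt_reads:      blood reads supporting the clone's mutations
    total_reads:    all blood reads covering those positions
    """
    # Beta(a, b) prior plus binomial counts gives a Beta posterior.
    a = prior_mean * prior_strength + alt_reads
    b = (1.0 - prior_mean) * prior_strength + (total_reads - alt_reads)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


# Tissue says 50%, one noisy blood sample shows 10 of 100 supporting reads.
mean, variance = update_clone_frequency(0.5, 100, alt_reads=10, total_reads=100)
print(round(mean, 3), round(variance, 5))
```

A larger `prior_strength` keeps the estimate closer to the tissue biopsy, while a smaller one lets the blood samples move it faster. This only updates the frequencies on an existing tree, so new clones that appear later would need a separate step.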

We now have a picture of our clonal population over time. We can begin to predict things like relapse and drug resistance [and much more] now that we can track the clonal frequencies, instead of just single biomarkers.
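
As a rough illustration of what tracking clonal frequencies makes possible, the sketch below fits a straight line to each clone's frequency over time and flags the clones that are growing. The helper names, the `min_slope` threshold, and the numbers are hypothetical, and a rising trend is only one possible warning sign, not a validated predictor of relapse.

```python
def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def flag_growing_clones(times, clone_freqs, min_slope=0.02):
    """Return the names of clones whose frequency rises faster than min_slope."""
    return sorted(
        name for name, values in clone_freqs.items()
        if slope(times, values) > min_slope
    )


# Made-up frequencies at 0, 3, 6 and 9 months after surgery.
months = [0, 3, 6, 9]
clone_freqs = {
    "A": [0.60, 0.50, 0.30, 0.20],   # shrinking under treatment
    "B": [0.05, 0.10, 0.25, 0.40],   # expanding, possibly resistant
}
flagged = flag_growing_clones(months, clone_freqs)
print(flagged)
```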

Scaling

This is a computationally expensive process with many moving parts. I built a streamlined, HPC-compatible [Slurm] pipeline to run it.
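
To give a feel for what running on Slurm can look like, here is a minimal sketch that writes a job array script with one task per patient. It is not the actual pipeline: `render_sbatch`, `run_patient.py`, and the resource numbers are hypothetical, while the `#SBATCH` options and `SLURM_ARRAY_TASK_ID` are standard Slurm features.

```python
def render_sbatch(job_name, n_patients, cpus=4, mem_gb=16, hours=2):
    """Build the text of a Slurm job array script, one array task per patient."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --array=0-{n_patients - 1}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
        # %x = job name, %A = array job ID, %a = array task ID
        "#SBATCH --output=logs/%x_%A_%a.out",
        "",
        "# run_patient.py is a hypothetical entry point for one patient",
        'python run_patient.py --patient-index "$SLURM_ARRAY_TASK_ID"',
    ]
    return "\n".join(lines) + "\n"


script = render_sbatch("liquid-biopsy", n_patients=10)
print(script)
```

The printed text could be saved to a file and submitted with `sbatch`. Because each patient is an independent array task, the scheduler can run as many in parallel as the cluster allows.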

On top of this, I have scaled up a prototype method, created by another lab member, for selecting the best subset of biomarkers based on clonal dynamics [Mase-phi].
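
To show the general shape of a marker selection problem, here is a simple greedy stand-in. It is not the Mase-phi method, which is not described here: it ignores clonal dynamics and just picks markers that cover as many clones as possible. The function name, the marker-to-clone mapping, and the budget `k` are assumptions for illustration.

```python
def select_markers(marker_clones, k):
    """Greedily pick up to k markers that together cover the most clones.

    marker_clones maps each marker name to the set of clones that carry it.
    """
    chosen, covered = [], set()
    candidates = sorted(marker_clones)   # sorted so ties break by name
    while len(chosen) < k and candidates:
        # Pick the marker that adds the most clones we have not covered yet.
        best = max(candidates, key=lambda m: len(marker_clones[m] - covered))
        chosen.append(best)
        covered |= marker_clones[best]
        candidates.remove(best)
    return chosen


# Made-up example: four candidate markers across four clones.
marker_clones = {
    "m1": {"A", "B", "C"},
    "m2": {"B"},
    "m3": {"C", "D"},
    "m4": {"D"},
}
panel = select_markers(marker_clones, k=2)
print(panel)
```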