In the early months of the COVID-19 pandemic, healthcare workers analyzing test results began noticing something strange: patients who had already recovered from COVID-19 would sometimes inexplicably test positive on a PCR test weeks or even months later.
Although people can catch COVID-19 for a second time, this did not appear to be the case for these patients; no live viruses were isolated from their samples, and some studies found these false positive results even while holding participants in quarantine. Also, RNAs generally have a short life — most only stick around for a few minutes — so it was unlikely for positive tests to be the result of residual RNAs.
Now, a new paper from the lab of Whitehead Institute Member and MIT professor of biology Rudolf Jaenisch may offer an answer to why some patients continue to test positive after recovery from COVID-19. In the paper, published online May 6 in the Proceedings of the National Academy of Sciences, Jaenisch and collaborators show that genetic sequences from the RNA virus SARS-CoV-2 can integrate into the genome of the host cell through a process called reverse transcription. These sections of the genome can then be “read” into RNAs, which could potentially be picked up by a PCR test.
SARS-CoV-2 is not the only virus that integrates into the human genome. Around eight percent of our DNA consists of the remnants of ancient viruses. Some viruses, called retroviruses, rely on integration into human DNA in order to replicate themselves. “SARS-CoV-2 is not a retrovirus, which means it doesn’t need reverse transcription for its replication,” says Whitehead Institute postdoc and first author Liguo Zhang. “However, non-retroviral RNA virus sequences have been detected in the genomes of many vertebrate species, including humans.”
With this in mind, Zhang and Jaenisch began to design experiments to test whether this viral integration could be happening with the novel coronavirus. With the help of Jaenisch lab postdoc Alexsia Richards, the researchers infected human cells with coronavirus in the lab and then sequenced the DNA from infected cells two days later to see whether it contained traces of the virus’ genetic material.
To ensure that their results could be confirmed with different methodology, they used three different DNA sequencing techniques. In all samples, they found fragments of viral genetic material (though the researchers emphasize that none of the inserted fragments was enough to recreate a live virus).
Zhang, Jaenisch and colleagues then examined the DNA flanking the small viral sequences for clues to the mechanism by which they got there. In these surrounding sequences, the researchers found the hallmark of a genetic feature called a retrotransposon.
Sometimes called “jumping genes,” transposons are sections of DNA that can move from one region of the genome to another. They are often activated to “jump” in conditions of high stress or during cancer or aging, and are powerful agents of genetic change.
One common transposon in the human genome is called the LINE1 retrotransposon, which is made up of a powerhouse combination of DNA-cutting machinery and reverse transcriptase, an enzyme that creates DNA molecules from an RNA template (like the RNA of SARS-CoV-2).
“There’s a very clear footprint for LINE1 integration,” Jaenisch says. “At the junction of the viral sequence to the cellular DNA, it makes a 20 base pair duplication.”
Besides the duplication, another feature that points to LINE1-mediated integration is a LINE1 endonuclease recognition sequence. The researchers identified these features in nearly 70 percent of the DNAs that contained viral sequences, but not all, suggesting that the viral RNA may be integrating into cellular DNA via multiple mechanisms.
To screen for viral integration outside of the lab, the researchers analyzed published datasets of RNA transcripts from different types of samples, including COVID-19 patient samples. With these datasets, Zhang and Jaenisch were able to calculate the fraction of genes that were transcribed in these patients’ cells which contained viral sequences that could be derived from integrated viral copies. The percentage varied from sample to sample, but for some, a relatively large fraction of viral transcripts seem to have been transcribed from viral genetic material integrated into the genome.
A previous draft of the paper with this finding was published online on the preprint server bioRxiv. However, recent research revealed that at least some of the viral-cellular reads could be misleading artifacts of the RNA sequencing method. In the present paper, the researchers were able to eliminate the artifacts that could have been obscuring the results.
Instead of simply tallying transcripts that contained viral material, the researchers looked at which direction the transcripts had been read. If the viral reads were the result of live viruses or existing viral RNAs in the cell, the researchers would expect that most of the viral transcripts would have been read in the correct orientation for the sequences in question; in acutely infected cells in culture, more than 99 percent are in the correct orientation. If the transcripts were the product of random viral integration into the genome, however, there would be a near 50-50 split — half the transcripts would have been read forwards, the other half backwards, relative to the host genes. “This is what we saw in some patient samples,” says Zhang. “It suggests that much of the viral RNA in some samples could be transcribed from integrated sequences.”
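The orientation argument comes down to a simple proportion, which can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the researchers' analysis pipeline: the function name `sense_fraction`, the `"+"`/`"-"` labels, and the read counts are all hypothetical and were chosen to mirror the two scenarios described above.

```python
from collections import Counter


def sense_fraction(orientations):
    """Return the fraction of viral reads labeled '+' (correct orientation).

    `orientations` is a hypothetical list of '+' / '-' labels, one per
    viral read; any other label is ignored.
    """
    counts = Counter(orientations)
    total = counts["+"] + counts["-"]
    if total == 0:
        raise ValueError("no oriented viral reads to count")
    return counts["+"] / total


# Toy data (invented for illustration, not taken from the paper).
# Scenario 1: reads from replicating virus, almost all in the correct orientation.
acute_culture = ["+"] * 995 + ["-"] * 5
# Scenario 2: reads from randomly integrated copies, close to an even split.
random_integration = ["+"] * 52 + ["-"] * 48

print(f"acute culture:      {sense_fraction(acute_culture):.3f}")
print(f"random integration: {sense_fraction(random_integration):.3f}")
```

A value near 1 fits the picture of transcripts made from replicating virus, while a value near 0.5 is what random integration into the genome would predict. With a dataset as small as the one described here, any such fraction carries wide uncertainty.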
Because the dataset they used was quite small, Jaenisch emphasizes that more information is needed to establish exactly how common this phenomenon is in real life and what it might mean for human health.
It is possible that only very few human cells experience any kind of viral integration at all. In the case of another RNA virus that integrates into the host cell genome, only a fraction of a percent of infected cells (between 0.001 and 0.01 percent) contained integrated viral DNA. For SARS-CoV-2, the frequency of integration in humans is still unknown. “The fraction of cells which have the integration could be very small,” says Jaenisch. “But even if it’s rare, there are more than 140 million people who have been infected already, right?”
In the future, Jaenisch and Zhang plan to investigate whether the fragments of SARS-CoV-2 genetic material could be made into proteins by the cell. “If they do, and trigger immune responses, it may provide continuous protection against the virus,” Zhang says.
They also hope to investigate whether these integrated sections of DNA could be partly to blame for some of the long-term autoimmune consequences that some COVID-19 patients experience. “At this point, we can only speculate,” says Jaenisch. “But one thing we do think we can explain is why some patients are long-term PCR positive.”