RNA-Seq with DNA contamination, any way to salvage the data?
11 months ago
Rina ▴ 10

Hello,

My group performed RNA-Seq using the Illumina TruSeq RNA Exome kit. We aligned the reads with HISAT2 and noticed that a good portion of our samples have DNA contamination. So far, I have seen a lot of discussion on identifying DNA contamination in RNA-Seq data, but not much on what to do afterwards.

What concerns us most are contaminating DNA reads that align to exonic regions. Is there a way to salvage these samples by removing the contaminating DNA reads?

Below is an image from IGV showing the alignments of two samples.

Looking forward to any suggestions for this!

IGV_HISAT2_alignments

contamination RNA-Seq DNA

How many reads in total (percentage) map outside of annotated exons?

11 months ago

Depending on your goal, you could align to the transcriptome, discard the unaligned reads, and then realign the remainder to the genome. Of course, that assumes you're not trying to discover new exon junctions.
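As a rough illustration of the first step, the sketch below collects the names of reads that aligned in a transcriptome SAM file, so that only those reads would be passed on to the genome alignment. It relies only on the SAM convention that FLAG bit 0x4 marks an unmapped read. The function name `mapped_read_names` and the example records are hypothetical, and a real pipeline would more likely do this with samtools or pysam.

```python
def mapped_read_names(sam_lines):
    """Return the set of read names that have at least one mapped record."""
    UNMAPPED = 0x4  # SAM FLAG bit: this segment is unmapped
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if not flag & UNMAPPED:
            names.add(qname)
    return names


# Hypothetical SAM records, made up for illustration only.
example_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tENST0001\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\tENST0002\t5\t60\t50M\t*\t0\t0\t*\t*",
]
keep = mapped_read_names(example_sam)
print(sorted(keep))
```

Note that with paired-end data this keeps a read name if either mate aligned; whether that is the right rule depends on how strict you want the filter to be.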

In that image I don't see any evidence of DNA contamination, though. If there were, I'd expect to see reads mapping at random locations throughout the intron, which is not happening. Instead, I'm seeing lots of PCR duplicates and reads that extend into the introns with indels. There may also be the typical rainbow of SNPs indicative of a bad alignment, which would not show in the image because your mismatches are presumably not being colored in the reads. Maybe this is a situation where one splice site is usually used, but occasionally another one a little farther out is used? Or maybe the reads are being misaligned for some reason and actually do span the intron? Turning on SNP coloring might be handy.


Indeed, the IGV screenshot looks fine, which is why I was asking what percentage of reads fall outside of exons. The only way to know confidently that a read is RNA is if it spans an exon-exon junction or if its paired-end mate clearly jumps an intron due to splicing. Purely exonic or intronic reads can come from DNA as well, so the real question is whether you have reads from outside annotated exons and gene bodies.
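To make that distinction concrete, here is a minimal sketch that labels each alignment as spliced (the CIGAR contains an `N` operation, i.e. a skipped intron), exonic (overlaps an annotated exon), or outside, and then reports the percentage of reads outside exons. The function names, coordinates, and example reads are all hypothetical; in practice the exon intervals would come from your annotation and the alignments from your BAM files, and a read that only partly overlaps an exon is still counted as exonic here.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def classify_read(start, cigar, exons):
    """Classify one alignment as 'spliced', 'exonic', or 'outside'.

    start: 0-based leftmost reference position of the alignment
    cigar: CIGAR string of the alignment
    exons: list of (start, end) half-open intervals on the same chromosome
    """
    ops = CIGAR_OP.findall(cigar)
    if any(op == "N" for _, op in ops):
        return "spliced"  # skipped region: strong evidence of RNA origin
    end = start + sum(int(n) for n, op in ops if op in REF_CONSUMING)
    if any(start < e_end and e_start < end for e_start, e_end in exons):
        return "exonic"  # overlaps an exon: could be RNA or DNA
    return "outside"  # intronic or intergenic: candidate DNA-derived read


def percent_outside(reads, exons):
    """Percentage of alignments that overlap no annotated exon."""
    labels = [classify_read(s, c, exons) for s, c in reads]
    return 100.0 * labels.count("outside") / len(labels)


# Hypothetical annotation and alignments, for illustration only.
exons = [(100, 200), (500, 600)]
reads = [(150, "50M"), (180, "20M300N30M"), (300, "50M"), (700, "50M")]
print(percent_outside(reads, exons))
```

What percentage counts as worrying is not something this sketch can answer; it only gives the number the question above is asking for.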
