Hello,
My group has performed RNA-Seq using the Illumina TruSeq RNA Exome kit. We aligned the reads with HISAT2 and noticed that a good portion of our samples show DNA contamination. So far, I have seen a lot of discussion on identifying DNA contamination in RNA-Seq data, but not much on what to do afterwards.
What concerns us most are contaminating DNA reads that align to the exome regions. Is there a way to salvage these samples by removing the contaminating DNA reads?
Below is an image from IGV showing the alignments of two samples.
Looking forward to any suggestions for this!
What percentage of your total reads map outside of annotated exons?
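If it helps, here is a rough sketch of how that percentage could be computed once read and exon coordinates have been pulled out of the BAM and the annotation. It assumes 0-based, half-open intervals given as (chrom, start, end) tuples, and the function names are made up for illustration. It treats each read as a single span, so a spliced read counts as exonic if any part of that span touches an exon. On real files, tools such as featureCounts or bedtools intersect would likely be the more practical route.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def build_exon_index(exons):
    """Group exons by chromosome and store merged starts/ends as sorted lists."""
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_exon(index, chrom, start, end):
    """Return True if the read span [start, end) overlaps any exon on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged exon that starts before the read ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def percent_outside_exons(exons, reads):
    """Percentage of reads whose span overlaps no annotated exon."""
    index = build_exon_index(exons)
    total = 0
    outside = 0
    for chrom, start, end in reads:
        total += 1
        if not overlaps_exon(index, chrom, start, end):
            outside += 1
    return 100.0 * outside / total if total else 0.0


# Toy coordinates, invented purely for illustration.
example_exons = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
example_reads = [
    ("chr1", 120, 170),  # inside an exon
    ("chr1", 290, 340),  # overlaps the end of an exon
    ("chr1", 300, 350),  # starts right after the exon ends
    ("chr2", 10, 40),    # upstream of the chr2 exon
    ("chr3", 0, 50),     # chromosome with no annotated exons
]
print(percent_outside_exons(example_exons, example_reads))
```

A high percentage here would suggest that a sizeable share of the library comes from intronic or intergenic sequence, which is one common hint of genomic DNA carry-over, although unspliced pre-mRNA can contribute to intronic signal as well.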