Question: RNA-Seq Data Variant Calling
5.0 years ago by
learnerforever (490) wrote:

Has anyone tried calling variants from RNA-seq data and comparing them with WGS/exome sequencing variant calls in coding regions? I am curious whether the same variant callers can be used on RNA-seq alignments (say, TopHat alignments). Also, are there tools that can predict RNA editing or similar events?

tophat rna-seq • 8.6k views
modified 4.9 years ago by Obi Griffith (14k) • written 5.0 years ago by learnerforever (490)

If you're interested in inferring RNA editing from RNA-seq, be sure to read the responses to the recent Li et al. Science paper on that topic, as well as the commentary published on the Genomes Unzipped blog.

written 5.0 years ago by David Quigley (10k)
5.0 years ago by
Obi Griffith (14k), Washington University, St Louis, USA, wrote:

You should check out the SNVMix papers here and here. They developed their method and used it on RNA-seq tumor data, comparing against the "ground truth" of genotype arrays and WGS. They also showed that their approach could identify RNA-editing events. They have a follow-up method for matched tumor-normal samples called JointSNVMix, although I think the latter was developed more for exome-seq.

written 5.0 years ago by Obi Griffith (14k)
5.0 years ago by
Vitis (1.5k), New York, wrote:

We've done quite a bit of variant calling from mRNA-Seq data for EMS mutant identification, but we haven't compared it with WGS/exome data yet. We used BWA for mapping, and both samtools and the GATK pipeline for variant calling; the two yielded pretty consistent results. One thing that turned out to be very important for our purpose, i.e. detecting high-quality SNPs in coding regions, is that you have to trim aggressively to remove low-quality bases, even at the cost of losing coverage in some areas. With really stringent quality trimming, we've successfully identified several mutant alleles that could be verified by Sanger sequencing or restriction enzyme genotyping.
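To make the trimming idea concrete, here is a minimal Python sketch of stringent quality trimming at both ends of a read. It is only an illustration, not the pipeline used above: the function name `trim_read`, the cutoff of Q20, and the Phred+33 quality encoding are all assumptions, and in practice a dedicated trimming tool would normally do this step.

```python
def trim_read(seq, qual, min_q=20, phred_offset=33):
    """Trim low-quality bases from both ends of a single read.

    seq  -- base string, e.g. "ACGT"
    qual -- FASTQ quality string of the same length (Phred+33 assumed)
    min_q and phred_offset are illustrative defaults, not recommended settings.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")

    # Decode ASCII quality characters into Phred scores.
    scores = [ord(c) - phred_offset for c in qual]

    # Walk inwards from the 3' end, then from the 5' end,
    # until a base meeting the cutoff is found.
    start, end = 0, len(scores)
    while end > start and scores[end - 1] < min_q:
        end -= 1
    while start < end and scores[start] < min_q:
        start += 1

    return seq[start:end], qual[start:end]
```

Raising `min_q` trims more aggressively and returns shorter reads, which is the coverage trade-off mentioned above; a read with no base at or above the cutoff comes back empty and would be dropped.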

written 5.0 years ago by Vitis (1.5k)

Thanks, we are following a similar approach as well, so it's good to know we are not alone :). What reference did you use for the BWA alignments? A custom transcriptome built from known transcripts (>150,000?), or some trick to get spliced alignments with BWA?

written 5.0 years ago by learnerforever (490)

For the mapping question, it is probably worth looking at:

written 5.0 years ago by Sean Davis (22k)

We've been using predicted CDSs as references. Because our system is not highly annotated, we have ignored alternative transcription for the moment. I tried genome mapping, too, and got very similar results, since you only lose about 5% of the reads (the junction reads).

written 5.0 years ago by Vitis (1.5k)

What would you call consistent results? We routinely see an over-representation of false positives near splice junctions in RNA-seq SNV calls. This is when comparing against DNA-seq data, after making sure the variants have enough read coverage for a confident SNV call.
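One way to examine this kind of junction effect is to separate calls that fall close to a known splice junction from the rest and compare the two sets against the DNA-seq calls. The Python sketch below is only an illustration under stated assumptions: junction positions are taken to be available per chromosome as sorted lists (for example from an annotation or an aligner's junction output), the 5 bp window is arbitrary, and the function names are hypothetical.

```python
from bisect import bisect_left


def near_junction(pos, sorted_junctions, window=5):
    """Return True if pos lies within `window` bp of any junction position."""
    # First junction at or after the left edge of the window.
    i = bisect_left(sorted_junctions, pos - window)
    return i < len(sorted_junctions) and sorted_junctions[i] <= pos + window


def split_calls_by_junction_proximity(calls, junctions_by_chrom, window=5):
    """Split (chrom, pos) calls into those near a junction and those not.

    junctions_by_chrom maps chromosome name -> sorted list of junction
    coordinates (an assumed input format for this sketch).
    """
    near, far = [], []
    for chrom, pos in calls:
        sites = junctions_by_chrom.get(chrom, [])
        target = near if near_junction(pos, sites, window) else far
        target.append((chrom, pos))
    return near, far
```

Comparing the false-positive rate in the `near` and `far` sets would show whether the excess really is concentrated at junctions for a given window size.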

written 5.0 years ago by Bioinfosm (610)

Well, in our system, or more properly, at our evolutionary scale, CNVs are very rare. SNPs and indels make up the vast majority of variants. I have no knowledge or experience of CNVs.

written 5.0 years ago by Vitis (1.5k)

Do we not need to apply any normalization method before calling variants on mRNA-Seq data?

written 4.6 years ago by bharati.mehani (0)