Question: Which software do you use for RNA-seq data quality control?
4.7 years ago by Christian (2.7k), Cambridge, US:

I am specifically interested in (RNA-seq specific) quality metrics not delivered by FastQC, for example 5'-3' coverage bias of transcripts, percentage of reads mapping to exons vs. introns, ratio of known to novel splice junctions, rRNA contamination, strand specificity, GC bias, etc. 

Reading this post on Biostars, I learned about RSeQC and RNA-SeQC, which look like a good fit. However, they have not attracted many citations since 2012, so I was wondering what other software is popular for RNA-seq quality control. Some such software must surely be running in dozens of RNA-seq data analysis pipelines around the world.

rna-seq quality-control qc • 13k views
modified 2.7 years ago by Michael Dondrup (45k) • written 4.7 years ago by Christian (2.7k)

I like RSeQC. You could also check out this presentation for some other suggestions: http://www.slideshare.net/mikaelhuss/all-bio-rnaseqqc

written 4.7 years ago by Mikael Huss (4.6k)

We used RNA Seq and some custom-written quality metrics for this paper:

http://www.nature.com/nmeth/journal/v10/n7/full/nmeth.2483.html

I would not trust citations as a metric at all, because people do not cite software used for quality assessment, though they should. Most people only cite software that is used to produce a result in the paper. I know that Picard's CollectRnaSeqMetrics is run on most RNA datasets that come off the Broad sequencers, but I doubt the users cite it much. They just use the quality metrics to know whether the data is good or not.

written 4.5 years ago by Michele Busby (1.9k)

"We used RNA Seq and some custom-written quality metrics for this paper:"

Did you mean you used RNA-SeQC?

modified 8 months ago • written 8 months ago by olechnwin (0)
4.7 years ago by Malachi Griffith (17k), Washington University School of Medicine, St. Louis, USA:

Picard has a module 'CollectRnaSeqMetrics' that is relevant.  Also, not specific to RNA-seq data, but other more generic options that can be useful include: FastQC, BAMstats, SAMstat, samtools flagstat, etc.
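
To use the CollectRnaSeqMetrics output in a script, a minimal Python sketch might pull the metrics row out of the report. It assumes the usual Picard metrics layout (a `## METRICS CLASS` line followed by a tab-separated header line and a value line). The function name `parse_picard_metrics` and the example numbers are made up for illustration, and the column names should be checked against your own output file.

```python
import io


def parse_picard_metrics(handle):
    """Return the first metrics row of a Picard-style metrics file as a dict.

    Assumes the header line directly follows the '## METRICS CLASS' line
    and the value line directly follows the header.
    """
    lines = [line.rstrip("\n") for line in handle]
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            header = lines[i + 1].split("\t")
            values = lines[i + 2].split("\t")
            return dict(zip(header, values))
    raise ValueError("no '## METRICS CLASS' section found")


# Made-up example content; real files contain many more columns.
example = (
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PCT_RIBOSOMAL_BASES\tPCT_CODING_BASES\tPCT_INTRONIC_BASES\t"
    "MEDIAN_5PRIME_TO_3PRIME_BIAS\n"
    "0.012\t0.61\t0.08\t0.93\n"
)

metrics = parse_picard_metrics(io.StringIO(example))
print(metrics["PCT_RIBOSOMAL_BASES"], metrics["MEDIAN_5PRIME_TO_3PRIME_BIAS"])
```

Values come back as strings, so convert them with `float()` before comparing against thresholds.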

We also produce a variety of custom metrics from the 'junctions.bed' that you get along with TopHat alignments.  We have found that evaluating the degree to which your RNA-seq library represents the breadth of known exon-exon junctions across many loci can be an indicator of overall RNA-seq data quality. 

For example, you might ask the question, for how many genes do we observe reads supporting expression of at least 75% of the known exon-exon junctions of transcripts annotated for that locus?
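
The question above could be sketched in Python roughly as follows. This is only an illustration, not the custom code referred to in this answer: it assumes junctions have already been parsed into `(chrom, start, end)` tuples (for instance from 'junctions.bed' and from an annotation), and the function name, the toy data, and the handling of single-exon genes are all assumptions.

```python
def genes_with_junction_support(known, observed, min_fraction=0.75):
    """List genes for which at least min_fraction of known junctions are observed.

    known:    dict mapping gene name -> set of (chrom, start, end) junctions
    observed: set of (chrom, start, end) junctions supported by reads
    """
    supported = []
    for gene, junctions in known.items():
        if not junctions:
            continue  # single-exon genes have no junctions to evaluate
        fraction = len(junctions & observed) / len(junctions)
        if fraction >= min_fraction:
            supported.append(gene)
    return sorted(supported)


# Toy data: geneA has 3 of 4 known junctions observed, geneB has 1 of 2.
known = {
    "geneA": {("chr1", 100, 200), ("chr1", 300, 400),
              ("chr1", 500, 600), ("chr1", 700, 800)},
    "geneB": {("chr2", 10, 50), ("chr2", 80, 120)},
    "geneC": set(),
}
observed = {("chr1", 100, 200), ("chr1", 300, 400),
            ("chr1", 500, 600), ("chr2", 10, 50)}

result = genes_with_junction_support(known, observed)
print(result)  # ['geneA']
```

The number of genes passing the threshold (here `len(result)`) can then be compared across libraries as a rough indicator of how broadly each one covers known junctions.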

modified 4.7 years ago • written 4.7 years ago by Malachi Griffith (17k)

This paper discusses some additional metrics: Quality Control of RNA-Seq Experiments.

written 4.0 years ago by Malachi Griffith (17k)
2.7 years ago by Michael Dondrup (45k), Bergen, Norway:

There is a problem with creating a single report per file: this approach doesn't scale. Instead, I would prefer tools that tabulate quality metrics for many samples and files, making it easy to get a quick overview of the results in batches. Also, the alignment statistics are often at least as important a quality metric as the read qualities.
Another requirement could be that most of the data is accessed remotely, and the QC tool would optimally work headless.
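
As a rough illustration of the tabulation idea, the following Python sketch writes one row per sample and one column per metric, and runs without any display. The function name `tabulate_metrics`, the metric names, and the numbers are made up; in practice the per-sample dictionaries would come from parsing each tool's report or log.

```python
import csv
import io


def tabulate_metrics(per_sample, out):
    """Write a tab-separated table: one row per sample, one column per metric.

    per_sample: dict mapping sample name -> dict of metric name -> value
    Metrics missing for a sample are written as 'NA'.
    """
    columns = sorted({name for metrics in per_sample.values() for name in metrics})
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"] + columns)
    for sample in sorted(per_sample):
        row = [per_sample[sample].get(column, "NA") for column in columns]
        writer.writerow([sample] + row)


# Made-up values for two samples; sampleA lacks the rRNA metric.
per_sample = {
    "sampleB": {"pct_mapped": 88.9, "pct_rrna": 4.5},
    "sampleA": {"pct_mapped": 93.1},
}

buffer = io.StringIO()
tabulate_metrics(per_sample, buffer)
table = buffer.getvalue()
print(table, end="")
```

Writing to a file handle instead of `io.StringIO` gives a table that can be sorted or filtered to spot outlier samples.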

These tools were recommended recently here:

This is a perfect application for MultiQC by Phil Ewels.

I will try MultiQC. It can also summarize alignment statistics, which is a very useful feature, and it has a nice web page with introductory screencasts and documentation.

  • The installation via pip was very smooth (using local install option)
  • The first test run using STAR logs went OK. One small problem: no reports were found at first, because the expected default file names are built in, but they can be configured.
  • Check: http://multiqc.info/docs/#configuring-multiqc in case your pipeline renames reports.

AfterQC is another great QC tool for fastq.

modified 2.7 years ago • written 2.7 years ago by Michael Dondrup (45k)

Wow, multiqc is amazing. It really worked on the first try on a directory with hundreds of log files.

written 15 months ago by ATpoint (13k)

Yes. Another thumbs up for multiqc

written 8 months ago by olechnwin (0)
3.4 years ago by madk00k (330), Heidelberg:

A new version of the open-source Qualimap tool adds features specific to RNA-seq quality control. Most importantly, it now supports multi-sample analysis, which makes it possible to detect outliers. Here is a link to the publication, which includes a detailed comparison of Qualimap 2 with RSeQC and RNA-SeQC:

Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data

modified 3.4 years ago • written 3.4 years ago by madk00k (330)