Question: Post alignment QC of RNA-seq data
1
10 months ago by
KVC_bioinfo310
WA, USA
KVC_bioinfo310 wrote:

Hello,

I have aligned my RNA-seq data to the human genome, and I used FastQC for QC of the reads before alignment. Can I also use FastQC for post-alignment QC, or is there a better way of doing it?

Thank you in advance.

fastqc qc • 1.1k views
ADD COMMENTlink modified 10 months ago by Brian Bushnell15k • written 10 months ago by KVC_bioinfo310
3
10 months ago by
Friederike1.8k
United States
Friederike1.8k wrote:

RSeQC is used a lot, but I find QoRTs a bit more user-friendly, especially if you have numerous samples.

In addition, feeding the results of a) FastQC, b) STAR (or whatever aligner you've used), and c) featureCounts to MultiQC is already quite useful.

The typical things you want to look out for:

  • at least 80% alignment rate
  • not too many intronic/intergenic reads
  • even gene body coverage
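
The first two checks in the list above can be sketched as a small Python helper. This is only an illustration: the 80% alignment cutoff comes from the rule of thumb above, while the cutoff for intronic/intergenic reads (`max_non_exonic_fraction`) and all function and field names are hypothetical and should be chosen per experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QcThresholds:
    min_alignment_rate: float = 0.80       # rule of thumb quoted above
    max_non_exonic_fraction: float = 0.30  # hypothetical value, adjust per experiment


def qc_warnings(total_reads, aligned_reads, intronic_reads, intergenic_reads,
                thresholds=QcThresholds()):
    """Return a list of human-readable warnings for one sample."""
    if total_reads <= 0 or aligned_reads <= 0:
        raise ValueError("read counts must be positive")

    warnings = []
    alignment_rate = aligned_reads / total_reads
    # Fraction of aligned reads that fall outside annotated exons.
    non_exonic = (intronic_reads + intergenic_reads) / aligned_reads

    if alignment_rate < thresholds.min_alignment_rate:
        warnings.append(f"low alignment rate: {alignment_rate:.1%}")
    if non_exonic > thresholds.max_non_exonic_fraction:
        warnings.append(f"many intronic/intergenic reads: {non_exonic:.1%}")
    return warnings


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(qc_warnings(total_reads=1000, aligned_reads=700,
                      intronic_reads=300, intergenic_reads=100))
```

The thresholds are passed in as an object so that they can be relaxed for experiments where, for example, many unannotated transcripts are expected.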
ADD COMMENTlink written 10 months ago by Friederike1.8k

Could you suggest a publication where such parameters are described, or one that describes the QC of RNA-seq data (pre- and post-analysis)?

ADD REPLYlink written 10 months ago by KVC_bioinfo310
2

I would expect that the papers of QoRTs and RSeQC may contain a discussion of that.

The classic resource for basic RNA-seq measures is the ENCODE recommendations, although they're a bit dated by now. There may be more up-to-date guidelines on their website; I haven't checked in a while.

An interesting read regarding the importance of which annotation you choose is Zhao & Zhang (2015) BMC Genomics. Regarding gene body coverage, I believe Lahens et al. (2014) Genome Biology 15:R86 discussed that nicely.

If you really want to learn about all the details of RNA-seq pre-processing, you may want to have a look at the notes that I compiled for a class I teach. It's more than 80 pages and fairly detailed especially in regard to raw and aligned read handling. The github repo is this: https://github.com/friedue/course_RNA-seq2017

ADD REPLYlink written 10 months ago by Friederike1.8k

Hello,

Thank you very much. The introduction to RNA-seq on your GitHub page is really very helpful.

ADD REPLYlink written 9 months ago by KVC_bioinfo310

I should add that asking for definite thresholds is not completely straightforward. Whether an experiment failed or has acceptable results will depend on the specific circumstances of your experiment and the biological question that you want to address. For example, if you specifically expect many unannotated transcripts to be present in your sample, then increased numbers of intergenic and/or intronic reads may not be worrisome, but expected.

ADD REPLYlink written 10 months ago by Friederike1.8k
1
10 months ago by
Kevin Blighe24k
Republic of Ireland
Kevin Blighe24k wrote:

FastQC is primarily a pre-alignment tool, and its usual input is FASTQ files. After you perform your alignment, you should have a SAM or BAM file, which is not the typical input for FastQC.

The most common post-alignment quality control metric is the proportion of your reads that aligned to the reference genome. The command samtools flagstat reports this.
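
One way to get at the proportion of aligned reads is to parse the text printed by `samtools flagstat`. The sketch below assumes the usual `N + M label (…)` line layout of that output; the example text is invented, and the function names are hypothetical. Note that the "mapped" line counts alignment records, so secondary and supplementary alignments can inflate it for multi-mapping reads.

```python
import re

# Matches lines such as "950 + 0 mapped (95.00% : N/A)".
_FLAGSTAT_LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.splitlines():
        match = _FLAGSTAT_LINE.match(line.strip())
        if not match:
            continue
        passed, _failed, label = match.groups()
        # Drop the trailing parenthetical: "mapped (95.00% : N/A)" -> "mapped".
        name = label.split(" (")[0]
        # Keep the first occurrence if two labels collapse to the same name.
        counts.setdefault(name, int(passed))
    return counts


def mapped_fraction(counts):
    """Fraction of records flagged as mapped, or 0.0 for an empty file."""
    total = counts["in total"]
    return counts["mapped"] / total if total else 0.0


# Invented example output, for illustration only.
EXAMPLE_FLAGSTAT = """\
1000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
950 + 0 mapped (95.00% : N/A)
"""

if __name__ == "__main__":
    print(f"{mapped_fraction(parse_flagstat(EXAMPLE_FLAGSTAT)):.1%}")
```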

ADD COMMENTlink modified 3 months ago • written 10 months ago by Kevin Blighe24k

I read in the FastQC manual that it also accepts SAM and BAM input.

ADD REPLYlink written 10 months ago by KVC_bioinfo310

Yes, because some platforms produce these as unaligned files.

ADD REPLYlink written 10 months ago by Kevin Blighe24k

I am using the STAR aligner.

ADD REPLYlink written 10 months ago by KVC_bioinfo310
0
10 months ago by
Ron790
United States
Ron790 wrote:

Post-alignment, you can use RSeQC: http://rseqc.sourceforge.net/

It has very good modules for working with BAM files.

Also, if you use STAR for alignment, have a look at the Log.final.out file for QC metrics.
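
If you have many samples, the STAR summary can be read programmatically rather than by eye. The sketch below assumes the `key | value` layout of STAR's Log.final.out; the example text and its numbers are invented, and the function names are hypothetical.

```python
def parse_star_final_log(text):
    """Return the 'key | value' pairs of a STAR Log.final.out as a dict of strings."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Convert a value such as '90.00%' to the float 90.0."""
    return float(stats[key].rstrip("%"))


# Invented example content, for illustration only.
EXAMPLE_LOG = (
    "                          Number of input reads |\t1000000\n"
    "                                    UNIQUE READS:\n"
    "                   Uniquely mapped reads number |\t900000\n"
    "                        Uniquely mapped reads % |\t90.00%\n"
    "             % of reads mapped to multiple loci |\t5.00%\n"
    "                 % of reads unmapped: too short |\t4.00%\n"
)

if __name__ == "__main__":
    stats = parse_star_final_log(EXAMPLE_LOG)
    print(percent(stats, "Uniquely mapped reads %"))
```

Values are kept as strings in the dictionary because the file mixes counts, percentages, and timestamps; conversion happens only for the fields you ask for.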

ADD COMMENTlink modified 10 months ago • written 10 months ago by Ron790

Yes, I used the STAR aligner. It gave all the statistics in the Log.final.out file. What is the next QC step I should perform?

ADD REPLYlink written 10 months ago by KVC_bioinfo310
  1. look at the STAR statistics
  2. run QoRTs or RSeQC
ADD REPLYlink written 10 months ago by Friederike1.8k