Question: Post alignment QC of RNA-seq data
2
14 months ago by
KVC_bioinfo350
Boston
KVC_bioinfo350 wrote:

Hello,

I have aligned RNA-seq data to the human genome, and I used FastQC for QC of the raw reads before alignment. Can I also use FastQC for post-alignment QC, or is there a better way of doing it?

Thank you in advance.

fastqc qc • 1.7k views
ADD COMMENTlink modified 14 months ago by Brian Bushnell16k • written 14 months ago by KVC_bioinfo350
3
14 months ago by
Friederike2.3k
United States
Friederike2.3k wrote:

RSeQC is used a lot, but I find QoRTs a bit more user-friendly, especially if you have numerous samples.

In addition, feeding the results of a) FastQC, b) STAR (or whatever aligner you've used), and c) featureCounts to MultiQC is already quite useful.

The typical things you want to look out for:

  • at least 80% alignment rate
  • not too many intronic/intergenic reads
  • even gene body coverage
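
The first two checks in the list above can be sketched in a few lines of Python. This is only an illustration: the `SampleQC` record, its field names, and the 30% cut-off for intronic plus intergenic reads are assumptions made up for the example (the list only says "not too many"), and the read counts would come from whatever tool you used (e.g. the aligner log and a read-distribution report). Gene body coverage is not covered here.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample summary; field names are illustrative only."""
    name: str
    total_reads: int
    aligned_reads: int
    intronic_reads: int
    intergenic_reads: int


def qc_flags(sample, min_alignment_rate=0.80, max_non_exonic_fraction=0.30):
    """Return a list of human-readable warnings for one sample.

    min_alignment_rate follows the 80% rule of thumb from the list above.
    max_non_exonic_fraction is an assumed placeholder, not a published
    threshold; adjust it to your experiment.
    """
    if sample.total_reads == 0:
        return ["no reads"]

    flags = []
    alignment_rate = sample.aligned_reads / sample.total_reads
    if alignment_rate < min_alignment_rate:
        flags.append(f"low alignment rate ({alignment_rate:.1%})")

    if sample.aligned_reads > 0:
        non_exonic = (sample.intronic_reads + sample.intergenic_reads) / sample.aligned_reads
        if non_exonic > max_non_exonic_fraction:
            flags.append(f"many intronic/intergenic reads ({non_exonic:.1%})")

    return flags


if __name__ == "__main__":
    # Made-up numbers, purely for demonstration.
    sample = SampleQC("sample_A", 1_000_000, 700_000, 300_000, 100_000)
    for flag in qc_flags(sample):
        print(sample.name, flag)
```

Keeping the thresholds as function parameters makes it easy to relax them for experiments where, for example, many unannotated transcripts are expected.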
ADD COMMENTlink written 14 months ago by Friederike2.3k

Could you suggest a publication where such parameters are described, or where the QC of RNA-seq (pre- and post-analysis) is described?

ADD REPLYlink written 14 months ago by KVC_bioinfo350
2

I would expect the QoRTs and RSeQC papers to contain a discussion of that.

The classic resource for basic RNA-seq measures is the ENCODE recommendation, although it's a bit dated by now. There may be more up-to-date guidelines on their website; I haven't checked in a while.

An interesting read regarding the importance of which annotation you choose is Zhao & Zhang (2015) BMC Genomics. Regarding gene body coverage, I believe Lahens et al. (2014) Genome Biology 15:R86 discussed that nicely.

If you really want to learn about all the details of RNA-seq pre-processing, you may want to have a look at the notes that I compiled for a class I teach. They run to more than 80 pages and are fairly detailed, especially with regard to raw and aligned read handling. The GitHub repo is here: https://github.com/friedue/course_RNA-seq2017

ADD REPLYlink written 14 months ago by Friederike2.3k

Hello,

Thank you very much. The introduction to RNA-seq on your GitHub page is really very helpful.

ADD REPLYlink written 13 months ago by KVC_bioinfo350

I should add that asking for definite thresholds is not completely straightforward. Whether an experiment failed or has acceptable results will depend on the specific circumstances of your experiment and the biological question that you want to address. For example, if you specifically expect many unannotated transcripts to be present in your sample, then increased numbers of intergenic and/or intronic reads may not be worrisome, but expected.

ADD REPLYlink written 14 months ago by Friederike2.3k
1
14 months ago by
Kevin Blighe32k
Republic of Ireland
Kevin Blighe32k wrote:

FastQC is primarily a pre-alignment tool and is usually run on FASTQ files. After you perform your alignment, you should have a SAM or BAM file, which is not the usual input for FastQC.

The most common post-alignment quality control metric is the number of reads that aligned to the reference genome. The command samtools flagstat reports this.
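
To make the metric concrete, the fraction of aligned reads can be derived from the FLAG field of SAM records (bit 0x4 marks an unmapped read). The following is a minimal sketch, assuming uncompressed SAM text as input; the function name `mapped_fraction` is made up for the example. It counts primary records only (secondary, 0x100, and supplementary, 0x800, records are skipped), so its result may differ from tools that also count those alignments. For real BAM files, a dedicated tool is the more practical choice.

```python
def mapped_fraction(sam_lines):
    """Fraction of primary SAM records that are mapped.

    sam_lines: any iterable of SAM text lines (header lines are ignored).
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary and supplementary alignments
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0


if __name__ == "__main__":
    # Tiny made-up example: two mapped reads, one unmapped, one secondary.
    records = [
        "@HD\tVN:1.6",
        "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
        "r3\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
        "r1\t256\tchr2\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(f"{mapped_fraction(records):.1%}")
```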

ADD COMMENTlink modified 7 months ago • written 14 months ago by Kevin Blighe32k

I read in the FastQC manual that it also accepts SAM and BAM input.

ADD REPLYlink written 14 months ago by KVC_bioinfo350

Yes, because some platforms produce these as unaligned files.

ADD REPLYlink written 14 months ago by Kevin Blighe32k

I am using the STAR aligner.

ADD REPLYlink written 14 months ago by KVC_bioinfo350
0
14 months ago by
Ron820
United States
Ron820 wrote:

After alignment, you can use RSeQC: http://rseqc.sourceforge.net/

It has very good modules for working with BAM files.

Also, if you use STAR for alignment, check the Log.final.out file for QC metrics.
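
If you have many samples, reading each STAR log by eye gets tedious. Below is a minimal sketch of a parser, assuming the usual layout of Log.final.out in which each metric is written as `name |<tab>value`. The function names and the example text (with made-up numbers) are illustrative and not part of STAR; MultiQC does the same job more thoroughly.

```python
def parse_star_log(text):
    """Parse the text of a STAR Log.final.out file into a dict of strings.

    Lines without a '|' separator (section headers such as 'UNIQUE READS:')
    are ignored.
    """
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        key, _, value = line.partition("|")
        key, value = key.strip(), value.strip()
        if key and value:
            metrics[key] = value
    return metrics


def percent(metrics, key):
    """Convert a value such as '85.23%' to the float 85.23."""
    return float(metrics[key].rstrip("%"))


if __name__ == "__main__":
    # Shortened example with made-up numbers, mimicking the log layout.
    example = (
        "                          Number of input reads |\t1000000\n"
        "                                  UNIQUE READS:\n"
        "                        Uniquely mapped reads % |\t85.23%\n"
        "             % of reads mapped to multiple loci |\t5.10%\n"
    )
    m = parse_star_log(example)
    print(m["Number of input reads"], percent(m, "Uniquely mapped reads %"))
```

For a real file, read it first, e.g. `parse_star_log(open("Log.final.out").read())`, adjusting the path to your STAR output prefix.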

ADD COMMENTlink modified 14 months ago • written 14 months ago by Ron820

Yes, I used the STAR aligner. It reported all the statistics in the Log.final.out file. What is the next QC step I should perform?

ADD REPLYlink written 14 months ago by KVC_bioinfo350
  1. look at the STAR statistics
  2. run QoRTs or RSeQC
ADD REPLYlink written 14 months ago by Friederike2.3k