Is it necessary to do FastQC before using RSEM?
4.6 years ago
John ▴ 270

Hi

When I read papers in high-profile journals, some of them ran FastQC, TrimGalore, and Trinity to preprocess the FASTQ files of RNA-seq reads, and some of them didn't.

Is preprocessing really necessary, or can we use raw FASTQ files in RSEM?

Thank you in advance

RNA-Seq R alignment • 1.5k views

I ran into a similar problem a few weeks ago. As an experiment, I picked a few FASTQ pairs with moderately high adapter content and ran them through my pipeline, which uses RSEM+STAR, both before and after adapter trimming, then evaluated the results using DESeq2. The differences between the trimmed and untrimmed runs were not statistically significant at all. However, my pipeline also has a k-mer-based host/graft read separation step before RSEM, so the results may not be 100% indicative of how RSEM/STAR alone compensate for adapters.

It really depends on the dataset. I have experimented with read trimming before STAR. Most of the time, the impact is very minor, but I have also seen instances where gene counts are substantially different.
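
One way to check this on your own data is to compare per-gene counts from the two runs directly. Below is a minimal sketch in Python, assuming you have already loaded per-gene counts (for example the `expected_count` column of RSEM's `.genes.results` files) into two dictionaries. The function name `compare_counts`, the pseudocount, the threshold, and the example values are all made up for illustration; this is not a substitute for a proper differential expression test.

```python
import math

def compare_counts(untrimmed, trimmed, pseudocount=1.0, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes whose counts differ
    by at least min_abs_log2fc between the two runs."""
    changed = {}
    # Only genes present in both runs are compared.
    for gene in sorted(set(untrimmed) & set(trimmed)):
        a = untrimmed[gene] + pseudocount  # pseudocount avoids division by zero
        b = trimmed[gene] + pseudocount
        log2fc = math.log2(b / a)
        if abs(log2fc) >= min_abs_log2fc:
            changed[gene] = log2fc
    return changed

# Hypothetical counts for three genes (illustrative values only).
untrimmed = {"geneA": 100.0, "geneB": 50.0, "geneC": 7.0}
trimmed = {"geneA": 104.0, "geneB": 49.0, "geneC": 31.0}

changed = compare_counts(untrimmed, trimmed)
print(changed)  # only geneC passes the threshold here
```

If only a handful of low-count genes show up, trimming probably made little difference for that dataset. A long list suggests that trimming matters for it.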

4.6 years ago
GenoMax 141k

FastQC is a quality assessment program. It does not change the data in any way, so in a strict sense it is not necessary to run it. That said, FastQC provides a bird's-eye view of your data and can alert you to possible issues (e.g. presence of adapter dimers, duplication in your data, etc.). Take the results of FastQC in the context of your experiment, though. Failing a category in FastQC does not automatically flag your data as bad.

You will find this series of blog posts from the FastQC authors of interest as you review FastQC results.
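
If you have many samples, it can help to list only the modules that FastQC did not mark as PASS and then judge each one in context. Here is a rough sketch in Python, assuming the tab-separated `summary.txt` layout that FastQC writes into its output folder (status, module name, file name). The function name `flagged_modules` and the example file name are hypothetical.

```python
def flagged_modules(summary_text):
    """Return (status, module) pairs for every module not marked PASS."""
    flagged = []
    for line in summary_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Expected columns: status, module name, file name
        status, module, _filename = line.split("\t", 2)
        if status != "PASS":
            flagged.append((status, module))
    return flagged

# Made-up summary for a single file (illustrative only).
example = (
    "PASS\tBasic Statistics\tsample_R1.fastq.gz\n"
    "WARN\tSequence Duplication Levels\tsample_R1.fastq.gz\n"
    "FAIL\tAdapter Content\tsample_R1.fastq.gz\n"
)

flagged = flagged_modules(example)
for status, module in flagged:
    print(status, module)
```

A WARN or FAIL here is a prompt to look closer, not a verdict. For example, duplication warnings are common in RNA-seq because highly expressed transcripts produce many identical reads.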

The same goes for TrimGalore or a similar trimming program. Most aligners will handle the presence of some adapter contamination and will drop it from the alignments. If you are going to do any de novo analysis, then it is imperative that you clean your data of extraneous sequence.
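
To get a rough feel for how much adapter is in a file before deciding whether to trim, you could count the reads that contain the start of the adapter. The sketch below is in Python and assumes standard 4-line FASTQ records and the common Illumina adapter prefix `AGATCGGAAGAGC`. It only finds exact, full-length matches, so it will underestimate compared with a real trimmer, which also catches partial and mismatched adapters. The function name `adapter_fraction` and the example reads are made up.

```python
def adapter_fraction(fastq_lines, adapter="AGATCGGAAGAGC"):
    """Fraction of reads whose sequence contains the adapter string.

    Assumes unwrapped FASTQ: every record is exactly four lines.
    """
    total = 0
    with_adapter = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record is the sequence
            total += 1
            if adapter in line.strip().upper():
                with_adapter += 1
    return with_adapter / total if total else 0.0

# Two made-up reads: the first one runs into the adapter, the second does not.
example = [
    "@read1", "ACGTACGTAGATCGGAAGAGCACACG", "+", "I" * 26,
    "@read2", "ACGTACGTACGTACGTACGTACGTAC", "+", "I" * 26,
]

fraction = adapter_fraction(example)
print(f"{fraction:.2f}")
```

For a real file you could pass an open file handle instead of a list, e.g. one from `gzip.open(path, "rt")` for a compressed FASTQ.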

Trinity is only needed if you wish to de novo assemble your data set. If you have a genome/transcriptome available then you don't need to go that route at all.

Thanks, your answer is informative.

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.
