I am using the TopHat and Cufflinks pipeline to do RNA-seq analysis on my data. I am new to RNA-seq. I suspect there might be a batch effect in my RNA-seq data, but I am not sure how to detect and correct for it. Do you have any ideas?
Firstly, visualise your data with a PCA - this will give you clues as to where your primary variation is coming from, and whether a batch effect is clearly present. Try one of the following methodologies:
StringTie and DESeq2
StringTie is an alternative to Cufflinks and has a mode, and a script called prepDE.py, that will prepare StringTie quantifications for analysis in DESeq2 (another popular analysis framework that can deal with batch effects). See the Quick Start section of the DESeq2 vignette for a note on batch effects.
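To make "deal with batch effects" concrete: DESeq2 does this by including batch in the design (its Quick Start uses design = ~ batch + condition). The sketch below is not DESeq2 and does not use its negative binomial model. It is a toy illustration in plain Python, using ordinary least squares on made-up log2 values for a single gene, of why adding batch to the model changes the estimated condition effect. All sample values and names are assumptions invented for the example.

```python
# Synthetic log2 expression for ONE gene; values are made up for illustration.
# Each tuple is (batch, condition, expression). The design is unbalanced:
# batch b1 is mostly control, batch b2 is mostly treated.
samples = [
    ("b1", "ctrl", 5.0), ("b1", "ctrl", 5.0), ("b1", "ctrl", 5.0), ("b1", "treat", 6.0),
    ("b2", "ctrl", 8.0), ("b2", "treat", 9.0), ("b2", "treat", 9.0), ("b2", "treat", 9.0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Naive estimate: treated mean minus control mean, ignoring batch.
naive = (mean(y for _, c, y in samples if c == "treat")
         - mean(y for _, c, y in samples if c == "ctrl"))

# Batch-adjusted estimate: centre both the condition indicator and the
# expression within each batch, then regress one on the other. This gives
# the same condition coefficient as least squares on ~ batch + condition.
x_res, y_res = [], []
for b in sorted({b for b, _, _ in samples}):
    xs = [1.0 if c == "treat" else 0.0 for bb, c, _ in samples if bb == b]
    ys = [y for bb, _, y in samples if bb == b]
    x_res += [x - mean(xs) for x in xs]
    y_res += [y - mean(ys) for y in ys]
adjusted = sum(x * y for x, y in zip(x_res, y_res)) / sum(x * x for x in x_res)

print(f"naive log2 fold change:    {naive:.2f}")
print(f"adjusted log2 fold change: {adjusted:.2f}")
```

In this toy data the treatment effect was built in as 1.0 and the batch shift as 3.0. The naive comparison reports 2.5 because most treated samples sit in the higher batch, while the batch-adjusted estimate recovers 1.0. Note that this only works when batch and condition are not completely confounded: if every treated sample were in one batch and every control in the other, no model could separate the two.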
I thought it was only TopHat, and not Cufflinks/Cuffdiff, that was deprecated? Cole Trapnell (the first author of Cufflinks) still uses it frequently. You can simply swap TopHat for HISAT.
That's fair. Cufflinks (Faux Tiling) and StringTie (Network Flow) use different methodologies, so you're right that TopHat can be substituted with HISAT2 or STAR. Bit of an oversight on my part.
Batch effects are typically found via MDS or PCA plots and (hierarchical) clustering, where samples cluster differently than you would expect.
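As a rough illustration of what such a PCA check looks for, here is a minimal sketch in plain Python. The expression values, sample names and the helper first_pc_scores are all made up for the example (genes 0-3 carry a batch shift, genes 4-5 carry the treatment effect), and the first principal component is found by simple power iteration. On real data you would run this on normalised, log-transformed expression for all genes with a proper PCA routine (e.g. prcomp in R, or plotPCA in DESeq2) rather than this toy code.

```python
import math

# Synthetic log2 expression values (rows = samples, columns = genes); not real data.
# Genes 0-3 carry a batch shift, genes 4-5 carry the treatment effect.
expr = {
    "ctrl_batch1":  [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    "treat_batch1": [5.0, 5.0, 5.0, 5.0, 6.0, 6.0],
    "ctrl_batch2":  [7.0, 7.0, 7.0, 7.0, 5.0, 5.0],
    "treat_batch2": [7.0, 7.0, 7.0, 7.0, 6.0, 6.0],
}

def first_pc_scores(rows, n_iter=100):
    """Return each sample's score on PC1, found by power iteration."""
    n_genes = len(rows[0])
    # Centre every gene (column) at zero.
    means = [sum(r[g] for r in rows) / len(rows) for g in range(n_genes)]
    centred = [[r[g] - means[g] for g in range(n_genes)] for r in rows]
    v = [1.0] * n_genes  # starting guess for the PC1 loadings
    for _ in range(n_iter):
        # Multiply by X, then by X transposed, i.e. apply the gene covariance
        # matrix (up to a constant) without ever building it.
        scores = [sum(x * w for x, w in zip(r, v)) for r in centred]
        v = [sum(s * r[g] for s, r in zip(scores, centred)) for g in range(n_genes)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(r, v)) for r in centred]

names = list(expr)
pc1 = dict(zip(names, first_pc_scores([expr[n] for n in names])))
for name, score in pc1.items():
    print(f"{name}\t{score:+.2f}")
```

Here the two batch1 samples land on one side of PC1 and the two batch2 samples on the other, regardless of treatment. That is the pattern to look for: if the main axis of variation separates samples by processing batch (library prep date, sequencing lane, and so on) rather than by the biological condition you expected, a batch effect is likely.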
Batch effects cannot (to my knowledge) be corrected for within Cufflinks/Cuffdiff, so you would need to re-quantify using the merged GTF file with tools such as Kallisto or Salmon (I have written about RNA-seq quantification choices, including all appropriate links, here) and then do your analysis with another DE tool such as DESeq2 or edgeR, as this tutorial describes.
Please note:
If you use Cufflinks/Cuffdiff, you need to have a very good reason (see the quantification discussed here, again).
Isoform-level analysis, such as the one you have with Cufflinks, also allows you to analyse e.g. isoform switches - something my R package IsoformSwitchAnalyzeR can help you with. You can find examples of what type of analysis you can do in this section of the vignette.
See the comment from the TopHat author.