RNA-Seq pipeline for clustering
7.4 years ago
lu.ne ▴ 70

Hi All,

I am currently performing RNA-Seq data analysis with the aim of clustering individuals. So far I have been using this pipeline to obtain expression values (using the GRCh38.86 reference and GTF files):

- cutadapt
- STAR
- cufflinks, using this command: cufflinks -p 24 -o output_folder_path -g path/Homo_sapiens.GRCh38.86.gtf path/Aligned.sortedByCoord.out.bam --library-type fr-firststrand
- cuffnorm

It seems to be working fine, but the cufflinks step takes a really long time, so I am looking for faster alternatives. I came across several tools (such as StringTie) but did not find much information about the normalisation step.

Any suggestions would be greatly appreciated.

RNA-Seq • 1.7k views
7.4 years ago

I would suggest that you ditch both cufflinks and StringTie and instead use featureCounts. Alternatively, you could use Salmon and skip the alignment step as well, if you don't care much about the exact alignments.

In general, only use things like StringTie (never use cufflinks; StringTie is meant to replace it) if you are explicitly interested in finding novel isoforms.

BTW, cuffnorm is used for StringTie too.
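On the normalisation step asked about in the question: featureCounts reports raw read counts per gene, so the counts need to be scaled for sequencing depth before individuals are clustered. The following is only a minimal sketch of one simple option, log2 counts-per-million, written in plain Python. The `log_cpm` helper, the gene names and the toy counts are all hypothetical; in practice a dedicated package such as DESeq2 or edgeR would normally handle this.

```python
import math

def log_cpm(counts, pseudocount=1.0):
    """Convert raw counts to log2 counts-per-million (hypothetical helper).

    counts: dict mapping gene -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counted reads in each sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [
            math.log2(row[j] / lib_sizes[j] * 1e6 + pseudocount)
            for j in range(n_samples)
        ]
        for gene, row in counts.items()
    }

# Toy data (made up): sample 2 is sample 1 sequenced twice as deep,
# sample 3 has a genuinely different expression profile.
raw = {
    "geneX": [10, 20, 5],
    "geneY": [30, 60, 80],
    "geneZ": [60, 120, 15],
}
norm = log_cpm(raw)
for gene, values in norm.items():
    print(gene, [round(v, 2) for v in values])
```

After scaling, samples 1 and 2 get the same values even though their raw counts differ by a factor of two, which is the point of normalising before computing distances between individuals.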


It seems to be working just fine with featureCounts, thanks!

7.4 years ago
John Ma ▴ 310

Use Salmon, then import the quantifications into R with tximport, then apply varianceStabilizingTransformation in DESeq2. There's no problem in using TPM for correlation, but I find the VST-transformed values look better.
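As a rough illustration of the TPM-correlation idea, here is a small sketch that log-transforms TPM values and computes pairwise Pearson correlations between individuals. It is plain Python rather than the R workflow described above, and the sample names, the TPM values and the `pearson` and `correlation_matrix` helpers are all made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(tpm):
    """tpm: dict mapping sample -> list of TPM values in the same gene order."""
    # log2(TPM + 1) damps the influence of a few very highly expressed genes.
    logged = {s: [math.log2(v + 1) for v in vals] for s, vals in tpm.items()}
    names = list(logged)
    return {a: {b: pearson(logged[a], logged[b]) for b in names} for a in names}

# Toy values (made up; a real TPM column sums to one million).
tpm = {
    "ind1": [120.0, 5.0, 300.0, 0.0],
    "ind2": [110.0, 7.0, 280.0, 1.0],
    "ind3": [3.0, 250.0, 10.0, 90.0],
}
corr = correlation_matrix(tpm)

# For each individual, report the most correlated other individual.
closest = {
    a: max((b for b in corr if b != a), key=lambda b: corr[a][b])
    for a in corr
}
print(closest)
```

A matrix like `corr` (or 1 minus it, as a distance) is what a hierarchical clustering or heatmap would then be built from.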
