Question: RNA-seq quantification Stringtie, featurecounts and HTSeq - am I correct?
5 months ago by k.kathirvel93 (India)
k.kathirvel93 wrote:

Hi everyone, I am new to RNA-seq data analysis and am trying to compare different quantifiers: StringTie, featureCounts and HTSeq. I have some questions and would be really grateful if someone could help.

Questions:

  1. I have removed the genes that have <99 read counts. Is that OK, or should I use a cutoff of 9 instead?

  2. After removing the genes with <99 read counts, I got 11802 genes from featureCounts, 11305 from HTSeq and 16502 from StringTie. (Note: for StringTie, I used prepDE.py to convert FPKM values to gene read counts.) Why did StringTie give more genes? Are the StringTie results genes or transcripts?

I used the Ensembl GRCh38 FASTA and GTF files.

Tags: rna-seq, next-gen, genome, gene

Personally, I feel that removing all genes with <99 reads is a bit stringent, but there is no rule of thumb for this.

Can you post the exact cmdlines you're executing so we can see if there is something different in those?

written 5 months ago by lieven.sterck
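
One rough way to judge how much the cutoff matters is to count how many genes survive at each candidate threshold. The sketch below does that in Python. The gene names and counts are made-up toy values, and `genes_passing` and `threshold_table` are hypothetical helpers, not part of featureCounts, HTSeq or StringTie.

```python
from typing import Dict, Iterable


def genes_passing(counts: Dict[str, int], min_count: int) -> int:
    """Return how many genes have at least `min_count` reads."""
    return sum(1 for n in counts.values() if n >= min_count)


def threshold_table(counts: Dict[str, int], thresholds: Iterable[int]) -> Dict[int, int]:
    """Map each threshold to the number of genes that would be kept."""
    return {t: genes_passing(counts, t) for t in thresholds}


# Toy counts for illustration only (not real data).
toy_counts = {
    "GENE_A": 0,
    "GENE_B": 8,
    "GENE_C": 9,
    "GENE_D": 57,
    "GENE_E": 99,
    "GENE_F": 1500,
}

if __name__ == "__main__":
    for threshold, kept in threshold_table(toy_counts, [9, 99]).items():
        print(f"min count {threshold}: {kept} genes kept")
```

Running the same table on the real count output of each tool would show whether the gap between tools is specific to one cutoff or present at all of them.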

Here are the commands:

featureCounts -T 16 -p -g gene_name -a /home/kathirvel/Homo_sapiens.GRCh38.77.gtf -o /home/kathirvel/FeatureCounts/MAQC_Counts.csv /home/kathirvel/out.bam

htseq-count -i gene_name -m intersection-nonempty -f bam /home/kathirvel/out.bam /home/kathirvel/Homo_sapiens.GRCh38.77.gtf > /home/kathirvel/Counts.csv

stringtie -p 16 -e -G /home/kathirvel/Homo_sapiens.GRCh38.77.gtf -B -o /home/kathirvel/MAQC_ILM_BGI_A1_1_transcripts.gtf -A /home/kathirvel/gene_abundances.csv /home/kathirvel/out.bam
written 5 months ago by k.kathirvel93

cmds look OK at first sight.

Did you try 'evaluating' the mode used by htseq-count? This can influence the end result, and the counting rule might differ between the tools you used.

written 5 months ago by lieven.sterck
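
To illustrate why the mode matters, here is a simplified toy model of the three overlap modes described in the htseq-count documentation (`union`, `intersection-strict`, `intersection-nonempty`). It only implements the set rules; strand, mapping quality and multi-mapping reads are ignored. `assign_read` is a hypothetical helper written for this example and is not part of the HTSeq API.

```python
from typing import List, Set


def assign_read(position_sets: List[Set[str]], mode: str = "union") -> str:
    """Assign one read to a gene following the htseq-count overlap modes.

    `position_sets` holds, for each aligned position of the read, the set of
    genes overlapping that position (an empty set where no gene overlaps).
    """
    if mode == "union":
        # Every gene touched anywhere along the read.
        candidates = set().union(*position_sets)
    elif mode == "intersection-strict":
        # Genes covering every position of the read.
        candidates = set.intersection(*position_sets) if position_sets else set()
    elif mode == "intersection-nonempty":
        # Genes covering every position that overlaps at least one gene.
        non_empty = [s for s in position_sets if s]
        candidates = set.intersection(*non_empty) if non_empty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")

    if not candidates:
        return "__no_feature"
    if len(candidates) > 1:
        return "__ambiguous"
    return next(iter(candidates))


# Toy reads (hypothetical): each inner set is one aligned position.
read_overhanging = [{"A"}, {"A"}, set()]   # partly outside gene A
read_nested = [{"A"}, {"A", "B"}]          # gene B overlaps part of gene A
read_split = [{"A"}, {"B"}]                # spans two adjacent genes

if __name__ == "__main__":
    for mode in ("union", "intersection-strict", "intersection-nonempty"):
        results = [assign_read(r, mode) for r in (read_overhanging, read_nested, read_split)]
        print(mode, results)
```

The same read can therefore be counted, dropped as no-feature, or flagged as ambiguous depending on the mode, which is one reason gene totals need not agree between tools.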

StringTie performs reference-based transcriptome assembly, so most probably you also get counts for newly assembled genes/transcripts. You can use gffcompare to compare the original GTF file with the one produced by StringTie and see how many new transcripts it has assembled and counted.

written 5 months ago by grant.hovhannisyan
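
As a quick check alongside gffcompare, one could list the `gene_id` values that appear in the StringTie output GTF but not in the reference GTF. The sketch below assumes standard GTF attribute syntax (`gene_id "..."`). The helper names and the toy lines are hypothetical and only there for illustration.

```python
import re
from typing import Iterable, Set

# \b keeps the pattern from matching inside attributes such as ref_gene_id.
GENE_ID_RE = re.compile(r'\bgene_id "([^"]+)"')


def gene_ids(gtf_lines: Iterable[str]) -> Set[str]:
    """Collect the gene_id attribute from every non-comment GTF line."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        match = GENE_ID_RE.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def novel_gene_ids(reference_lines: Iterable[str], assembled_lines: Iterable[str]) -> Set[str]:
    """Return gene IDs present in the assembled GTF but absent from the reference."""
    return gene_ids(assembled_lines) - gene_ids(reference_lines)


# Toy GTF lines for illustration only (not real annotation).
toy_reference = [
    "#!toy header",
    '1\ttoy\tgene\t1\t100\t.\t+\t.\tgene_id "ENSG_TOY_1"; gene_name "TOY1";',
]
toy_assembled = [
    '1\tStringTie\ttranscript\t1\t100\t.\t+\t.\tgene_id "ENSG_TOY_1"; transcript_id "T1";',
    '1\tStringTie\ttranscript\t500\t900\t.\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1";',
]

if __name__ == "__main__":
    print(sorted(novel_gene_ids(toy_reference, toy_assembled)))
```

With real files, the two arguments can be open file handles for the reference GTF and the StringTie output GTF. Note that the posted StringTie command uses `-e`, which the StringTie documentation describes as limiting estimation to the reference transcripts given with `-G`; if that holds for this run, the set of new gene IDs may well be empty and the extra genes would need another explanation.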