Question: RNA-seq quantification with StringTie, featureCounts and HTSeq - am I correct?
23 months ago, k.kathirvel93 wrote:

Hi everyone, I am new to RNA-seq data analysis and am trying to compare different quantifiers: StringTie, featureCounts and HTSeq. I have some questions and would be really happy if someone could help me.


  1. I have removed the genes that have <99 read counts. Is that OK, or should I go for 9?

  2. After removing the genes with <99 read counts, I got 11802 genes from featureCounts, 11305 from HTSeq and 16502 from StringTie. (Note: for StringTie, I converted the FPKM values to gene read counts.) Why did StringTie give more genes? Are the StringTie results genes or transcripts?

I have used the Ensembl GRCh38 FASTA and GTF files.
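The note in point 2 does not say how the FPKM values were turned into read counts. As a rough sketch only, one way is to invert the FPKM definition (fragments per kilobase of feature per million mapped fragments). The function name, the feature length and the library size below are hypothetical, and for a gene-level FPKM the choice of "gene length" is itself an assumption.

```python
def fpkm_to_fragments(fpkm, length_bp, total_mapped_fragments):
    """Invert FPKM = fragments / (length in kb) / (mapped fragments in millions).

    Returns an approximate fragment count, not an exact tally of reads.
    """
    length_kb = length_bp / 1e3
    millions_mapped = total_mapped_fragments / 1e6
    return fpkm * length_kb * millions_mapped


# Hypothetical example: FPKM of 10 for a 2000 bp feature
# in a library with 20 million mapped fragments.
approx = fpkm_to_fragments(10.0, 2000, 20_000_000)
print(round(approx))
```

Counts recovered this way are estimates, so a fixed cutoff such as 99 may not select the same genes as it does on the directly counted featureCounts or HTSeq tables.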

rna-seq next-gen genome gene • 2.6k views

Personally I feel that removing all genes with <99 reads is a bit stringent, but there is no rule of thumb for this.
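To get a feel for how much the cutoff matters, it can help to count how many genes survive at each candidate threshold. This is a minimal sketch on a made-up table; the gene names and counts are hypothetical, and the real values would come from the count files.

```python
# Hypothetical gene -> read count table (not real data).
counts = {
    "GENE_A": 0,
    "GENE_B": 7,
    "GENE_C": 9,
    "GENE_D": 42,
    "GENE_E": 99,
    "GENE_F": 1500,
}


def genes_kept(counts, min_count):
    """Drop genes with fewer than min_count reads; return the rest, sorted."""
    return sorted(gene for gene, n in counts.items() if n >= min_count)


for cutoff in (9, 99):
    kept = genes_kept(counts, cutoff)
    print(f"cutoff {cutoff}: {len(kept)} genes kept")
```

Running the same loop on each tool's table shows whether the difference in gene numbers is already there before filtering or only appears at the stricter cutoff.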

Can you post the exact cmdlines you're executing so we can see if there is something different in those?

Written 23 months ago by lieven.sterck

Here are the commands:

featureCounts -T 16 -p -g gene_name -a /home/kathirvel/Homo_sapiens.GRCh38.77.gtf -o /home/kathirvel/FeatureCounts/MAQC_Counts.csv /home/kathirvel/out.bam

htseq-count -i gene_name -m intersection-nonempty -f bam /home/kathirvel/out.bam /home/kathirvel/Homo_sapiens.GRCh38.77.gtf > /home/kathirvel/Counts.csv

stringtie -p 16 -e -G /home/kathirvel/Homo_sapiens.GRCh38.77.gtf -B -o /home/kathirvel/MAQC_ILM_BGI_A1_1_transcripts.gtf -A /home/kathirvel/gene_abundances.csv /home/kathirvel/out.bam
Written 23 months ago by k.kathirvel93

cmds look OK at first sight.

Did you try evaluating the mode used by htseq? It can influence the end result and might differ from what the other tools apply.
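To illustrate why the mode matters, here is a simplified sketch of the three htseq-count overlap modes as the HTSeq documentation describes them: for each aligned base, take the set of features overlapping it, then combine those sets. The function name and toy inputs are hypothetical, and strand, mapping quality and multi-mapping are ignored.

```python
def assign_read(position_sets, mode="union"):
    """Assign a read to a feature from the per-base sets of overlapping features."""
    if not position_sets:
        return "__no_feature"
    if mode == "union":
        features = set().union(*position_sets)
    elif mode == "intersection-strict":
        features = set.intersection(*position_sets)
    elif mode == "intersection-nonempty":
        non_empty = [s for s in position_sets if s]
        features = set.intersection(*non_empty) if non_empty else set()
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not features:
        return "__no_feature"
    if len(features) > 1:
        return "__ambiguous"
    return next(iter(features))


# Read lying inside gene A, partly overlapping gene B as well.
overlap = [{"A"}, {"A", "B"}]
# Read starting in gene A and running into an unannotated region.
overhang = [{"A"}, {"A"}, set()]

for mode in ("union", "intersection-strict", "intersection-nonempty"):
    print(mode, assign_read(overlap, mode), assign_read(overhang, mode))
```

The same read can therefore be counted, called ambiguous, or dropped depending on the mode, which is one plausible source of the gap between the HTSeq and featureCounts totals.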

Written 23 months ago by lieven.sterck

StringTie performs reference-based transcriptome assembly, so most probably you also get counts for newly assembled genes/transcripts. You can use gffcompare to compare the original GTF file with the one produced by StringTie and see how many new transcripts it has assembled and counted.
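Alongside gffcompare, a quick first check is to compare the gene names that pass the filter in each tool and list the ones reported by only one of them. This is a minimal sketch; the gene names are hypothetical placeholders, and in practice each set would be read from the corresponding output file.

```python
def compare_gene_sets(sets_by_tool):
    """Return (genes shared by all tools, genes unique to each tool)."""
    shared = set.intersection(*sets_by_tool.values())
    unique = {}
    for tool, genes in sets_by_tool.items():
        # Union of the genes reported by every other tool.
        others = set().union(
            *(g for t, g in sets_by_tool.items() if t != tool)
        )
        unique[tool] = genes - others
    return shared, unique


# Hypothetical filtered gene lists, one per quantifier.
passed = {
    "featureCounts": {"GENE_A", "GENE_B", "GENE_C"},
    "HTSeq": {"GENE_A", "GENE_B"},
    "StringTie": {"GENE_A", "GENE_B", "GENE_C", "GENE_X", "GENE_Y"},
}

shared, unique = compare_gene_sets(passed)
print("shared by all:", sorted(shared))
for tool, genes in unique.items():
    print(f"only in {tool}:", sorted(genes))
```

Looking at the identifiers that appear only in the StringTie list should show whether they are annotated reference genes or something the other two tools never report.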

Written 23 months ago by grant.hovhannisyan