Question: HTSeq count versus summarizeOverlaps, mismatch of exon counts
2.4 years ago, vit.filippo wrote:

Hello everybody,

I am a newbie and have recently started analysing RNA-Seq data. I used the following two approaches:

  • HTSeqcount:
    • dexseq_annotation.py;
    • dexseq_count.py;
  • summarizeOverlaps:
    • exonicParts= (txdb, aggregateGenes=FALSE);
    • se = summarizeOverlaps(exonicParts, bamfiles, mode="Union", ignore.strand=TRUE, singleEnd=TRUE, fragments=FALSE, inter.feature=FALSE)

I did this to obtain exon-level counts as input to DEXSeq for assessing exon usage. The problem is that I get very different counts depending on which program I use: in particular, summarizeOverlaps counts far more reads as uniquely mapped (18,000,000 total unique reads with HTSeq vs. 42,000,000 apparently unique reads with summarizeOverlaps). Could somebody explain why this happens? Thank you in advance.

Filippo


I think I can help, but could you please reformat your message so that the list is easier to read?

written 2.4 years ago by Macspider
2.4 years ago, Devon Ryan (Freiburg, Germany) wrote:

Do not reinvent the wheel by using R; as you've noticed, you're likely to get the wrong results. The DEXSeq scripts are known to produce the correct results, so do not use anything else.

The reason you get different results in R is that you're counting a different set of alignments, and counting them in a different manner.
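To make "different alignments in a different manner" concrete, here is a toy sketch in plain Python. It uses neither HTSeq nor GenomicAlignments; the exon coordinates, the alignments, and the `count_reads` helper are all made up for illustration. It assumes that dexseq_count.py, run without extra options, treats the data as stranded and skips alignments with a mapping quality below 10 (worth checking against the script's help text), whereas the summarizeOverlaps call in the question sets ignore.strand=TRUE and applies no mapping-quality filter. It is not a diagnosis of the actual 18M vs 42M gap, only a demonstration that these two settings alone can change totals considerably.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str
    start: int  # inclusive
    end: int    # exclusive
    strand: str


@dataclass(frozen=True)
class Alignment:
    start: int  # inclusive
    end: int    # exclusive
    strand: str
    mapq: int


def count_reads(exons, alignments, ignore_strand, min_mapq):
    """Count alignments per exon (hypothetical helper, not a library API).

    An alignment is skipped if its MAPQ is below min_mapq. Otherwise it is
    counted for every exon it overlaps, subject to the strand check.
    """
    counts = {exon.name: 0 for exon in exons}
    for aln in alignments:
        if aln.mapq < min_mapq:
            continue
        for exon in exons:
            overlaps = aln.start < exon.end and exon.start < aln.end
            strand_ok = ignore_strand or aln.strand == exon.strand
            if overlaps and strand_ok:
                counts[exon.name] += 1
    return counts


# Made-up toy data: two exons on the + strand and five alignments.
EXONS = [Exon("E001", 100, 200, "+"), Exon("E002", 300, 400, "+")]
ALIGNMENTS = [
    Alignment(110, 160, "+", 60),  # confident, same strand as the exon
    Alignment(120, 170, "-", 60),  # confident, opposite strand
    Alignment(130, 180, "+", 0),   # low MAPQ, e.g. a multi-mapping read
    Alignment(310, 360, "+", 60),  # confident, same strand as the exon
    Alignment(320, 370, "-", 3),   # low MAPQ and opposite strand
]

# Assumed dexseq_count.py-like settings: stranded, MAPQ >= 10.
strict = count_reads(EXONS, ALIGNMENTS, ignore_strand=False, min_mapq=10)
# Settings mirroring the posted call: strand ignored, no MAPQ filter.
permissive = count_reads(EXONS, ALIGNMENTS, ignore_strand=True, min_mapq=0)

print("strict:    ", strict, "total =", sum(strict.values()))
print("permissive:", permissive, "total =", sum(permissive.values()))
```

On this toy input the permissive settings count every alignment, while the strict settings drop the opposite-strand and low-MAPQ ones. With real data, a practical check would be to make the two pipelines agree on strandedness and mapping-quality filtering before comparing totals.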
