Confusion in HTSeq paired-end output count
9.6 years ago
Whoknows ▴ 960

Hi friends

I used HTSeq to create count tables from paired-end files, and then generated RPKM values from these count tables with the formula below:

  • C = Number of reads mapped to a gene
  • N = Total mapped reads in the experiment
  • L = exon length in base-pairs for a gene
  • Equation: RPKM = (10^9 × C) / (N × L)
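
As a minimal sketch of the formula above: the function name `rpkm` and the example numbers are made up for illustration and are not part of HTSeq.

```python
def rpkm(count, total_mapped, exon_length_bp):
    """Compute RPKM = (10^9 * C) / (N * L).

    count          -- C, reads mapped to the gene
    total_mapped   -- N, total mapped reads in the experiment
    exon_length_bp -- L, exon length of the gene in base pairs
    """
    if total_mapped <= 0 or exon_length_bp <= 0:
        raise ValueError("N and L must be positive")
    return 1e9 * count / (total_mapped * exon_length_bp)


# Hypothetical gene: 500 reads, 2,000 bp of exon, 20 million mapped reads.
example = rpkm(500, 20_000_000, 2_000)
print(example)  # 12.5
```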

Is it right to use the exact read counts from the count tables for RPKM?

I mean, because these are paired-end files, doesn't this equation need to be corrected by dividing the values (C and N) by 2? Or not? Or does HTSeq already take this into account?

Thanks

HTSeq tophat • 3.3k views
9.6 years ago

Technically you're getting FPKM rather than RPKM, but since htseq-count isn't going to count reads whose mate aligns to a different feature anyway, the two should be the same (i.e., the counts would be doubled, but so would the total number of mapped reads). To allay what's likely your biggest fear: the numbers produced by htseq-count for paired-end reads relate to the number of pairs or singletons rather than to individual reads, so no division by 2 is needed.
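
A rough numeric illustration of why the factor of 2 cancels. The numbers and the helper name `per_kb_per_million` are invented for this sketch, and it makes the simplifying assumption that every fragment is a full pair (no singletons).

```python
def per_kb_per_million(count, total, length_bp):
    # Same expression as the formula in the question: (10^9 * C) / (N * L).
    return 1e9 * count / (total * length_bp)


# Hypothetical values, counted in fragments (pairs).
fragments_gene = 500
fragments_total = 20_000_000
length_bp = 2_000

# C and N both counted in fragments.
fpkm = per_kb_per_million(fragments_gene, fragments_total, length_bp)

# C and N both counted in individual reads (two reads per pair).
rpkm_doubled = per_kb_per_million(2 * fragments_gene, 2 * fragments_total, length_bp)

# Mismatched units: C in fragments but N in individual reads.
mixed = per_kb_per_million(fragments_gene, 2 * fragments_total, length_bp)

print(fpkm, rpkm_doubled, mixed)
```

The first two values agree because the 2 appears in both numerator and denominator. The third is half as large, which is why C and N should be counted in the same unit.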
