Error in the generative model of RNA-Seq?
7.3 years ago
roma ▴ 120

Happy holidays!

I am currently studying the paper RNA-Seq gene expression estimation with read mapping uncertainty.

If I understand correctly, in the generative model the probability that a read comes from a given transcript equals the abundance of that transcript: p(G_n = i | θ) = θ_i. Thus, if a 1 kb transcript and a 10 kb transcript are expressed at the same level (TPM), the model would predict a nearly equal number of reads for the two transcripts.

However, due to fragmentation, in "real" RNA-Seq the 10kb transcript would result in 10x more fragments and thus 10x more reads.

Am I missing something here, or is the model in that paper wrong?

RNA-Seq • 1.3k views
7.3 years ago
Rob 6.5k

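The paper is correct. I think what you're missing is that the θ parameters track (estimate) the nucleotide fractions, not the transcript fractions. So θ_i is the fraction of sequenced nucleotides, and hence roughly of reads, that come from transcript i, which already scales with transcript length. These parameters are not normalized by length; that normalization is done later to compute τ (the transcript fractions) and, subsequently, TPM.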
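To make the distinction concrete, here is a minimal sketch of the θ to τ to TPM conversion, assuming the usual length normalization τ_i = (θ_i / ℓ_i) / Σ_j (θ_j / ℓ_j) and TPM_i = 10^6 · τ_i. The function name `theta_to_tpm` and the toy numbers are made up for illustration, and the sketch uses raw transcript lengths and ignores the noise component and any effective-length correction.

```python
def theta_to_tpm(theta, lengths):
    """Convert nucleotide fractions (theta) to transcript fractions (tau) and TPM.

    Simplified illustration: uses raw transcript lengths and ignores any
    noise component or effective-length correction.
    """
    if len(theta) != len(lengths):
        raise ValueError("theta and lengths must have the same number of entries")
    # Dividing by length turns "share of reads" into "share of reads per base",
    # which is proportional to the number of transcript molecules.
    per_base = [t / l for t, l in zip(theta, lengths)]
    total = sum(per_base)
    tau = [x / total for x in per_base]
    tpm = [1e6 * x for x in tau]
    return tau, tpm


# Toy example from the question: a 1 kb and a 10 kb transcript present in
# equal numbers of molecules. The longer one yields about 10x the reads,
# so theta is proportional to length.
lengths = [1_000, 10_000]
theta = [1 / 11, 10 / 11]

tau, tpm = theta_to_tpm(theta, lengths)
print(tau)  # both entries are approximately 0.5
print(tpm)  # both entries are approximately 500000
```

So under these assumptions the model does predict about 10x more reads for the 10 kb transcript (θ of roughly 0.09 versus 0.91), while the two transcripts still come out with equal τ and equal TPM.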

This makes perfect sense. Thank you!
