Question: Error in the generative model of RNA-Seq?
roma (Ukraine) wrote, 4.1 years ago:

Happy holidays!

I am currently studying the paper "RNA-Seq gene expression estimation with read mapping uncertainty".

If I understand correctly, in the generative model the probability of picking a read from a given transcript is equal to the abundance of that transcript: p(G_n=i|θ) = θ_i. Thus, if a 1kb transcript and a 10kb transcript are expressed at the same level (TPM), the model would predict a nearly equal number of reads for the two transcripts.

However, due to fragmentation, in "real" RNA-Seq the 10kb transcript would result in 10x more fragments and thus 10x more reads.

Am I missing something here, or is the model in that paper wrong?

Rob (United States) wrote, 4.1 years ago:

The paper is correct. I think what you're missing is that the θ parameters track (estimate) nucleotide fractions, not transcript fractions. They are therefore not normalized by length; that normalization is done later to compute τ (tau) and, subsequently, TPM.
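
To make the length normalization concrete, here is a minimal sketch in Python of the conversion described above. It is an illustration, not the paper's implementation: the function names are made up for this example, it uses full transcript lengths where a real tool would typically use effective lengths, and it ignores any noise/background component. It assumes τ_i is proportional to θ_i divided by length, renormalized to sum to 1, and that TPM is τ scaled to one million.

```python
def theta_to_tau(theta, lengths):
    """Convert nucleotide (read-generating) fractions to transcript fractions.

    Hypothetical helper: divides each theta_i by the transcript length and
    renormalizes so the result sums to 1.
    """
    per_length = [t / l for t, l in zip(theta, lengths)]
    total = sum(per_length)
    return [x / total for x in per_length]


def tau_to_tpm(tau):
    """Scale transcript fractions to transcripts per million."""
    return [1e6 * t for t in tau]


# The example from the question: a 1kb and a 10kb transcript with the same
# number of molecules. The longer transcript contributes 10x the nucleotides,
# so it gets 10x the share of theta (and 10x the expected reads).
lengths = [1_000, 10_000]
theta = [1 / 11, 10 / 11]

tau = theta_to_tau(theta, lengths)
tpm = tau_to_tpm(tau)
print(tau, tpm)
```

With these inputs the two transcripts end up with approximately equal τ (about 0.5 each) and therefore approximately equal TPM, even though θ differs by a factor of 10. So the 10x difference in reads you expect from fragmentation is already what the model predicts.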

roma replied, 4.1 years ago:

This makes perfect sense. Thank you!