Question: Estimated Fragment Mean In Cuffdiff
6.4 years ago, apt.university (United States) wrote:

Dear all, Cuffdiff estimates my mean fragment length to be 188.34 with a standard deviation of 58. I am analyzing a paired-end library with 100-base reads, which means that, according to TopHat, the mate-inner-dist should be 188 - (2 * 100) = -12. The sequencing center told me that the mean fragment length is in fact 320 (which I thought excluded the primers), so I initially set mate-inner-dist to 120, and 85% of the reads aligned (I got the number of aligned reads with samtools flagstat).
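
For reference, the arithmetic above can be written as a small Python sketch. The function name `mate_inner_dist` is made up for illustration and is not part of TopHat or Cufflinks; it only restates the "fragment length minus both read lengths" rule used in this post.

```python
def mate_inner_dist(mean_fragment_len, read_len):
    """Inner distance between mates: fragment length minus both read lengths."""
    return mean_fragment_len - 2 * read_len


# Cuffdiff's estimate (rounded to 188) with 100-base reads
print(mate_inner_dist(188, 100))  # -12

# The sequencing center's figure of 320 with the same reads
print(mate_inner_dist(320, 100))  # 120
```

A negative result means the two mates would be expected to overlap by that many bases on average.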

On the other hand, using a mate-inner-dist of -12 and a standard deviation of 58 produces about 70% aligned reads. I have three questions:

1. If my inner distance really is -12, shouldn't my mates overlap by 12 bases on average? I inspected a few thousand aligned sequences and none of them overlap.
2. I don't understand why the percentage of aligned reads is higher with a mate-inner-dist of 120.
3. Are there other ways of getting useful statistics about BAM alignments besides samtools (idxstats and flagstat)?
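
One way to check the overlap question directly might be to summarize the observed template lengths (the TLEN column) from the alignments themselves. The sketch below is only an illustration under stated assumptions: it reads SAM text lines (for example the output of `samtools view`), the function names `template_lengths` and `summarize` are hypothetical, and it counts only the first mate of each properly paired primary alignment so that each fragment is counted once. Note that TLEN is measured in reference coordinates, so for spliced RNA-seq alignments it includes any introns between the mates and can overestimate the true fragment length.

```python
import statistics


def template_lengths(sam_lines):
    """Yield |TLEN| once per properly paired fragment (first mate only)."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])             # column 2: bitwise FLAG
        tlen = int(fields[8])             # column 9: observed template length
        proper_pair = flag & 0x2
        first_mate = flag & 0x40
        secondary_or_supplementary = flag & 0x900
        if proper_pair and first_mate and not secondary_or_supplementary and tlen != 0:
            yield abs(tlen)


def summarize(sam_lines):
    """Return (count, mean, standard deviation) of template lengths, or None."""
    lengths = list(template_lengths(sam_lines))
    if not lengths:
        return None
    mean = statistics.mean(lengths)
    sd = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return len(lengths), mean, sd
```

To use it on a real file, one could pipe `samtools view` output into a script that calls `summarize(sys.stdin)`. With 100-base reads, a mean template length below 200 would indicate that mates overlap on average.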

Thanks for any suggestions you might be able to provide!

Madi

Tags: cuffdiff, rna-seq
6.4 years ago, Mikael Huss (Stockholm) wrote:

Yes, this (higher TopHat mapping rates with a "wrong" mate inner distance) is something of a mystery that I and others have observed. There are some discussions on SeqAnswers about this that I don't have time to locate at the moment (sorry about that, I'm in a bit of a rush). Briefly, setting the mate inner distance "too high" somehow seems to give higher mapping rates, just as you have observed.

Useful stats about BAM files other than samtools: try

6.4 years ago, apt.university (United States) wrote:

Thanks Mikael! I did find the thread on SeqAnswers, and it is a long one, as you said, without any conclusions! I tried CollectAlignmentSummaryMetrics, and it returns exactly the same value for alignments made with different values of -r (mate-inner-dist). Furthermore, it reports a PF_HQ_MEDIAN_MISMATCHES of 0.8 (80%), which seems rather suspicious! I'll have to dig into this further. Thanks again for taking the time to answer my question.
