Question: How many reads should I expect for paired end reads when coverage = 30 million?
0
5.0 years ago by
United States
Kristin Muench530 wrote:

Hello,

My lab ordered paired end sequencing, and we received a reported coverage of 30 million reads per sample.

Just to confirm: does this mean there are 30 million reads across both directions? That is, 15 million per end, so that after alignment with TopHat2 and counting with htseq-count I should expect about 15 million reads (i.e., read pairs) for each sample?

Or should I expect to see 30 million reads, representing 30 million pairs/60 million total ends?
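To make the two readings concrete, here is a minimal sketch of the arithmetic. The function name and dictionary keys are made up for illustration, and it does not say which reading a given facility uses.

```python
def interpret_quoted_reads(quoted_millions):
    """Return pair and total-read counts (in millions) under both readings of a quote."""
    return {
        # Reading A: the quote counts every read, so R1 + R2 together equal the quote.
        "quote_is_total_reads": {
            "pairs": quoted_millions / 2,
            "total_reads": quoted_millions,
        },
        # Reading B: the quote counts read pairs (fragments), so each mate file holds the full quote.
        "quote_is_read_pairs": {
            "pairs": quoted_millions,
            "total_reads": quoted_millions * 2,
        },
    }


result = interpret_quoted_reads(30)
for reading, counts in result.items():
    print(reading, counts)
```

Under reading A a quote of 30 million gives 15 million pairs; under reading B it gives 30 million pairs and 60 million ends.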

Thank you for the sanity check!

rna-seq • 3.8k views
ADD COMMENTlink modified 2.1 years ago by Biostar ♦♦ 20 • written 5.0 years ago by Kristin Muench530

Coverage usually has a different meaning.

ADD REPLYlink written 5.0 years ago by h.mon31k

Oh, excuse me! I meant that the total number of sequence reads = 30 million. It was unclear from the sequencing company if this meant 30 mil per direction, or 30 mil altogether.
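One way to check this against the delivered data is to count the records in each mate's FASTQ file. This is a rough sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); the file names in the usage comment are hypothetical.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly four lines per record."""
    # Use gzip for .gz files, plain open otherwise.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Hypothetical usage: one file per mate of the same sample.
# r1 = count_fastq_records("sample_R1.fastq.gz")
# r2 = count_fastq_records("sample_R2.fastq.gz")
# print("pairs:", r1, "total reads:", r1 + r2)
```

If each mate file holds about 15 million records, the quote of 30 million was total reads; if each holds about 30 million, it was read pairs.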

ADD REPLYlink modified 5.0 years ago • written 5.0 years ago by Kristin Muench530

It's a really important question to ask up front when you get contract sequencing done. "Is that reads, or read pairs?" - as obviously the latter is half the former.

ADD REPLYlink written 5.0 years ago by Daniel Swan13k

It's most likely 15M per end, which is on the low end. Because read lengths vary, I prefer to measure, and be quoted, in gigabases (Gb).
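As a rough illustration of quoting by bases rather than reads, the sketch below converts a number of read pairs into gigabases. The read lengths are assumptions chosen for the example (the thread does not say which length was used), and the function name is made up.

```python
def gigabases(read_pairs, read_length, paired=True):
    """Total sequenced bases in Gb (1 Gb = 1e9 bases), assuming fixed-length reads."""
    # Paired-end data has two reads per fragment; single-end has one.
    reads = read_pairs * 2 if paired else read_pairs
    return reads * read_length / 1e9


# Assumed read lengths, for illustration only.
for length in (50, 75, 100, 150):
    print(f"15M pairs at 2x{length} bp: {gigabases(15_000_000, length)} Gb")
```

The same 15 million pairs yield a threefold different amount of sequence at 2x50 bp versus 2x150 bp, which is why a quote in Gb is less ambiguous than a quote in reads.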

ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by Eric Lim1.7k
3
5.0 years ago by
Spain. Universidad de Córdoba
Antonio R. Franco4.5k wrote:

It should be 15 million per end.

ADD COMMENTlink written 5.0 years ago by Antonio R. Franco4.5k
5

Since this post has been revived in 2018, I strongly argue against the word "should" in this context. Instead, call the facility and ask, so that everyone is on the same page. I have witnessed a great deal of confusion, even within our own group where we generally know each other's vocabulary, when talking about reads, coverage, depth, read number vs. fragment number, etc.

ADD REPLYlink written 2.2 years ago by ATpoint40k
1

Thanks! Since we now have some lived experience from after this post was first made, I can share it: there was indeed a miscommunication with the facility. What we meant was thirty million reads for analysis, i.e. sixty million total reads across both ends. We ended up with thirty million total, and fifteen million functional coverage. We later re-sequenced the samples at the appropriate depth and the data made so, so much more sense. So, two votes for calling your facility and making sure everyone is on the same page!

ADD REPLYlink written 2.2 years ago by Kristin Muench530
1
2.2 years ago by
grant.hovhannisyan2.0k wrote:

The post is quite old, but I see some confusion here. What a quoted read number means can differ from facility to facility. For example, here at CRG, if you order 30 million paired-end reads, you get 30 million reads for each mate. I think this approach makes more sense (especially for RNA-seq), since paired-end sequencing reads the same fragment from both ends, so the second mate does not add to the expression estimate.

ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by grant.hovhannisyan2.0k

Indeed, that was the case! (see above) Fortunately we were able to re-sequence this dataset at an appropriate depth.

ADD REPLYlink written 2.2 years ago by Kristin Muench530

Good to hear a happy ending :)

ADD REPLYlink written 2.2 years ago by grant.hovhannisyan2.0k