Question

Total RNA sequencing issue

7.7 years ago

entfernung01 • 0

Hi all,

I am planning to investigate rare mutations in a few unknown viral genomes by RNA-seq. The RNA genome of this virus is 12 kb, and my supervisor advised me to do ultra-deep sequencing (preferably as deep as 10,000x). We plan to prepare the libraries with the TruSeq Stranded Total RNA kit and sequence them on a NextSeq 500. My question is: what read length should I use, 2x75 or 2x150? I tried to calculate it using the formula coverage = (read count * read length) / total genome size, but I wasn't sure I did it correctly.

Here are my calculations:

In an ideal situation:

Coverage: 10,000x

Kit: I will use the high-throughput kit, so the read count generated would be roughly 800 million reads

Total genome size: 12 kb (12,000 bp)

coverage=(read count*read length)/total genome size

10000 = (800*10^6 * read length) / 12000

Read length = (10000 * 12000) / (800*10^6) = 0.15

The calculation doesn't seem right to me. I sincerely hope you can help me out with this, as I am new to RNA-seq. Thank you in advance.
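The rearrangement above can be checked with a few lines of Python. This is only a sketch: the function name `required_read_length` is made up for illustration, and it assumes every read maps to the viral genome.

```python
def required_read_length(target_coverage, genome_size_bp, read_count):
    """Solve coverage = (read_count * read_length) / genome_size for read_length."""
    return target_coverage * genome_size_bp / read_count


# Numbers from the question: 10,000x target, 12 kb genome, 800 million reads.
read_length = required_read_length(10_000, 12_000, 800e6)
print(f"Read length needed: {read_length:.2f} bp")
```

A result far below any real read length means the run would deliver much more than the target depth with either 2x75 or 2x150.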

RNA-Seq next-gen-sequencing • 1.8k views

ADD COMMENT • link updated 12 months ago by Ram 43k • written 7.7 years ago by entfernung01 • 0

Since the viral sequences will not be present on their own (I assume some other genome will be contaminating the sample), you would need to account for that in your calculation. So if you expect only 25% of your data to be viral, then running the sample on a NextSeq may not be out of the ordinary.
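One way to fold the viral fraction into the estimate is sketched below. The function name `reads_needed` and the 150 bp read length are assumptions for illustration. The 25% figure is the example from this comment, and the two smaller fractions are hypothetical values added only to show the trend.

```python
def reads_needed(target_coverage, genome_size_bp, read_length_bp, viral_fraction):
    """Total reads to sequence so that the viral reads alone reach target_coverage."""
    viral_reads = target_coverage * genome_size_bp / read_length_bp
    return viral_reads / viral_fraction


# 0.25 comes from the comment above; 0.01 and 0.0001 are hypothetical.
for fraction in (0.25, 0.01, 0.0001):
    total = reads_needed(10_000, 12_000, 150, fraction)
    print(f"viral fraction {fraction:.2%}: about {total:,.0f} reads")
```

The required read count grows in inverse proportion to the viral fraction, so it helps to estimate that fraction before choosing a platform.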

ADD REPLY • link 7.7 years ago by GenoMax 141k

score 1 · Accepted Answer · 2016-08-15

7.7 years ago

Asaf 10k

One NextSeq 500 run will give you 400 million reads under optimal conditions (not 800). Dedicating a whole run to one viral genome is overkill. If you run 2x150 reads, you'll get 300 bp * 400M = 120,000,000,000 bp, which gives a coverage of 10,000,000x. I recommend finding someone who is running a HiSeq/NextSeq lane and asking for a small fraction of their run.
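The same arithmetic, plus the share of a run that a 10,000x target would take, can be sketched as follows. The helper name `mean_coverage` is hypothetical, and the sketch assumes all reads are viral and map evenly across the genome.

```python
def mean_coverage(fragments, bases_per_fragment, genome_size_bp):
    """Mean depth = total sequenced bases / genome size."""
    return fragments * bases_per_fragment / genome_size_bp


# 400 million fragments at 2x150 on a 12 kb genome, as in the answer above.
run_coverage = mean_coverage(400e6, 2 * 150, 12_000)
fraction_needed = 10_000 / run_coverage  # share of the run for a 10,000x target

print(f"Full run: {run_coverage:,.0f}x")
print(f"Share of the run needed for 10,000x: {fraction_needed:.1%}")
```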

ADD COMMENT • link 7.7 years ago by Asaf 10k

I assumed the 800 million came from counting the two reads in a pair separately (at least that's the only way the number makes sense to me).

ADD REPLY • link 7.7 years ago by Devon Ryan 104k

Dear Asaf, thank you heaps for your info and suggestion. Devon was right: I was counting both reads in a pair. Sorry for the confusion. What's important to me is that I can now be sure my calculation was right.

ADD REPLY • link 7.7 years ago by entfernung01 • 0

You can use Illumina's calculator: http://support.illumina.com/downloads/sequencing_coverage_calculator.html to test other options. I still suggest finding someone who will give you some reads. If someone has a low-complexity library to run, your library could be used in place of some (though not all) of the phiX library.

ADD REPLY • link 7.7 years ago by Asaf 10k

score 1 · Accepted Answer · 2016-08-15

7.7 years ago

Devon Ryan 104k

Your calculation is correct. A NextSeq is total overkill for such a small genome unless you plan to sequence a bunch of different samples at once. Look into the pricing on a MiSeq. Our last MiSeq run had a bit more than 20 million fragments sequenced, so if you did 2x150 with that you'd end up with ~500,000x coverage (20 million * 300 bp / 12,000 bp).
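To see how far such a run could stretch across several samples, here is a rough sketch. The function name `samples_per_run` is made up, and it assumes equal pooling, all reads viral, and the 10,000x target from the question.

```python
def samples_per_run(fragments, bases_per_fragment, genome_size_bp, target_coverage):
    """How many equally pooled libraries reach target_coverage in one run."""
    bases_per_sample = genome_size_bp * target_coverage
    return int(fragments * bases_per_fragment // bases_per_sample)


# 20 million fragments at 2x150, 12 kb genome, 10,000x per sample.
print(samples_per_run(20e6, 2 * 150, 12_000, 10_000))
```

In practice, host contamination and uneven pooling would lower this number, so it is best read as an upper bound.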

ADD COMMENT • link 7.7 years ago by Devon Ryan 104k

Dear Devon,

Thank you for your feedback and suggestion. That gives me an idea. Perhaps I should suggest to my supervisor that we run a few more libraries in a lane and use a different platform. =)

ADD REPLY • link 7.7 years ago by entfernung01 • 0