Question

How many reads per sample give good coverage for soybean RNA sequencing with a 1.1 Gb genome?

1

9.2 years ago

z_heydarian • 0

How many reads per sample give good coverage for soybean RNA sequencing, given a genome length of 1.1 Gb? I will use the Illumina rRNA-depleted plant stranded mRNA library prep for the cDNA library, and I would like to discover rarely expressed genes. The soybean genome has already been sequenced.

RNA-Seq • 2.7k views

0

9.2 years ago

WouterDeCoster 48k

If you are only interested in discovery and are not going to compare samples, you could consider the following: prepare a library, sequence it, check whether the depth is sufficient, and if not, sequence it again. One library prep will usually (check your protocol) be sufficient for loading multiple times.

score 3 · Accepted Answer · 2016-05-10

Sequencing depth usually refers to the expected mean coverage at all loci over the target sequences; in RNA-Seq experiments, this assumes that all transcripts have a similar level of expression.

For researchers with a fixed budget, a critical design question is often whether to increase the sequencing depth at the cost of fewer samples, or to increase the sample size with limited coverage for each sample.

The necessary coverage depends on the type of study, gene expression levels, the size of the reference genome, published literature, and best practices defined by the scientific community.

C = L x N / G

C: coverage
L: read length
N: number of reads
G: haploid genome length

On a HiSeq 2500 in high-output run mode, a single flow cell with 2x100 bp reads gives 4 billion paired-end reads (claimed by Illumina, so you might expect a bit fewer), i.e. 0.5 billion per lane.

C = 2 x 100 x 500,000,000 / 1,100,000,000 ≈ 91X (the factor 2 accounts for the two reads of each pair)
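If you want to redo this arithmetic with your own numbers, here is a minimal Python sketch of the C = L x N / G formula. The function name `coverage` and the `paired` flag are my own naming for illustration, not part of any library, and the read count is the Illumina-claimed figure from above, so treat the result as an upper bound.

```python
def coverage(read_length_bp, n_reads, genome_length_bp, paired=True):
    """Expected mean coverage C = L * N / G.

    For paired-end runs each fragment contributes two reads,
    so the sequenced bases per fragment are doubled.
    """
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return bases_per_fragment * n_reads / genome_length_bp


# Numbers from the post: 2x100 bp, 0.5 billion per lane, 1.1 Gb genome.
lane_coverage = coverage(
    read_length_bp=100,
    n_reads=500_000_000,
    genome_length_bp=1_100_000_000,
)
print(f"Coverage per lane: {lane_coverage:.1f}X")
```

Changing `n_reads` to a more conservative yield shows quickly how sensitive the estimate is to the vendor's claimed output.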

So you should have about 91X of coverage per lane (~$3000), which you now need to divide among conditions and replicates. If you haven't heard about randomization, replication, blocking, and sampling, now is a good time to read up on them.
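To see what dividing a lane among samples does to the per-sample figure, here is a rough sketch. The design of 2 conditions x 3 replicates is purely hypothetical, the function name is made up for illustration, and it assumes the samples are multiplexed evenly within the lane, which real runs only approximate.

```python
def per_sample_coverage(lane_coverage, n_conditions, n_replicates):
    """Split one lane's coverage evenly across all samples.

    Assumes perfectly balanced multiplexing (an idealization).
    """
    n_samples = n_conditions * n_replicates
    return lane_coverage / n_samples


# ~91X per lane is the estimate from the post; the design is hypothetical.
lane_coverage = 91
share = per_sample_coverage(lane_coverage, n_conditions=2, n_replicates=3)
print(f"Coverage per sample: {share:.1f}X")
```

Adding replicates or conditions lowers the share each sample gets, which is exactly the depth versus sample-size trade-off described earlier in this answer.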