Question: read length of sequence
1
2.4 years ago by
Bulbul Ahmed20
United States
Bulbul Ahmed20 wrote:

I have found that short reads are 30-50 bp long, but I do not know about long reads. What is the typical length of long reads, on average?

rna-seq assembly • 1.8k views
ADD COMMENTlink modified 2.4 years ago by igor7.1k • written 2.4 years ago by Bulbul Ahmed20
2

From which sequencing platform? And how did you find that 30 bp is the length of short reads?

ADD REPLYlink modified 2.4 years ago • written 2.4 years ago by Medhat8.0k
1

30-50 bp is where we started at a decade ago. We are way past that stage now even on the short read technologies (e.g. Illumina, Ion).

With technologies like 10x Genomics/Illumina sequencing, the reads (contigs that come out of the process) can be on the order of hundreds of kb.

ADD REPLYlink written 2.4 years ago by genomax59k
6
2.4 years ago by
igor7.1k
United States
igor7.1k wrote:

See this nice graphic:

from: https://flxlexblog.wordpress.com/2016/07/08/developments-in-high-throughput-sequencing-july-2016-edition/

ADD COMMENTlink modified 2.4 years ago • written 2.4 years ago by igor7.1k

A nice graphic indeed.

ADD REPLYlink written 2.4 years ago by jiskroyer20
2
2.4 years ago by
Belgium
WouterDeCoster35k wrote:

30-50 bp is very short, inappropriately short for most applications. Long reads on the PacBio platform are ~12 kb if I'm not mistaken; reads from Oxford Nanopore sequencers can be tens of kb (easily 20 kb), and the longest mapped read that has been reported is >150 kb.

ADD COMMENTlink written 2.4 years ago by WouterDeCoster35k
1

Most sequencing today is 50bp, which is long enough for most applications.

ADD REPLYlink written 2.4 years ago by igor7.1k
2

That does not match my observations... I think it depends on your specific field. I rarely encounter reads <150 bp.

ADD REPLYlink written 2.4 years ago by Brian Bushnell16k
1

Sure. Specific fields can vary. Some fields are also okay with Sanger. I meant overall.

You can check something like SRA for exact stats.

Or you can do a back of the envelope calculation. Most sequencing is Illumina. Most Illumina sequencing is HiSeq. Most HiSeqs do not support 150bp reads.
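If you would rather check the read lengths in a dataset you already have (for example, a run downloaded from SRA), a rough sketch like the one below may help. It assumes a standard 4-line-per-record FASTQ file, optionally gzipped; the function names `read_lengths` and `summarize` are made up for this illustration and are not part of any library.

```python
import gzip
import statistics
from collections import Counter


def read_lengths(fastq_path):
    """Yield the length of every read in a FASTQ file (plain or gzipped).

    Assumes the common 4-line-per-record layout (no wrapped sequences).
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In a 4-line record, the sequence is the second line (index 1).
            if line_number % 4 == 1:
                yield len(line.strip())


def summarize(lengths):
    """Return basic read-length statistics for an iterable of lengths."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no reads found")
    counts = Counter(lengths)
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "max": max(lengths),
        # The single most frequent length, e.g. 50 for a fixed-length 50 bp run.
        "most_common": counts.most_common(1)[0][0],
    }
```

Calling `summarize(read_lengths("sample.fastq.gz"))` (the file name is a placeholder) returns a dictionary of these statistics. For fixed-length short-read runs the minimum, median, and maximum will usually coincide unless the reads were trimmed, while long-read data will typically show a much wider spread.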

ADD REPLYlink written 2.4 years ago by igor7.1k
1

We usually use MiSeq (either 2x 250 bp or 2x 300 bp) and NextSeq (2x 150 bp). But I guess 2x 75 bp is quite common for (RNA-seq on) HiSeq.

ADD REPLYlink written 2.4 years ago by WouterDeCoster35k

50 bp runs are the most economical option at the majority of providers, so they are actually pretty popular with users.

ADD REPLYlink written 2.4 years ago by genomax59k

30-50 bp is very short, inappropriately short for most applications.

Not sure I agree with that... For ChIP-Seq and for RNA-Seq for gene expression (no de novo assembly or splicing), I think the advantage of going above 50 bp is quite small. Also, for bisulfite sequencing you don't gain much in mappability above 50 bp. (I'm referring to mouse or human genomes.) Then of course, the longer the better...

ADD REPLYlink written 2.4 years ago by dariober9.7k
1

Well, since TopHat2 is designed for reads starting from 75 bp, I wouldn't go much lower than that. I don't have to convince you that longer reads will contribute to better alignment.

ADD REPLYlink written 2.4 years ago by WouterDeCoster35k
1

Of course, other things being equal, you would go for longer reads. Still, I'm quite convinced that in most differential gene expression experiments the results from 50 bp reads are going to be substantially similar to those from 75+ bp reads. The gain in alignment quality is probably minimal. If, for example, money is a limiting factor, I would definitely prefer to sequence shorter reads but do more replicates or more meaningful follow-up experiments. If TopHat2 "refuses" to align shorter reads, which I kind of doubt, just choose a different aligner. Again, I'm just talking in general terms...

ADD REPLYlink written 2.4 years ago by dariober9.7k

I don't think that anyone is arguing that longer is not better. If you pick a random RNA-seq study from the last year, it will be 50bp sequencing if it's basic differential gene expression (not something more complex like splicing or de novo transcriptome assembly).

ADD REPLYlink written 2.4 years ago by igor7.1k