Question

Where can I find a third-generation sequencing read data set in which every read is longer than 1000 bp?


6.7 years ago

saranpons3 ▴ 70

Dear members, I'm looking to download third-generation sequencing reads longer than 1000 bp. It would be helpful if anybody could provide links. I would also like to know what the size of k should be when assembling reads from third-generation sequencing technologies. I guess that the k value should be large. Is my guess correct?

Reads • 2.2k views

ADD COMMENT • link updated 6.7 years ago by WouterDeCoster 47k • written 6.7 years ago by saranpons3 ▴ 70


You don't normally use kmer-based (or purely kmer-based) assembly for single-molecule sequencing reads; the error rate is too high. Instead, you use all-to-all alignment and consensus.
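To make the "all-to-all" step concrete, here is a toy sketch in Python. It is only an illustration under a strong simplifying assumption: it looks for exact suffix-prefix matches, whereas real long-read assemblers use error-tolerant alignment and then build a consensus. The function names and the example reads are made up for this sketch.

```python
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that exactly matches a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def all_to_all_overlaps(reads, min_len=3):
    """Compare every ordered pair of reads; return {(i, j): overlap_length}."""
    overlaps = {}
    for i, j in permutations(range(len(reads)), 2):
        n = suffix_prefix_overlap(reads[i], reads[j], min_len)
        if n:
            overlaps[(i, j)] = n
    return overlaps


# Hypothetical error-free reads, chosen so that consecutive reads overlap.
reads = ["ACGTTGCA", "TGCAGGAT", "GGATCCAA"]
overlaps = all_to_all_overlaps(reads)
print(overlaps)
```

The pairwise comparison is quadratic in the number of reads, so this is only meant to show the idea, not how a production assembler finds overlaps.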

ADD REPLY • link 6.7 years ago by Brian Bushnell 20k


Dear Brian, does all-to-all alignment and consensus mean the Overlap-Layout-Consensus (OLC) approach?

ADD REPLY • link 6.7 years ago by saranpons3 ▴ 70


Dear Brian, I have read that k-mers have many applications in bioinformatics analysis (https://en.wikipedia.org/wiki/K-mer), so I'm not concerned with k-mer length only for the assembly problem. In general, for other bioinformatics applications, what k-mer length would be appropriate for the longer reads generated by third-generation sequencing machines such as Nanopore and PacBio? Can we go for a k-mer length above 520?

ADD REPLY • link 6.7 years ago by saranpons3 ▴ 70


As I said, kmers are unsuitable for long single-molecule read assembly. Other approaches like OLC or string graphs are used. You're certainly welcome to try k=520 with long reads, and see what happens. But typically people use string-based assemblers like Falcon or Celera.
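A rough back-of-the-envelope calculation shows why large k and a high error rate don't mix. The sketch below assumes errors are independent per base and uses a 10% error rate purely as an illustrative number, not a measured value for any particular platform; the function name is made up for this example.

```python
def error_free_kmer_probability(k: int, error_rate: float) -> float:
    """Probability that a k-mer contains no errors.

    Assumes each base is wrong independently with probability `error_rate`,
    which is a simplification of real sequencing error profiles.
    """
    return (1.0 - error_rate) ** k


# Illustrative, assumed per-base error rate of 10%.
for k in (21, 31, 101, 520):
    p = error_free_kmer_probability(k, 0.10)
    print(f"k={k}: P(error-free k-mer) = {p:.3g}")
```

Under these assumptions only about 11% of 21-mers are error-free, and for k=520 the probability is effectively zero, so almost no k-mer would be shared correctly between overlapping reads.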

ADD REPLY • link 6.7 years ago by Brian Bushnell 20k


Thanks for your answer. I'll get back to you if I have other questions related to this.

ADD REPLY • link 6.7 years ago by saranpons3 ▴ 70

score 0 · Answer 1 · 2017-08-06


6.7 years ago

WouterDeCoster 47k

A ton of Oxford Nanopore data from NA12878 is available here.
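Whatever data set you download, you can keep only the reads above 1000 bp with a few lines of Python. This is a minimal sketch that assumes plain four-line FASTQ records (no wrapped sequence lines); the function name and the in-memory example are hypothetical, and for a real file you would pass an opened file handle instead.

```python
import io


def filter_fastq_by_length(handle, min_length=1000):
    """Yield (header, sequence, quality) for reads longer than `min_length`.

    Assumes strict four-line FASTQ records read from a text-mode handle.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if len(seq) > min_length:
            yield header, seq, qual


# Tiny made-up example: one 4 bp read and one 1500 bp read.
example = io.StringIO(
    "@short\nACGT\n+\nIIII\n"
    "@long\n" + "A" * 1500 + "\n+\n" + "I" * 1500 + "\n"
)
kept = [header for header, _, _ in filter_fastq_by_length(example)]
print(kept)
```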

ADD COMMENT • link 6.7 years ago by WouterDeCoster 47k


Thanks. Your link was helpful.

ADD REPLY • link 6.7 years ago by saranpons3 ▴ 70


Hi WouterDeCoster

Any links for nanopore whole exome public datasets?

ADD REPLY • link 5.2 years ago by lakhujanivijay 5.8k


Nanopore sequencing is not suitable for exome sequencing.

ADD REPLY • link 5.2 years ago by WouterDeCoster 47k


Can you elaborate on that?

ADD REPLY • link 5.2 years ago by lakhujanivijay 5.8k


Whole exome sequencing involves the following: shearing DNA into short fragments, amplifying with PCR, and target capture with oligonucleotides followed by another round of amplification. While all of these steps work with nanopore sequencing, it's really not optimal. Nanopore sequencing doesn't need PCR (preferably no amplification is used) and will generate reads of 10 kb and longer, which is far longer than your average exon.