Question: Variable Length Reads In De Novo Assembly
Pauln (60) wrote:

Hi. Are there any de novo assemblers that accept variable-length reads in a single input file?

I have RNA-seq transcriptome data and want to remove the adaptors (or parts of them) found in each read. The only way I can think of doing this is to trim the adaptors off and write the trimmed reads back to a file. The problem is that this file will then contain a range of read lengths. I was hoping to avoid having to split it into separate files, each containing reads of the same length. Hopefully this makes sense. Apologies for not explaining more clearly.
assembly • 3.1k views
modified 8.5 years ago by Ryan Thompson (3.4k) • written 9.2 years ago by Pauln (60)

Do you mean variable quality-trimmed reads, or reads from different sequencers (Sanger, Illumina, 454)?

written 9.2 years ago by Jeremy Leipzig (18k)

Hi @paulN. For this question, and for any future ones, it would be good to give a bit more detail. What is your starting material? What are you trying to accomplish? What are the limitations and assumptions? You will get much better answers that way, and they will be more useful to others too. Cheers!

written 9.2 years ago by Eric Normandeau (10k)
Dstan (160) wrote:

We used the Newbler assembler (454/Roche) to assemble transcriptome reads from several different platforms (454, Sanger, and Illumina). Reads from these platforms are of different lengths.

Is it a requirement that the reads be in a single input file? The Newbler assembler will accept multiple input files.

written 9.2 years ago by Dstan (160)
Daniel Swan (13k) wrote:

Velvet also accepts variable-length reads, but they need to be tagged with the read type (long/short) at assembly time (and thus be in different files). Why so keen to have the reads in the same file?

written 9.2 years ago by Daniel Swan (13k)

I was talking to Paul the other day. He has some Illumina reads that look like they still have adapter sequence in them. I think he wants to trim the adapter sequences out of the FASTQ file, which would leave some reads shorter than others. This is why reads of different lengths are in the same file (at the moment).
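To make the trimming step concrete, here is a rough Python sketch of how removing a 3' adapter (or a partial adapter at the end of a read) produces reads of different lengths. Everything in it is an assumption for illustration: the function names, the exact-match strategy (no mismatches tolerated), the `min_overlap` and `min_length` defaults, and the record layout of `(header, seq, qual)` tuples. A dedicated adapter trimmer would normally be a better choice for real data.

```python
def trim_adapter(seq, qual, adapter, min_overlap=5):
    """Trim a 3' adapter (full or partial) from a read by exact matching.

    Returns the trimmed (seq, qual) pair, or the input unchanged if no
    adapter is found. Hypothetical helper, not taken from any tool.
    """
    # Case 1: the full adapter occurs somewhere inside the read.
    pos = seq.find(adapter)
    if pos != -1:
        return seq[:pos], qual[:pos]
    # Case 2: the read ends with a prefix of the adapter.
    # Try the longest possible overlap first, down to min_overlap bases.
    longest = min(len(adapter) - 1, len(seq))
    shortest = max(min_overlap, 1)
    for k in range(longest, shortest - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:-k], qual[:-k]
    return seq, qual


def trim_records(records, adapter, min_length=30):
    """Yield trimmed (header, seq, qual) records, dropping reads that
    become shorter than min_length (an assumed cutoff)."""
    for header, seq, qual in records:
        seq, qual = trim_adapter(seq, qual, adapter)
        if len(seq) >= min_length:
            yield header, seq, qual
```

The output of `trim_records` is exactly the mixed-length read set described above: reads with no adapter keep their full length, and the rest are shortened by however much adapter they contained.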

written 9.2 years ago by Rob Syme (540)
Eric Normandeau (10k) wrote:

mira3 will also permit the assembly of fragments of different lengths. Depending on what exactly you need, they may have to be in different files (e.g. 454 vs. Sanger).

written 9.2 years ago by Eric Normandeau (10k)
Ryan Thompson (3.4k) wrote:

You may have to pool your sequences into separate files based on length. Let's say you started with 100 bp reads and trimmed variable overlaps of an adapter sequence off them. You could put all the full-length ones in one file, all the 90-99 bp ones in another file, the 80-89 bp ones in another, and so on down to the smallest you can use (30 bp or so?). Then you can use an assembler that accepts different sizes in different files.
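The pooling scheme above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are made up, reads are taken to be `(header, seq, qual)` tuples with the header including its leading `@`, and the 100 bp full length, 10 bp bin width, and 30 bp minimum are just the example numbers from the paragraph above.

```python
from collections import defaultdict


def length_bin(length, bin_width=10, full_length=100):
    """Return a (low, high) label for the bin that holds this read length."""
    if length >= full_length:
        # Untrimmed, full-length reads get a bin of their own.
        return (full_length, full_length)
    low = (length // bin_width) * bin_width
    return (low, min(low + bin_width - 1, full_length - 1))


def bin_reads_by_length(records, min_length=30, bin_width=10, full_length=100):
    """Group (header, seq, qual) records by length bin, skipping short reads."""
    bins = defaultdict(list)
    for header, seq, qual in records:
        if len(seq) < min_length:
            continue  # too short to be useful for assembly
        key = length_bin(len(seq), bin_width, full_length)
        bins[key].append((header, seq, qual))
    return dict(bins)


def write_bins(bins, prefix):
    """Write each bin to its own FASTQ file, e.g. <prefix>_90-99bp.fastq."""
    for (low, high), recs in bins.items():
        with open(f"{prefix}_{low}-{high}bp.fastq", "w") as out:
            for header, seq, qual in recs:
                out.write(f"{header}\n{seq}\n+\n{qual}\n")
```

Each resulting file can then be passed to the assembler as a separate input. Whether narrower or wider bins make sense depends on the assembler, so `bin_width` is left as a parameter.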

written 8.0 years ago by Ryan Thompson (3.4k)