Question: Using Paired End And Orphaned Singles For De Novo Assembly
7.0 years ago by
malherbologist0 wrote:

I have been using FastX to process reads prior to de novo assembly and mapping. What I have discovered, and few have pointed out, is that FastX deletes individual reads, leaving their mates unpaired, which puts the two paired FASTQ files out of sync. While it is difficult to know whether this affects assembly with Trinity, it is definitely a problem for assembly with Velvet/Oases and for mapping with Bowtie or BWA. Because low-quality reads have been deleted, the reads in the two files are no longer in matching order and will not map as pairs.

There are some workarounds provided by and others to separate the reads that are still paired and place the orphaned reads in a separate file. But here is the problem: I would like to use paired reads in combination with single reads for de novo assembly. In Trinity, one designates reads as --left and --right, or as --single, but you cannot do both.
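As a rough illustration of the kind of workaround described above, here is a minimal Python sketch that re-synchronises two filtered FASTQ files and writes the orphans to a third file. It is not any particular published script. It assumes four-line FASTQ records and read names that match between files once a trailing /1 or /2 is removed; the function names and output file names are made up for this example, and the right-hand file is held in memory, which may not scale to very large runs.

```python
def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line means end of file
            return
        yield tuple(record)


def read_key(header):
    """Read name without the leading '@', any description, or a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def write_record(handle, record):
    handle.write("\n".join(record) + "\n")


def repair(left_path, right_path, out_prefix):
    """Split filtered reads into in-sync pairs and orphans (hypothetical helper)."""
    # Index the right-hand reads by name; assumption: they fit in memory.
    with open(right_path) as fh:
        right = {read_key(rec[0]): rec for rec in read_fastq(fh)}

    counts = {"paired": 0, "orphans": 0}
    with open(left_path) as left_fh, \
            open(out_prefix + "_1.fastq", "w") as out1, \
            open(out_prefix + "_2.fastq", "w") as out2, \
            open(out_prefix + "_orphans.fastq", "w") as orphans:
        for rec in read_fastq(left_fh):
            mate = right.pop(read_key(rec[0]), None)
            if mate is None:
                # Left read whose mate was filtered out.
                write_record(orphans, rec)
                counts["orphans"] += 1
            else:
                # Pairs are written in the order of the left file.
                write_record(out1, rec)
                write_record(out2, mate)
                counts["paired"] += 1
        # Whatever is left in the index never found a left mate.
        for rec in right.values():
            write_record(orphans, rec)
            counts["orphans"] += 1
    return counts
```

Calling `repair("left_filtered.fastq", "right_filtered.fastq", "cleaned")` (placeholder file names) would produce `cleaned_1.fastq` and `cleaned_2.fastq` with mates in the same order, plus `cleaned_orphans.fastq`.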

Question: Can any assembler use both paired and single reads at the same time for de novo assembly?

Q2: Has anyone else run into this problem? Here is a related post:

Q3: This issue is going to eliminate FastX from my pipeline of assembly and mapping. It seems like this should be a bigger issue but there is fairly little out there about this. Am I doing something wrong with FastX that is causing this problem?


paired-end fastx • 5.2k views
modified 7.0 years ago by Chris Fields2.1k • written 7.0 years ago by malherbologist0

Hi, have you sorted that out? I'm using Trinity and I'm struggling with the same problem: I want to use my single data (merged paired-end reads) and left-right (unpaired reads) together for a big assembly that I want to use as a reference; otherwise I lose a lot of data.

written 2.7 years ago by steph_tf10
7.0 years ago by
Chris Fields2.1k
University of Illinois Urbana-Champaign
Chris Fields2.1k wrote:

According to the docs and Trinity mail list (via Brian Haas) you can mix single and paired-end data for Trinity:

I personally tend to leave these out if they do not make up a significant portion of the data (or if you have tons of reads, >100M), as I have found they make little difference to the actual assembly.
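One way this mixing is commonly done, as far as I recall from the Trinity documentation for data that are not strand-specific, is to append the unpaired reads to the left-read file with a /1 suffix on their names and then run Trinity with --left and --right as usual. The Python sketch below only does the renaming and appending. It assumes /1 and /2 style read names, and the function names are hypothetical; check the current Trinity documentation before relying on it.

```python
def as_left_read(header):
    """Return a FASTQ header whose read name ends in /1."""
    name, _, rest = header.partition(" ")
    if name.endswith("/2"):
        name = name[:-2]
    if not name.endswith("/1"):
        name += "/1"
    return name + (" " + rest if rest else "")


def append_orphans(orphans_path, left_path):
    """Append orphan reads to the left-read FASTQ, renaming each to end in /1.

    Assumption: both files are plain 4-line-per-record FASTQ.
    """
    appended = 0
    with open(orphans_path) as src, open(left_path, "a") as dst:
        for i, line in enumerate(src):
            if i % 4 == 0:  # header line of each record
                line = as_left_read(line.rstrip("\n")) + "\n"
                appended += 1
            dst.write(line)
    return appended
```

Work on a copy of the left-read file, since `append_orphans` modifies it in place.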

written 7.0 years ago by Chris Fields2.1k
7.0 years ago by
Vivek2.4k wrote:

SOAPdenovo allows you to use paired-end and single-end reads at the same time for de novo assembly, but you have to specify them as separate files when creating the config file.
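For illustration, here is a small Python sketch that writes such a config with the paired files under `q1`/`q2` and the single reads under `q` in one `[LIB]` section. The key names follow the SOAPdenovo example config as I remember it; the read length, insert size, and file names are placeholder assumptions to replace with your own values.

```python
def soapdenovo_config(left, right, singles, max_rd_len=100, avg_ins=200):
    """Build the text of a one-library SOAPdenovo config (illustrative values only)."""
    lines = [
        f"max_rd_len={max_rd_len}",  # longest read length; placeholder value
        "[LIB]",
        f"avg_ins={avg_ins}",        # average insert size; placeholder value
        "reverse_seq=0",             # 0 for standard forward-reverse paired-end libraries
        "asm_flags=3",               # use reads for both contig and scaffold assembly
        "rank=1",
        f"q1={left}",                # paired FASTQ, read 1
        f"q2={right}",               # paired FASTQ, read 2
        f"q={singles}",              # single-end (orphaned) FASTQ
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("soap.config", "w") as fh:
        fh.write(soapdenovo_config("cleaned_1.fastq", "cleaned_2.fastq",
                                   "cleaned_orphans.fastq"))
```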

I'm not totally sure whether the order of reads within the files is important, since I haven't worked on de novo assembly for a while, but I'd think any error correction tool that discards low-quality reads should output the resulting singletons into a different file and keep the high-quality pairs in the same order.

modified 7.0 years ago • written 7.0 years ago by Vivek2.4k

Note that the user is running a transcriptome assembly (Trinity), not a genome assembly. There is a SOAPdenovo transcriptome assembler for this purpose, though:

written 7.0 years ago by Chris Fields2.1k