Question: Very Few Reads Mapping Back To Contigs - Plant Transcriptome
7.6 years ago by
Cerebralrust20 wrote:

I assembled plant transcriptome 454 data (non-normalised) using Trinity after the following steps:

1) pre-processing (removal of adaptors and vector contamination)
2) removal of rRNA sequences
3) removal of chloroplast and mitochondrial genes using bwa

From 3,70,929 reads, I got 21,486 contigs. When I mapped the reads back to the contigs using bwa, only 44,678 reads mapped, which suggests that only those reads were used in the assembly. What am I doing wrong here? I BLASTed a random sample of the contigs and observed that they share over 90% similarity with proteins from related legumes (although many were hypothetical). However, only a small percentage of the contigs align to the transcript assemblies of related legumes when mapped using bwa.

The Velvet assembly of the same data resulted in 15,323 contigs with lower N50, N90, maximum length, etc. The MIRA assembly resulted in more contigs and more reads being used, but lower N50, N90 and average contig length. Why are only 44,678 reads being used? Any advice is greatly appreciated.
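Since the assemblies are being compared on N50 and N90, it may help to compute both the same way for every assembler's output. Below is a minimal Python sketch, not taken from any of the assemblers mentioned; the function names `fasta_lengths` and `nx` are made up for illustration. It takes N50 to mean the contig length L such that contigs of length L or more hold at least half of the assembled bases.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx(lengths, fraction=0.5):
    """Return the Nx statistic (N50 for fraction=0.5, N90 for fraction=0.9).

    This is the length L such that contigs of length >= L together
    contain at least `fraction` of all assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # only reached if fraction > 1
```

For a contig file one could call `nx(fasta_lengths(open("contigs.fasta")))`, where `contigs.fasta` is a placeholder file name. Keep in mind that a higher N50 alone does not mean a better transcriptome assembly.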

Tags: plant, rna, mapping, bwa, read

Do you mean 370k reads or 3 million? That would have a big impact on how your read usage should be interpreted. Also, I agree with (22308)3 that Newbler would be a good choice of tool for your data.

Comment written 7.6 years ago by SES
7.6 years ago by
Rt80 wrote:

In the opinion of Brian J. Haas, one of the key developers of Trinity:

"Ultimately, Trinity might not be the best tool for assembling 454 data, since coverage won't be anywhere near what is expected from Illumina in most cases, and Trinity exploits the high coverage data as part of reconstructing transcripts. The current version of Newbler is supposed to work especially well for 454 transcriptome data, so I encourage you to give that a try if you haven't already."

7.6 years ago by
2184687-1231-83-5.0k wrote:

I would try Newbler 2.6 if you have access to it. Use bwa bwasw (BWA-SW) to map the 454 reads to the contigs, since it is designed for longer reads.
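Once the reads are mapped, the number of reads that hit the contigs can be counted from the SAM output. Here is a rough Python sketch for that, assuming single-end reads and a SAM file that holds a record for every read, mapped or not. If your mapper leaves out unmapped reads, divide by the known read count instead. The function name `mapped_read_counts` is made up for illustration. Distinct read names are counted rather than lines, because a long-read aligner can report more than one alignment per read.

```python
def mapped_read_counts(sam_lines):
    """Count distinct reads and distinct mapped reads in SAM-formatted lines.

    Returns (mapped, seen). A read counts as mapped if at least one of its
    records does not have the 0x4 (segment unmapped) flag bit set.
    """
    seen = set()
    mapped = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name = fields[0]
        flag = int(fields[1])
        seen.add(name)
        if not flag & 0x4:
            mapped.add(name)
    return len(mapped), len(seen)


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0
```

If the read count in the question is 370,929, then 44,678 mapped reads works out to `percent(44678, 370929)`, which is roughly 12%.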

7.6 years ago by
Cerebralrust0 wrote:

I did try Newbler. However, Newbler generated only 9494 isotigs from 2,50,000 reads, although the N50 value, contig sizes and other metrics look quite positive. I am going to BLASTx the entire set of contigs from the three assemblers against the proteomes of related species and the NR database, and let the results determine the best assembly. Any other strategy would be hugely appreciated.
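One way to turn the BLASTx comparison into a number per assembly is the fraction of contigs with at least one acceptable hit. The Python sketch below assumes BLAST was run with the default 12-column tabular output (`-outfmt 6`), where column 1 is the query ID and column 11 is the e-value. The 1e-5 cutoff and the function names are illustrative assumptions, not recommendations from this thread.

```python
def contigs_with_hits(blast_lines, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular lines with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a default tabular record
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def hit_fraction(blast_lines, total_contigs, max_evalue=1e-5):
    """Return the fraction of all contigs that have an acceptable hit."""
    if total_contigs <= 0:
        raise ValueError("total_contigs must be positive")
    return len(contigs_with_hits(blast_lines, max_evalue)) / total_contigs
```

Running `hit_fraction` on each assembler's BLASTx output, with that assembly's own contig count as `total_contigs`, gives figures that can be set side by side with N50 and the read mapping rate.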
