Question: De novo assembly around one short (<100 bp) known sequence
5.2 years ago, linzdm1187 (United States) wrote:

Hello all,

In an effort to avoid RACE-PCR, my group is trying to use sequencing data we already have to discover the sequence of a single gene in a crustacean. We know the sequence of a short 85 bp region of this gene. Using our sequencing data (~60 million 50 bp single-end Illumina reads), I performed a standard de novo assembly with Trinity, but as expected given the single-end reads, the assembly was poor and no contigs matched our gene of interest. Is there any way to use the known sequence as a partial guide for assembly, in an attempt to resolve more of the sequence around this region (using Trinity or any other assembler)?


rna-seq assembly • 1.7k views
5.2 years ago, thackl wrote:

Have a look at TASR and Mapsembler. Both are assembly programs that use seed sequences as starting points for assembly.
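To make the seed idea concrete, here is a minimal Python sketch of greedy seed extension. It is not how TASR or Mapsembler are implemented; it is a toy illustration that assumes exact overlaps, error-free reads held in memory as plain strings, and no coverage or quality checks. The function names (`revcomp`, `extend_right`, `seed_extend`) and the `min_overlap` and `max_rounds` parameters are hypothetical, chosen only for this example.

```python
# Toy seed-guided extension: grow a contig outward from a known seed
# by repeatedly attaching the read with the longest exact end overlap.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_right(contig, reads, min_overlap):
    """Extend the 3' end of contig by one read, or return it unchanged.

    Every read is tried in both orientations. The read whose prefix has
    the longest exact match to the contig's end supplies the new bases.
    """
    best_overlap, best_tail = 0, ""
    for read in reads:
        for r in (read, revcomp(read)):
            # The overlap must leave at least one new base to add.
            max_ov = min(len(r) - 1, len(contig))
            for ov in range(max_ov, min_overlap - 1, -1):
                if contig.endswith(r[:ov]):
                    if ov > best_overlap:
                        best_overlap, best_tail = ov, r[ov:]
                    break  # longest overlap for this orientation found
    return contig + best_tail


def seed_extend(seed, reads, min_overlap=20, max_rounds=100):
    """Alternate right and left extension until the contig stops growing."""
    contig = seed
    for _ in range(max_rounds):
        grown = extend_right(contig, reads, min_overlap)
        # Left extension is right extension on the reverse complement.
        grown = revcomp(extend_right(revcomp(grown), reads, min_overlap))
        if grown == contig:
            break
        contig = grown
    return contig
```

Calling `seed_extend(known_85bp_sequence, reads)` would return the seed unchanged if no read overlaps either end by at least `min_overlap` bases. Note that every round scans the full read list, so the running time grows with both the number of reads and the number of extension rounds. With 50 bp reads, repeats longer than the overlap can also send a greedy extension down the wrong path, which is why `max_rounds` caps the loop.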


I've spent a lot of time trying to do this sort of thing with what's out there, and I have to say that none of the programs are particularly great. TASR is very slow, and I never had particularly good results with the original Mapsembler (I haven't tried v2). The best I've found, better than these two, has been PRICE. PRICE loops over the entire set of input files for every iterative extension, though, so depending on the amount of data you have, it can also take a good while!

Reply written 5.2 years ago by george.ry
5.2 years ago, Antonio R. Franco (Spain, Universidad de Córdoba) wrote:

At least in my hands, and starting with poor data, building a whole transcriptome with Trinity is a bad way to reach the goal you describe.

Try an old-fashioned assembler such as CAP3, which you can run from web servers such as EGASSEMBLER. It will not try to assemble the whole transcriptome, but will look for overlapping sequences and classify them as contigs or singletons. With some luck, you may end up with a useful contig among them.
