Question: De Novo Assembly Errors
3.9 years ago
United States
Recently, my lab had a private vendor assemble transcriptome contigs for several genes from a mammal. I do not have access to the raw read files, only the assembled contigs for those genes.

I discovered that one contig has a 100 bp insert right in the middle of the gene. The rest of the gene aligns to known homologs in other species, but a BLAST search of the 100 bp insert alone returned nothing. I suspect it is a sequencing or assembly artifact.

Is there a way I can be sure that this insert is an artifact? Could it be the product of retrotransposition? What is the usual procedure for this kind of problem?

3.9 years ago
Devon Ryan
Freiburg, Germany
Normally you'd map the original reads back to the contig and look at whether the insert and its junctions are actually supported by reads. Without the original reads that's impossible, though you should be able to ask the vendor for them. Failing that, all you can do is a few wet-lab experiments (e.g., PCR the region around the insert and check the size/sequence of the product).
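
If the vendor does hand over the reads, one rough way to "have a look" after mapping them back is to compare read depth inside the suspected insert with depth in the flanking sequence. The sketch below assumes you already have per-base depth along the contig (for example from `samtools depth -a` on the mapped reads); the function name `insert_support`, the 100 bp flank size, and the toy depth values are all made up for illustration. A sharp drop in depth is only a hint, so it is still worth checking by eye whether individual reads span both junctions.

```python
from statistics import mean


def insert_support(depth, insert_start, insert_end, flank=100):
    """Compare mean read depth inside a suspected insert with its flanks.

    depth        -- list of per-base read depths along the contig (0-based)
    insert_start -- 0-based start of the suspected insert (inclusive)
    insert_end   -- 0-based end of the suspected insert (exclusive)
    flank        -- number of bases on each side used as the reference (assumed value)
    """
    inside = depth[insert_start:insert_end]
    left = depth[max(0, insert_start - flank):insert_start]
    right = depth[insert_end:insert_end + flank]
    flanks = left + right
    if not inside or not flanks:
        raise ValueError("insert must lie inside the contig and have flanking sequence")

    inside_mean = mean(inside)
    flank_mean = mean(flanks)
    # A ratio near 1 means the insert is covered about as well as its neighbours;
    # a ratio near 0 means few reads support it.
    ratio = inside_mean / flank_mean if flank_mean > 0 else float("nan")
    return inside_mean, flank_mean, ratio


# Toy example (invented numbers): a 1100 bp contig with a poorly covered
# 100 bp insert at positions 500-599.
depth = [40] * 500 + [2] * 100 + [40] * 500
inside_mean, flank_mean, ratio = insert_support(depth, 500, 600)
print(f"inside={inside_mean} flanks={flank_mean} ratio={ratio:.2f}")
```

A low ratio like the one in the toy data would be consistent with a misassembly, while even coverage across the insert and both junctions would suggest the sequence is really present in the transcripts, in which case the PCR check is the way to confirm it.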

