Aligning Short Reads To Homologous Sequences?
13.4 years ago
Goldbear ▴ 130

I'm working on a de novo transcriptome project, and the genome of our plant has not been sequenced. In general, Trinity is working pretty well, but we are seeing some spurious fusion genes (the N terminus of gene A fused to the C terminus of gene B). For the questions we are asking, having the correct full-length protein sequence is important.

I was wondering if we could get around this problem by first mapping short reads to homologous protein sequences from Arabidopsis and then feeding each of those subcollections into a contig builder? Is there any software that acts like blastx but works well on short Illumina reads?

Thanks

transcriptome short aligner • 3.2k views

Hi, it would help to know the read length and the divergence between your species and Arabidopsis first.


Hi, my reads are about 90 bp. Unfortunately, the divergence is pretty large. Our strategy may be to fully sequence one representative species from our clade and then use that as a reference for the other species. Thanks!

13.4 years ago
Vitis ★ 2.5k

There is a blastx-based pipeline called STM (published in Genome Research) that uses de novo contigs and blastx search results against proteins from a closely related species to improve the de novo assembly. It probably would not be hard to put together a similar pipeline in Perl, which would let you dissect the blastx output more closely. However, I think choosing the right reference taxon is a very important factor.
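
For what it's worth, here is a rough sketch of the kind of blastx dissection mentioned above, written in Python rather than Perl. It is not the STM pipeline. It assumes the contigs were searched with blastx and the results saved in the standard 12-column tabular format (`-outfmt 6`), and it flags contigs whose hits to different proteins cover separate stretches of the contig, which is one possible signature of a fusion. The function names, the e-value cutoff, and the `max_overlap` tolerance are illustrative choices, not anything from a published tool.

```python
import csv
from collections import defaultdict


def parse_hits(lines, max_evalue=1e-5):
    """Yield (contig, protein, q_lo, q_hi, bitscore) from BLAST tabular lines.

    Assumes the default 12 columns of -outfmt 6. For blastx, the query
    coordinates are in nucleotides and are reversed on the minus strand,
    so they are normalised to (low, high) here.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        qstart, qend = int(row[6]), int(row[7])
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue <= max_evalue:
            yield row[0], row[1], min(qstart, qend), max(qstart, qend), bitscore


def flag_fusion_candidates(hits, max_overlap=30):
    """Return {contig: [(protein, q_lo, q_hi), ...]} for contigs whose hits
    to different proteins occupy (nearly) non-overlapping contig regions."""
    best = {}  # (contig, protein) -> (q_lo, q_hi, bitscore) of the best hit
    for contig, protein, lo, hi, score in hits:
        key = (contig, protein)
        if key not in best or score > best[key][2]:
            best[key] = (lo, hi, score)

    per_contig = defaultdict(list)
    for (contig, protein), (lo, hi, _score) in best.items():
        per_contig[contig].append((protein, lo, hi))

    flagged = {}
    for contig, regions in per_contig.items():
        regions.sort(key=lambda r: r[1])  # order by start on the contig
        kept = [regions[0]]
        for region in regions[1:]:
            # Keep a region only if it starts after the previous kept region
            # ends, allowing max_overlap nucleotides of slack.
            if region[1] > kept[-1][2] - max_overlap:
                kept.append(region)
        if len(kept) > 1:
            flagged[contig] = kept
    return flagged
```

To use it, open the blastx output file and pass the file handle to `parse_hits`, then pass the result to `flag_fusion_candidates`. Anything it flags is only a candidate: genuine multi-domain proteins can look the same, so the flagged contigs still need a manual look.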


Thanks! This was exactly what I was looking for.

11.0 years ago
rob234king ▴ 610

Trinity has a setting that attempts to address fusion transcripts: try the assembly again with --jaccard_clip.
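
As a sketch of what the rerun could look like, the snippet below builds a Trinity command line with `--jaccard_clip` for paired-end FASTQ reads. The file names, output directory, CPU count, and memory value are placeholders, and the helper `build_trinity_command` is hypothetical. The memory option is given as `--max_memory`, which is an assumption about the installed version (older Trinity releases used `--JM`), so check `Trinity --help` for your install.

```python
import shlex


def build_trinity_command(left, right, output_dir, cpu=4, max_memory="20G"):
    """Return the argument list for a paired-end Trinity run with --jaccard_clip.

    All values are placeholders; --max_memory assumes a Trinity 2.x install.
    """
    return [
        "Trinity",
        "--seqType", "fq",          # FASTQ input
        "--left", left,             # first reads of each pair
        "--right", right,           # second reads of each pair
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--jaccard_clip",           # try to split fused transcripts
        "--output", output_dir,
    ]


cmd = build_trinity_command("reads_1.fq", "reads_2.fq", "trinity_out_jaccard")
print(shlex.join(cmd))  # shell-quoted command line, ready to paste
```

The list can also be handed to `subprocess.run(cmd, check=True)` to launch the job directly. Note that `--jaccard_clip` relies on paired reads and adds noticeably to the run time.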
