Question

Assembly of transcripts of specific genes from the RNA-seq data

1

8.7 years ago

vjanousk ▴ 10

Hi,

I have previously generated RNA-seq data from many individuals (hundreds), with no reference genome available, and I am interested in only a subset of transcripts (i.e. transcripts of a few specific genes). Given the number of individuals, assembling whole transcriptomes would be very time-consuming. I have been thinking that subsetting the reads specific to the genes of interest before the actual assembly might be an option: I could use orthologous sequences of those genes from closely related species and, for instance, the bowtie aligner to pull out the reads for these genes, and then assemble transcripts de novo from that subset of reads. Do you think this is a good approach? Is there any other way to select a subset of reads based on sequence similarity? I appreciate any suggestions. Thanks.

RNA-Seq Assembly • 3.3k views

ADD COMMENT • link updated 18 months ago by Ram 43k • written 8.7 years ago by vjanousk ▴ 10

score 1 · Answer 1 · 2015-09-21

1

8.6 years ago

Lior Pachter ▴ 700

kallisto is not an assembler, and therefore not directly relevant in this setting.

score 0 · Answer 2 · 2015-08-14

0

8.7 years ago

h.mon 35k

Your approach sounds OK, but either set bowtie to allow a number of mismatches, or use a mapper that lets more divergent reads align (e.g. Anfo or maybe BBMap). Then assemble, and repeat the alignment step using your newly assembled transcripts as the reference. You may have to repeat this for several rounds.
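To make the recruit-then-repeat idea concrete, here is a minimal sketch in plain Python. It is not what bowtie, Anfo or BBMap do internally: shared k-mers stand in for the mapper, and adding the recruited reads to the bait stands in for re-assembling and re-aligning. The names `recruit_reads`, `k`, `min_shared` and `max_rounds` are made up for illustration, and reads are assumed to be already loaded as strings.

```python
from typing import List, Sequence, Set

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (N is left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(
    reads: Sequence[str],
    bait_seqs: Sequence[str],
    k: int = 15,
    min_shared: int = 2,
    max_rounds: int = 5,
) -> List[str]:
    """Iteratively pull out reads that share k-mers with the bait.

    Round 1 uses only the ortholog (bait) sequences. Reads recruited in a
    round are added to the bait for the next round, so the selection can
    walk outwards into regions the ortholog does not cover.
    """
    bait: Set[str] = set()
    for seq in bait_seqs:
        # Index both strands, since reads can come from either one.
        bait |= kmers(seq, k) | kmers(reverse_complement(seq), k)

    selected: Set[int] = set()
    for _ in range(max_rounds):
        newly = [
            i for i, read in enumerate(reads)
            if i not in selected and len(kmers(read, k) & bait) >= min_shared
        ]
        if not newly:
            break  # nothing new was recruited, so further rounds are pointless
        for i in newly:
            selected.add(i)
            bait |= kmers(reads[i], k) | kmers(reverse_complement(reads[i]), k)

    # Return the recruited reads in their original order.
    return [reads[i] for i in sorted(selected)]
```

Lowering `k` or `min_shared` plays the role of allowing more mismatches: it recruits more divergent reads, at the cost of more off-target ones. In practice the recruited reads would go to a real assembler after each round, and the resulting contigs, not the raw reads, would become the new bait.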

0

Thanks! It helps.

ADD REPLY • link 8.7 years ago by vjanousk ▴ 10

Ram · Answer 3 · 2015-08-14

0

8.7 years ago

tyler.weirick ▴ 120

Why not just use one of the faster pseudo-alignment assemblers? I have heard good things about Kallisto. Supposedly it is like 1000 times faster than a Tophat+Cufflinks assembly. I think there are a number of other fast assemblers like this: https://en.wikipedia.org/wiki/List_of_RNA-Seq_bioinformatics_tools

ADD COMMENT • link updated 18 months ago by Ram 43k • written 8.7 years ago by tyler.weirick ▴ 120

0

Thanks, Tyler. That sounds like a good option. I'll try that. V.

ADD REPLY • link 8.7 years ago by vjanousk ▴ 10