Question: Transcriptomic Assembly From Different Strains
8.2 years ago by
Ke8630 wrote:

I have the following transcriptomic datasets from a lower eukaryote (no genomic sequence yet):

  • Dataset 1: An "old" assembly from Strain A, obtained from Sanger and 454;
  • Dataset 2: A novel 454 output from Strain B;
  • Dataset 3: Several new Solexa outputs from different cell cycle time points, from Strain A;
  • Dataset 4: Another Solexa output, from Strain C;
  • Dataset 5: And finally, several Solexa outputs from the wild-type strain.

The question is how to proceed with the assembly:

  • Is it OK to assemble datasets 1 and 2 together, even though they come from different strains?
  • Should the Solexa datasets 3, 4 and 5 first be assembled de novo, and then combined with datasets 1 and 2? If so, which program should be used to perform an assembly that uses a transcriptome sequence as the reference?
8.2 years ago by
IIMCB, Poland
Leszek (4.0k) wrote:

Could you give some more details: what are the expected genome size and coverage, the number of genes, the ploidy, and the read length? Is the transcriptome library stranded, or do you not know which strand the expression comes from? Are the Illumina reads paired-end?

You might consider using Velvet coupled with Columbus. Here is some info from the Columbus manual:

" Assisted transcriptome assembly
You sequenced the transcriptome of a new species, strain or individual, and you happen to know the gene sequences of a nearby species, strain or reference individual. You would then map the reads onto the reference genome, using the short-read mapper of your choice, and provide the alignments along with the known exonic sequences to Velvet. It would rebuild contigs based on the alignments, which could then be used by the Oases package."

I think it would fit your project well.
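
The workflow in the quoted manual passage could be sketched roughly as below. This is only a sketch under assumptions, not a tested pipeline: the file names, the output directory and the k-mer length of 21 are hypothetical, and the flags (`-reference`, `-short`/`-shortPaired`, `-sam`, `-read_trkg yes`) are taken from my reading of the Velvet and Oases manuals, so check them against your installed versions. The alignments are assumed to be already produced by your mapper and sorted by reference coordinate. The helper only builds the command lines; it does not run anything.

```python
import shlex


def columbus_commands(reference_fasta, sorted_sam, out_dir="columbus_asm",
                      kmer=21, paired=True):
    """Build (but do not run) command lines for a reference-assisted
    Velvet/Columbus run followed by Oases.

    All file names and the default k-mer length are hypothetical placeholders.
    """
    # Velvet works with odd k-mer lengths, so reject even values up front.
    if kmer % 2 == 0:
        raise ValueError("use an odd k-mer length")

    # Assumed flag names; verify them with the manual of your Velvet version.
    read_flag = "-shortPaired" if paired else "-short"

    # Step 1: hash the reference sequences together with the mapped reads.
    velveth = ["velveth", out_dir, str(kmer),
               "-reference", str(reference_fasta),
               read_flag, "-sam", str(sorted_sam)]
    # Step 2: build contigs; read tracking is needed by Oases afterwards.
    velvetg = ["velvetg", out_dir, "-read_trkg", "yes"]
    # Step 3: let Oases turn the contigs into transcript predictions.
    oases = ["oases", out_dir]
    return [velveth, velvetg, oases]


if __name__ == "__main__":
    # Hypothetical inputs: Strain A transcripts as the reference and
    # Strain B reads mapped onto them.
    for cmd in columbus_commands("strainA_transcripts.fa",
                                 "strainB_reads.sorted.sam"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect them before running anything; once they look right for your setup, each list can be passed to `subprocess.run`.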

8.2 years ago by
Israel Barrantes (740) wrote:

Although I haven't tried it this way myself, you could use MIRA. According to its manual, it can perform de novo hybrid assemblies and can also process transcript sequences from different strains.

