Question: Bowtie Assembly Parameters
biolab wrote (5.3 years ago):

Hi everyone, I am trying to use Bowtie for reference-based assembly. My draft genome dataset is not from a model organism but from a related species. The data consist of ~20 million 88 bp Illumina reads. Because this is not a resequencing of the reference genome, I need to use relatively loose parameters. I am only interested in genes and do not care about intergenic regions. A CDS dataset from the model organism can be used as the reference for this assembly. What Bowtie parameters would you suggest? Thank you very much.


Bowtie is an aligner, not an assembler. Why don't you just use a splice-aware aligner, like TopHat, or try de novo assembly (e.g., with Trinity)?

written 5.3 years ago by Devon Ryan

Can you actually use Trinity for genome assembly? I thought it was specifically for transcript assembly.

written 5.3 years ago by r.follador

Since biolab mentioned aligning to CDS, I'm assuming that he/she is actually doing RNAseq (though, upon rereading, this may well be incorrect!). Otherwise, yes, I think you're correct.

written 5.3 years ago by Devon Ryan
Adrian Pelin (Canada) wrote (5.3 years ago):

I concur with the first comment; specifically, I suggest using an assembler. Your data are 88 bp Illumina reads; are they paired-end? Paired-end data will improve your assembly statistics. Here are my suggestions:

1) Try SPAdes or Velvet; read here about how they differ.
2) Map the resulting contigs to the genome of your closely related species. This is better because contigs are much longer than reads, so you can effectively see which parts of the genome remain homologous.
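
Since assembly statistics came up: one commonly reported number is N50. Below is a minimal Python sketch of how it could be computed from a list of contig lengths. The function name and the example lengths are made up for illustration and are not output from SPAdes or Velvet.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the contig length L such that contigs of length >= L
    together contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contig lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

A higher N50 generally means a more contiguous assembly, so it is one way to compare a paired-end run against a single-end one, or SPAdes against Velvet, on the same reads.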


Thanks for your suggestions.

written 5.3 years ago by biolab

It depends on how similar your reference is. BWA is quite good for loose mapping, as is LAST if you have enough computational resources. Then just parse the output, incorporating the SNPs/indels.
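
A rough sketch of what that last step could look like in Python, assuming the variants have already been pulled out of the alignments as (position, reference allele, alternate allele) tuples with 0-based positions. The function name and the tuple layout are hypothetical and do not correspond to any tool's output format.

```python
def apply_variants(reference, variants):
    """Build a consensus by applying SNPs/indels to a reference string.

    Each variant is (pos, ref_allele, alt_allele) with a 0-based pos:
      SNP:       (1, "C", "T")
      deletion:  (3, "TA", "T")
      insertion: (6, "G", "GAA")
    """
    pieces = []
    cursor = 0  # first reference position not yet copied
    for pos, ref_allele, alt_allele in sorted(variants):
        if pos < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous one")
        if reference[pos:pos + len(ref_allele)] != ref_allele:
            raise ValueError(f"reference allele mismatch at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged stretch
        pieces.append(alt_allele)             # substituted allele
        cursor = pos + len(ref_allele)
    pieces.append(reference[cursor:])         # tail after the last variant
    return "".join(pieces)


# Toy reference and variants, for illustration only.
reference = "ACGTACGT"
variants = [(1, "C", "T"), (3, "TA", "T"), (6, "G", "GAA")]
print(apply_variants(reference, variants))
```

Applying the variants left to right while copying from the original reference avoids having to shift coordinates after each indel.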

written 5.3 years ago by c.v.oflynn

Thanks for your suggestions, all helpful!

written 5.3 years ago by biolab