A Reference Assembler That Uses Multiple Reference Sequences?
11.9 years ago
diltsjeri ▴ 470

Is there a reference assembler (an assembler that bases the assembly on a reference genome) that will take multiple reference sequences? I have multiple contigs that I want to use as references for the reads I'm trying to assemble. I don't have a full genome; we only amplified the regions we are interested in. I would like to use a reference assembler, but if the software takes only one reference sequence at a time, I would have to run it against the reads once for each contig I have. If that is the case, it may be better to just run one de novo assembly. So my question is: is there a reference assembler that will take multiple reference sequences?

reference multiple • 4.4k views
11.9 years ago
Michael 54k

Practically all alignment tools (fasta, bwa, blast, bowtie, blat, etc.) should be able to do this just fine, because most reference genomes consist of multiple chromosomes, scaffolds or contigs. Just put your reference contigs (which are the result of the de-novo assembly) into a multi-FASTA file and use any of these tools.
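
As a rough illustration of the "put your contigs into one multi-FASTA file" step, here is a minimal Python sketch. The function name `merge_fasta` and the file names in the usage note are made up for this example, and the duplicate-ID check is an added assumption (aligners generally need unique sequence names), not something from the original answer.

```python
def merge_fasta(input_paths, output_path):
    """Concatenate several FASTA files into one multi-FASTA file.

    Returns the number of sequences written. Raises ValueError on an
    empty header or a repeated sequence ID, since reference names are
    assumed to need to be unique for the downstream aligner.
    """
    seen_ids = set()
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                for raw in handle:
                    line = raw.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines between records
                    if line.startswith(">"):
                        fields = line[1:].split()
                        if not fields:
                            raise ValueError(f"empty FASTA header in {path}")
                        seq_id = fields[0]  # ID is the first word of the header
                        if seq_id in seen_ids:
                            raise ValueError(f"duplicate sequence ID: {seq_id}")
                        seen_ids.add(seq_id)
                        n_records += 1
                    out.write(line + "\n")
    return n_records
```

For example, `merge_fasta(["contig1.fa", "contig2.fa"], "reference.fa")` would write both contigs to `reference.fa`, which can then be indexed and used as the reference with whichever aligner you pick.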

Note: In my understanding, the term "assembly" is a misnomer in this context. It should be reserved for de-novo assembly; what is described here should be called alignment, or maybe mapping to a reference sequence. I have seen this usage quite often recently, though. Whatever drives people to call alignment assembly...

I think the original author is asking for an assembler that makes use of a reference genome to perform the scaffolding of the contigs.

Yes! I'm asking for an assembler that uses multiple reference genomes to assemble reads into contigs.

Oh, you mean a comparative genome assembler like in http://www.cbcb.umd.edu/papers/Pop_et_al_Comparative.pdf

Correct. I was going to use AMOScmp, but I'm not sure if it can take multiple reference genomes at a single time.

I have no practical experience with AMOScmp, but I guess it wouldn't hurt to make a multi-FASTA file with all contigs and try. If I understood correctly, all reference sequences are from only one genome and are just multiple contigs, correct? If so, I guess it could work, but I'd be careful about unexpected side effects, and I am not sure whether using this tool makes much sense in that case.

The sequenced reads are only from those regions as well, not from an entire genome (we are looking at the same genes in different patients). So it could work. Maybe. I'm going to give it a shot and will probably write to the developer. Thanks for your help.

11.9 years ago

I don't know if you finally succeeded in doing what you wanted to, but I used the program Velvet to perform a de novo assembly with Illumina paired-end reads from multiple tagged individuals in two different populations of a non-model species (i.e., no reference genome). I had assembly information from past projects: long contigs from a previous de novo assembly and 454 reads, which I used as references for my assembly.

The program takes any number of FASTQ files from any sequencing platform and performs a de novo assembly, either with no reference at all or with short- or long-read references. It uses the information contained in the "reference files" to create a better assembly. It works fine, but it is memory- and time-consuming: the more reference files you add, the more time and memory it takes, depending on the amount of data you want to assemble. It's a good program, but be sure you have enough CPU and memory to use it properly.
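
To make the setup above a bit more concrete, here is a small Python sketch that only builds the `velveth` and `velvetg` command lines for this kind of run; it does not execute anything. The helper name `velvet_commands`, the file names and the k-mer value are hypothetical. It assumes the paired-end reads are in one interleaved FASTQ file (passed with `-shortPaired -fastq`) and that the older contigs and 454 reads are supplied as long sequences (`-long -fasta`); check the Velvet manual for the options your version supports.

```python
def velvet_commands(out_dir, kmer, paired_fastq, long_fasta_files):
    """Build (velveth, velvetg) argument lists for a Velvet run.

    Assumptions for this sketch: paired reads are interleaved in a
    single FASTQ file, and each entry of long_fasta_files is a FASTA
    file of long sequences (older contigs, 454 reads).
    """
    # Velvet works with odd hash lengths; reject even values here
    # rather than rely on the tool adjusting them.
    if kmer % 2 == 0:
        raise ValueError("use an odd k-mer (hash) length")
    velveth = ["velveth", out_dir, str(kmer),
               "-shortPaired", "-fastq", paired_fastq]
    for path in long_fasta_files:
        velveth += ["-long", "-fasta", path]
    # Graph building step; tuning options are left out of this sketch.
    velvetg = ["velvetg", out_dir]
    return velveth, velvetg
```

Calling `velvet_commands("assembly_k31", 31, "reads_interleaved.fq", ["old_contigs.fa", "reads_454.fa"])` returns two argument lists that could be passed to `subprocess.run` once Velvet is installed. Keeping the command construction separate from execution makes it easy to inspect what will be run before committing the memory and time.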

http://www.ebi.ac.uk/~zerbino/velvet/

Best regards, FO
