I would like to reduce the number of contigs in the bacterial draft genome I'm trying to close. The draft genome is in 47 contigs and I assembled it myself using Phred, Phrap and Consed.
I'm thinking of using the other draft genome that has been published (same bacterial species) to perform this task.
Do you think I should start from scratch, that is, pool the unassembled raw data from the two published drafts and see what happens, or use a tool that takes the two assembled drafts as input?
Thanks for your help, Bernardo
If you already have a good reference, why not use reference-based assembly directly? If you want to identify variations, it might not be a good idea to assemble the two data sets together or merge them. Once they are merged, it can also be difficult to tell which sample a given piece of data came from.
Do you know of any good tool for reference-based assembly?
Sorry for the late reply. Here's the pipeline I usually use.
samtools mpileup -uf <Fasta of Reference> <Alignment Results>.bam | bcftools view -cg - | vcfutils.pl vcf2fq > <Assembled Results>.cons.fq
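In case it is useful, the same one-liner can be wrapped in a small shell function with basic input checks. This is only a rough sketch: the function name call_consensus and the example paths are placeholders, not part of samtools. It assumes the BAM is a coordinate-sorted alignment of your reads against the reference, and that you have the older samtools/bcftools releases in which `bcftools view -cg` does the calling (as far as I know, newer releases moved this to `bcftools call`).

```shell
# Sketch only: wraps the mpileup | bcftools | vcfutils.pl pipeline shown above.
# call_consensus is a hypothetical helper name; all paths come from the caller.
call_consensus() {
    ref=$1   # reference genome in FASTA format
    bam=$2   # reads aligned to that reference (assumed coordinate-sorted)
    out=$3   # consensus FASTQ to write

    # Refuse to run if either input is missing or empty.
    if [ ! -s "$ref" ] || [ ! -s "$bam" ]; then
        echo "call_consensus: reference or BAM is missing or empty" >&2
        return 1
    fi

    # Same commands and flags as the one-liner above.
    samtools mpileup -uf "$ref" "$bam" \
        | bcftools view -cg - \
        | vcfutils.pl vcf2fq > "$out"
}
```

It would be called as `call_consensus reference.fa reads_vs_reference.bam draft.cons.fq`, where all three file names are made up for illustration.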
It is unlikely that the other draft genome has any information you can use for your draft genome. I would expect the assembly to be broken at the same places, i.e. large repeats such as rRNA loci and insertion sequences that your read length cannot span.