I have Burkholderia pseudomallei sequence data (MiSeq paired-end) and I want to perform comparative genomics against reference genomes (BPk96243, MSHR1435 and others). I already ran a SPAdes assembly, but it produced many nodes (700+). I could use ABACAS to order and align the contigs against a reference genome, but Burkholderia has 2 chromosomes, which confuses me. What should I do to get FASTA files for the two chromosomes (similar to the uploaded ones)? Your help is highly appreciated. Thank you.
Since you know it has 2 chromosomes, you could map the reads against each reference chromosome separately and then assemble the reads from each mapping individually. Depending on how close your isolate is to the reference genomes, you may want to do so with quite relaxed alignment parameters.
Any recommendations for that? Also for further analysis, should I use two files (like chr1, chr2)?
It depends on the objective/task. You could put each chromosome in its own file, keep both in the same file, or artificially concatenate them with a run of Ns.
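To illustrate the third option, here is a minimal Python sketch that joins sequences with a run of Ns and formats the result as a FASTA record. The function names, the toy sequences and the spacer length are all assumptions made for the example; no tool requires a particular spacer length.

```python
def join_with_n_spacer(sequences, spacer_len=100):
    """Join sequences into one string, separated by runs of 'N'."""
    spacer = "N" * spacer_len
    return spacer.join(sequences)


def format_fasta(name, sequence, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy sequences standing in for the two chromosomes (not real data).
chr1 = "ACGT" * 5
chr2 = "GGCC" * 5

# Option 3: one artificial record, chromosomes separated by Ns.
joined = join_with_n_spacer([chr1, chr2], spacer_len=10)
print(format_fasta("chr1_chr2_joined", joined))

# Options 1 and 2 use the same records, written to two files or to one.
print(format_fasta("chr1", chr1) + format_fasta("chr2", chr2))
```

Keeping the chromosomes as separate records is usually the safer default, since a joined record shifts every coordinate on the second chromosome by the length of the first plus the spacer.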
For the mapping, you'll just need your favourite aligner (BWA, Bowtie2, etc.) and samtools.
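As a rough sketch of what that could look like with BWA and samtools, the Python snippet below only builds the command lines for one reference chromosome at a time and prints them; it does not run anything. The helper name `mapping_commands`, the thread count and all file names (`BX571965.fasta`, `reads_R1.fastq.gz`, and so on) are hypothetical placeholders, and default `bwa mem` settings are used, so any relaxed alignment parameters would still need to be chosen and added.

```python
import shlex


def mapping_commands(reference, reads_1, reads_2, out_bam, threads=4):
    """Return shell commands to index a reference, map paired reads and sort/index the BAM."""
    # Quote paths so that names containing spaces survive the shell.
    ref, r1, r2, bam = (shlex.quote(p) for p in (reference, reads_1, reads_2, out_bam))
    return [
        f"bwa index {ref}",
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
    ]


# Hypothetical file names: one FASTA per reference chromosome.
references = {
    "chr1": "BX571965.fasta",
    "chr2": "BX571966.fasta",
}

for label, ref_fasta in references.items():
    commands = mapping_commands(
        ref_fasta, "reads_R1.fastq.gz", "reads_R2.fastq.gz", f"{label}.sorted.bam"
    )
    for command in commands:
        print(command)
```

Pulling the mapped read pairs back out of each BAM for a per-chromosome assembly is a further step that is not shown here.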
I'm interested in comparative genomics (against clinical reference genomes, for essential and virulence genes). The BPk96243 genome that I'm interested in has both of its chromosomes (BX571965, BX571966) uploaded as separate entries. So I want to make something like that (I know this won't be a complete genome, but still).
"Comparative genomics" is not a task in this context; I'm talking about specifics.
If you only have short-read data, it's highly unlikely that you will get a complete, contiguous, closed assembly no matter what you do.
I want to have each chromosome in its own file (I do not want to join the chromosomes together).