Is it possible to merge two indexes created using BWA version 0.7.17 (https://github.com/lh3/bwa)? I need to create many BWA indexes, each of which includes a large genome and a smaller bacterial genome that varies from run to run. I need to do this many times as part of a pipeline, and each build takes about 40 minutes, even with a large value for the -b parameter (i.e. -b 1000000000000). Since the large reference genome is fixed and only the bacterial genome changes between runs, I'm looking for a way to index the large genome once and then combine that index with the index of each small bacterial genome.
I've come across several different programs that can merge BWT index files, such as https://github.com/holtjma/msbwt, https://github.com/jltsiren/bwt-merge, and https://github.com/felipelouza/egap. However, these programs do not seem to produce BWT files that are in the format required by the BWA read alignment tool.
When I run the command
bwa index -a bwtsw genome.fasta, I get five output files: genome.fasta.bwt, genome.fasta.pac, genome.fasta.ann, genome.fasta.amb, and genome.fasta.sa. Even if msbwt, bwt-merge, and egap (etc.) could produce BWT files formatted for BWA, I'm not sure how to merge the other file types (i.e. .pac, .ann, .amb, .sa). Does anyone know how to merge multiple BWA indexes?
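To make the list of files concrete, here is a minimal sketch of a check that reports which of the five index files are missing for a given prefix. The function name missing_index_files is hypothetical (it is not part of bwa), and the sketch assumes the default behaviour where the index prefix is the FASTA file name.

```shell
#!/bin/sh
# Hypothetical helper, not part of bwa: print every index file that
# `bwa index` is expected to write for PREFIX but that is not on disk.
missing_index_files() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        [ -f "$prefix.$ext" ] || printf '%s\n' "$prefix.$ext"
    done
}

# Prints nothing when genome.fasta is fully indexed.
missing_index_files genome.fasta
```

Any merge approach would have to leave all five of these files consistent with each other, not just the .bwt.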
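I learned that if you have a .bwt file in the format required by bwa, you can generate the remaining files using bwa's bwt2sa and fa2pac subcommands. bwt2sa builds the .sa file from the .bwt file, while fa2pac builds the .pac, .ann, and .amb files from the FASTA itself: it takes a FASTA file and an output prefix, not a .bwt file. The -f flag packs the forward strand only, which, as far as I can tell from the source, is what bwa index leaves on disk. For example, if you have genome.fasta and a bwt file named genome.fasta.bwt, you can run these commands:
bwa bwt2sa genome.fasta.bwt genome.fasta.sa and
bwa fa2pac -f genome.fasta genome.fasta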
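For context, here is a rough sketch of how these subcommands fit together. It is based on my reading of the bwa 0.7.17 source, where bwa index with -a bwtsw appears to run approximately this sequence; the subcommands are not listed in bwa's main help, so treat the order and arguments as assumptions and check each one's usage message before relying on it. The names bwa_index_steps, run_or_echo, and RUN are hypothetical. By default the script only prints the commands; set RUN=1 to execute them. It does not merge anything: it only shows where a .bwt produced by another tool would have to slot in (after the pac2bwtgen step).

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: print the command unless RUN=1 is set.
run_or_echo() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Hypothetical helper: the steps that appear to make up `bwa index -a bwtsw`.
# Usage: bwa_index_steps <in.fasta> [<out.prefix>]
bwa_index_steps() {
    fasta=$1
    prefix=${2:-$1}
    run_or_echo bwa fa2pac "$fasta" "$prefix"               # pack both strands: .pac, .ann, .amb
    run_or_echo bwa pac2bwtgen "$prefix.pac" "$prefix.bwt"  # build the BWT from the packed sequence
    run_or_echo bwa bwtupdate "$prefix.bwt"                 # add occurrence counts to the BWT
    run_or_echo bwa fa2pac -f "$fasta" "$prefix"            # repack forward strand only
    run_or_echo bwa bwt2sa "$prefix.bwt" "$prefix.sa"       # sampled suffix array from the BWT
}

bwa_index_steps genome.fasta
```

Note that bwt2sa is run after bwtupdate here; if a merged .bwt came from an external tool, it would presumably need the same update step before the suffix array could be computed.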