Error: [bwa_idx_load_from_disk] fail to locate index
4.8 years ago

Hey all, I've been working on a Bash script to map some genes to a reference sequence. I'm using BWA for that, and I keep getting an error message saying that BWA has failed to locate the index.

Here's my code below:

#!/bin/bash
bwa index $1
samtools faidx $1
bwa mem -t $1 *fastq.gz > aln_se.sam
samtools view -bt *fasta.fai aln_se.sam > aln_se.bam
samtools sort aln_se.bam -o aln.sorted.bam
samtools index aln.sorted.bam
samtools flagstat aln.sorted.bam

I've already indexed the FASTA file containing my sequence. The first few times I tried this, all the index files generated by bwa index and samtools faidx were in the same directory, /mnt/c/Users/#name/Documents/Project/Data.

Based on what I was reading on other bioinformatics forums, some have said moving their indexes to another directory has worked for them. Unfortunately it hasn't worked for me.

If anybody knows what's up with this error, or what is wrong with my code, I'd love to know. And do feel free to tell me if I can add more information to help clarify. Thanks for your help!

software error BWA Bash • 3.7k views

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

Thanks for doing so. This was my first time posting here, let alone posting my own code, so I'll keep that in mind next time.


At what step do you get the error? Have you tried running the script line by line to see what happens? Add set -eo pipefail near the top of the script (right after the shebang) to ensure that it quits as soon as a line fails.
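
To illustrate the suggestion, here is a minimal sketch of what set -eo pipefail changes. The names demo_body, run_lenient and run_strict are hypothetical, and false | cat merely stands in for a pipeline whose first step fails (for example, an aligner that cannot find its index).

```shell
# Stand-in for a failing pipeline step: 'false' fails while 'cat' succeeds.
demo_body='false | cat; echo "reached the next step"'

# Hypothetical helpers that run the same commands in a fresh bash process.
# Without strict mode the pipeline's status is that of 'cat', so the script carries on.
run_lenient() { bash -c "$demo_body"; }

# With -e and pipefail the failing pipeline ends the script before the echo.
run_strict()  { bash -c "set -eo pipefail; $demo_body"; }
```

Calling run_lenient prints the message and succeeds even though false failed, whereas run_strict stops at the failing pipeline and returns a non-zero status.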

4.8 years ago

It seems that you just need to change your bwa mem command. The -t parameter specifies the number of threads, but you have not given it a value. So BWA will try to interpret your FASTA reference ($1) as the number of threads (which makes no sense) and will then take the next argument, the first FASTQ file, as the index prefix, which would explain the "fail to locate index" error.

You could just try:

bwa mem -t 4 $1 *fastq.gz > aln_se.sam
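
Why the missing thread count shifts every argument can be sketched with Bash's getopts. This is only an illustration of getopt-style parsing, not bwa's actual source; the function name parse_like_bwa_mem and the file names in the usage note are hypothetical.

```shell
# Hypothetical stand-in for getopt-style parsing: -t requires a value,
# so it consumes whatever token comes next on the command line.
parse_like_bwa_mem() {
    local OPTIND=1 opt threads=1
    while getopts "t:" opt; do
        case $opt in
            t) threads=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    # The first remaining argument is what would be treated as the index prefix
    echo "threads=$threads index_prefix=$1"
}
```

Here parse_like_bwa_mem -t ref.fa reads_1.fastq.gz reports reads_1.fastq.gz as the index prefix, while parse_like_bwa_mem -t 4 ref.fa reads_1.fastq.gz reports ref.fa.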

I will assume that you are comfortable using an asterisk in a command like this. I have never run a command like this, just for the record.

Kevin


Agreed. It is recommended to use Unix pipes to avoid unnecessary intermediate files:

bwa mem (options...) | samtools sort -o sorted.bam
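
Spelled out a little further, the pipe might look like the sketch below. The wrapper name align_and_sort and its argument order are assumptions made for illustration; bwa mem writes SAM to standard output, and the trailing - asks samtools sort to read from standard input.

```shell
# Hypothetical wrapper: align_and_sort REF THREADS OUT_BAM READS...
align_and_sort() {
    local ref=$1 threads=$2 out=$3
    shift 3
    # Stream the alignments straight into the sorter; no intermediate SAM file is written
    bwa mem -t "$threads" "$ref" "$@" | samtools sort -o "$out" -
}
```

A call such as align_and_sort "$1" 4 aln.sorted.bam reads.fastq.gz could then be followed by samtools index aln.sorted.bam. With set -o pipefail enabled in the calling script, a failure in bwa mem is not masked by a successful samtools sort.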

There is also no need to index the reads you align; only the reference genome needs to be indexed, by bwa itself.
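
Since only the reference needs an index, one way to avoid rebuilding it on every run is to check for the files that bwa index writes next to the reference (.amb, .ann, .bwt, .pac and .sa). A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: succeeds only if all five bwa index files exist for reference $1.
bwa_index_present() {
    local ref=$1 ext
    for ext in amb ann bwt pac sa; do
        if [ ! -f "${ref}.${ext}" ]; then
            echo "missing index file: ${ref}.${ext}" >&2
            return 1
        fi
    done
    return 0
}
```

In the script this could be used as bwa_index_present "$1" || bwa index "$1", so the index is built only when one of the files is missing.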

ADD REPLY
