Question: bwa index not generating, taking long time
ccha97 wrote:

Hi there, I'm using a program called Taiji which utilises BWA. Currently it is generating a BWA index for my ATAC-seq data, and I am wondering how long this is supposed to take. It has been on this step for more than 12 hours, and I only have 6 narrowPeak files (mouse samples) as input. At first I thought it was a memory issue, so I reran the entire thing on a new disk (which still has 142 GB available), but I still get the same issue. Any suggestions?

[bwa_index] Pack forward-only FASTA... 18.27 sec


index bwa alignment assembly • 138 views
modified 27 days ago by Mensur Dlakic • written 27 days ago by ccha97

Does the pipeline allow you to provide external indices? If so, build them externally first. It is unfortunate that a wrapper tries to do all of these things in a single run: if something fails, it has to start over from scratch.
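Building the index once, outside the wrapper, could look roughly like the sketch below. It only assumes that bwa is on the PATH; the FASTA path is hypothetical, and whether Taiji can be pointed at a prebuilt index is not confirmed here. The helper names are made up for illustration. The check relies on the five files that bwa index writes next to the FASTA (.amb, .ann, .bwt, .pac, .sa).

```shell
#!/bin/sh
# Sketch: build the BWA index once and reuse it on later runs.
# Function names are illustrative, not part of bwa or Taiji.

# Return 0 if all five files written by `bwa index` exist and are non-empty.
bwa_index_complete() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        [ -s "$ref.$ext" ] || return 1
    done
    return 0
}

# Build the index only when it is missing or incomplete.
ensure_bwa_index() {
    ref="$1"
    if bwa_index_complete "$ref"; then
        echo "index already present for $ref"
        return 0
    fi
    if ! command -v bwa >/dev/null 2>&1; then
        echo "bwa not found in PATH" >&2
        return 1
    fi
    # Same algorithm flag as in the CMD line of the sample log in this thread.
    bwa index -a bwtsw "$ref"
}
```

A call would then be something like `ensure_bwa_index /path/to/mm10.fa` (path hypothetical). An interrupted build leaves some of the five files missing, so the check fails and the index is rebuilt on the next call.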

written 27 days ago by ATpoint
Istvan Albert wrote:

The index should be built on the reference genome, not on your data.

written 27 days ago by Istvan Albert

Thank you for your answer. I didn't know that, but the program (Taiji) is running it automatically so I assume it's being built on the mm10 reference genome (?) as the reference genome was one of the inputs. Is there any other possible explanation as to why it's stuck on this specific line?

modified 27 days ago • written 27 days ago by ccha97
Mensur Dlakic wrote:

"Is there any other possible explanation as to why it's stuck on this specific line?"

It has completed that line, so it is stuck on what comes next. Here are all bwa index lines when indexing a small file:

[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=3718406, availableWord=4724070
[bwt_gen] Finished constructing BWT in 5 iterations.
[bwa_index] 1.11 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.20 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index -a bwtsw group_00.fa
[main] Real time: 1.638 sec; CPU: 1.342 sec
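If the pipeline's output is saved to a file, a quick way to see which stage was last reported is to pull out the final [bwa_index] line. This is a minimal sketch; the function name and the log file name are hypothetical, and it assumes the bwa messages were captured in that file.

```shell
#!/bin/sh
# Print the most recent [bwa_index] line from a saved log file.
# -F makes grep treat the brackets literally rather than as a character class.
last_bwa_stage() {
    grep -F '[bwa_index]' "$1" | tail -n 1
}
```

For example, `last_bwa_stage taiji.log` (file name hypothetical). In the sample log above, the line that follows "Pack forward-only FASTA" is "Construct SA from BWT and Occ", so a run whose last reported line is the former would be working on the latter.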

Since the first bwa index line in my file took about 1 s (total time was 1.6 s) and yours took >3500 s, by extrapolation the whole process in your case should take about 1.5 hours. This assumes that indexing time is linear (I don't know if that's true) and that you have enough RAM to index this file in memory (I don't know if that's true either). If your computer is short on memory, it may be swapping, which can take a very long time.
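The extrapolation is just a ratio, and can be reproduced as below. This is a rough sketch under the same linearity assumption; the function name is made up, and the inputs are the rounded figures quoted in this post (about 1 s for the step, 1.6 s total, and >3500 for the same step on the large run).

```shell
#!/bin/sh
# Scale the small run's total time by the ratio of the step times,
# then convert seconds to hours. Assumes run time scales linearly.
extrapolate_hours() {
    awk -v small_step="$1" -v small_total="$2" -v big_step="$3" \
        'BEGIN { printf "%.2f\n", small_total * (big_step / small_step) / 3600 }'
}

extrapolate_hours 1.0 1.6 3500   # prints 1.56
```

Anything far beyond that estimate, such as 12+ hours, suggests something other than plain indexing time is involved.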

I suggest you find out how much your computer is swapping during this indexing operation:

swapon -s

This is what my computer shows at the moment:

Filename                                Type            Size    Used    Priority
/dev/sda1                               partition       97654780        280576  -2

As you can see, swap disk utilization here is less than 1%, and I suspect yours will be much higher. Alternatively, try free -m, which shows both RAM and swap usage:

              total        used        free      shared  buff/cache   available
Mem:         257873      139270       54222           8       64380      116872
Swap:         95365         274       95091
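To turn the free -m output into a single number that can be checked repeatedly while the index is being built, something like the following sketch could be used. The function name is made up; it simply reads the Swap: row (total in column 2, used in column 3) from standard input.

```shell
#!/bin/sh
# Read `free -m` output on stdin and print swap usage as a percentage.
# Prints 0.0 when no swap is configured (total of 0).
swap_used_pct() {
    awk '/^Swap:/ { if ($2 > 0) printf "%.1f\n", 100 * $3 / $2; else print "0.0" }'
}

# Live check, only where the free command is available.
if command -v free >/dev/null 2>&1; then
    free -m | swap_used_pct
fi
```

With the numbers shown above (274 of 95365 MB used) this gives 0.3. Running vmstat 5 alongside is another option: sustained non-zero values in its si/so columns while bwa index is running would point to active swapping.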
modified 27 days ago • written 27 days ago by Mensur Dlakic