Question: Gmap aligner taking too long
KVC_bioinfo350 (Boston) wrote, 16 months ago:

Hello all,

I am running GMAP to align nanopore reads to the human genome. The query file is approximately 2 GB. The alignment is taking extremely long: after more than 12 hours, the output .sam file is only 130 MB so far.

I am using the following command:

path/to/gmapl -D /path/to/dir/ONT -d ONT /path/to/sample/fasta -t 4 -n 0 -f samse > /path/to/output/output.sam

Am I missing anything here?

Could someone please help me? Thanks.

aligner gmap • 736 views

Asking why program X is taking too long has no good answer as long as the program is still running and producing output. Since you are likely running this for the first time, you also have no past results to compare against. It could be your data, your hardware, and/or options that you are missing or misusing (the mmap case).
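
One way to get a reference point while the job is still running is to compare how many reads have reached the SAM output with how many are in the input FASTA. The following is a minimal sketch, not part of GMAP: the function names are hypothetical, it assumes every processed read produces at least one SAM line, and it assumes throughput stays roughly constant, which is only a rough approximation for nanopore reads of very different lengths.

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines (lines starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_sam_queries(path):
    """Count distinct query names in a SAM file, skipping '@' header lines."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("@") or not line.strip():
                continue
            # QNAME is the first tab-separated field of an alignment line.
            names.add(line.split("\t", 1)[0])
    return len(names)


def estimate_remaining_hours(done, total, elapsed_hours):
    """Extrapolate the remaining run time from the reads finished so far.

    Assumption: constant throughput. Returns None when nothing is done yet.
    """
    if done == 0:
        return None
    return (total - done) * elapsed_hours / done
```

For example, calling `estimate_remaining_hours(count_sam_queries("output.sam"), count_fasta_records("sample.fasta"), 12.0)` after 12 hours gives a rough projection; the two file names here are placeholders for the paths in the original command.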

modified 16 months ago • written 16 months ago by genomax64k

I was trying to understand whether something in my command is missing or wrong and could explain why it is taking so long. I thought someone who has already used GMAP might be able to recognize it. About mmap: I do not understand what that is, but it seems like it might slow the process.

So all I am trying to do here is understand whether I am missing anything. Thank you.

written 16 months ago by KVC_bioinfo350

Have you tried minimap2? Fast and can do spliced alignment.

written 16 months ago by WouterDeCoster37k

Not yet. I need GMAP results because this is part of a comparison of aligners.

written 16 months ago by KVC_bioinfo350

Can anyone who has used the GMAP aligner before recognize anything wrong in my command?

I am still struggling with it.

written 16 months ago by KVC_bioinfo350

Have you tried just splitting up your input file of reads into 20, 50, or 100 subsets and submitting them as 20, 50, or 100 jobs to a cluster?
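
The splitting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_fasta` and the `chunk_NNN.fasta` naming are hypothetical, the input is assumed to be plain (uncompressed) FASTA, and records are dealt out round-robin so that each chunk gets a similar number of reads.

```python
import os


def split_fasta(fasta_path, n_chunks, out_dir):
    """Distribute FASTA records round-robin into n_chunks files.

    A record is a header line starting with '>' plus every line up to the
    next header, so multi-line sequences stay intact. Returns the chunk paths.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, f"chunk_{i:03d}.fasta") for i in range(n_chunks)]
    outputs = [open(p, "w") for p in paths]
    try:
        current = None      # output file of the record being copied
        record_index = -1   # index of the current record in the input
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    record_index += 1
                    current = outputs[record_index % n_chunks]
                # Lines before the first header are skipped.
                if current is not None:
                    current.write(line)
    finally:
        for out in outputs:
            out.close()
    return paths
```

Each chunk can then be passed to the same gmapl command as a separate cluster job. The per-chunk SAM files need to be combined afterwards, taking care not to repeat the header lines.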

written 16 months ago by Philipp Bayer6.0k

I did not try that, but I made a file with a subset of a few thousand reads from the original to check whether it works, and that is also taking forever.

modified 16 months ago • written 16 months ago by KVC_bioinfo350

But I will try doing it the way you suggested.

written 16 months ago by KVC_bioinfo350

I tried splitting it; it is still taking extremely long.

written 16 months ago by KVC_bioinfo350

Hello, I tried doing that. It still took forever, so I had to kill the job. I assume GMAP does not work well with 1D reads.

written 16 months ago by KVC_bioinfo350

@WouterDeCoster: I am currently trying minimap. Thank you for the suggestion.

modified 16 months ago • written 16 months ago by KVC_bioinfo350

I just came across this in the manual. I am not sure what mmap and allocate are. It mentions "If mmap not available and allocate not chosen, then will use fileio (very slow)". Is that what is happening here?

Computation options

-B, --batch=INT    Batch mode (default = 2)

                   Mode        Offsets    Positions        Genome
                     0         see note   mmap             mmap
                     1         see note   mmap & preload   mmap
         (default)   2         see note   mmap & preload   mmap & preload
                     3         see note   allocate         mmap & preload
                     4         see note   allocate         allocate
                     5         expand     allocate         allocate

                   Note: For a single sequence, all data structures use mmap
                   If mmap not available and allocate not chosen, then will use fileio (very slow)

                   Note about --batch and offsets: Expansion of offsets can be controlled
                   independently by the --expand-offsets flag.  The --batch=5 option is equivalent
                   to --batch=4 plus --expand-offsets=1
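
For context on the table above: mmap maps the index files into the process's address space so the operating system loads pages on demand, while allocate reads them fully into RAM. Whether the allocate modes (3 to 5) are practical depends on whether the index fits in memory. The sketch below is a rough check, not a GMAP feature. The function names and the 0.8 headroom factor are hypothetical, it assumes the index lives in a single directory (for the command above, presumably the -D directory plus the -d name, which should be verified), and it treats on-disk size as a proxy for memory use.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path (recursively)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def physical_memory_bytes():
    """Total physical RAM, using sysconf names that are available on Linux."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_memory(index_bytes, memory_bytes, headroom=0.8):
    """True if the index is no larger than a headroom fraction of memory.

    The 0.8 default is an arbitrary safety margin, not a GMAP recommendation.
    """
    return index_bytes <= headroom * memory_bytes
```

If `fits_in_memory(directory_size_bytes(index_dir), physical_memory_bytes())` is false, the allocate modes are unlikely to help, and the machine may already be short of memory for the default mmap & preload mode.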
modified 16 months ago • written 16 months ago by KVC_bioinfo350