Oases pipeline fails on second kmer
8.3 years ago
katie.duryea ▴ 40

Hello all,

I am hoping for help running oases_pipeline.py. I can run the pipeline on a subset of my data for 4 k-mer values using the following command:

python oases_pipeline.py -m 21 -M 27 -s 2 -o oases_test -d '-fastq -shortPaired -separate trimm_15_F_paired.fq trimm_15_R_paired.fq' -p '-ins_length 160'

This runs successfully and produces output for K=21 through K=27. However, when I run the pipeline on my full dataset (538769828 sequences) with a larger range of K, it fails. This is the command I used:

python oases_pipeline.py -m 21 -M 51 -s 2 -o oases_ALL -d '-fastq -shortPaired -separate ALL_F_paired.fq ALL_R_paired.fq' -p '-ins_length 160'

This command runs successfully for K=21, but then crashes on K=23 with this output:

[5141.379366] Inputting sequence 66000000 / 538769828
[5163.243577] Inputting sequence 67000000 / 538769828
[5170.824776]  === Sequences loaded in 997.337692 s
[5171.829179] Done inputting sequences
[5171.829187] Destroying splay table
[5173.870477] Splay table destroyed
[5175.177294] Command failed!
[5175.177304] rm -f oases_ALL_23/Sequences
Hash failed

I am at a loss as to why it runs on a subset of the data and for the first K, but crashes on the second.
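
One way to narrow this down might be to run the hashing step by hand for the failing K alone, outside the pipeline. The following is only a minimal sketch: it assumes the pipeline names each output directory `<prefix>_<k>` (as the `oases_ALL_23/Sequences` path in the log suggests) and that `velveth` is called with the directory, the hash length, and then the same data arguments passed via `-d`. The helper name `velveth_command` is hypothetical, and the script only prints the command rather than running it.

```python
import shlex


def velveth_command(out_prefix, k, data_args):
    """Build a single-k velveth command line as a list of arguments.

    Assumption: the output directory is named "<out_prefix>_<k>",
    mirroring the oases_ALL_23 directory seen in the log.
    """
    directory = f"{out_prefix}_{k}"
    # shlex.split turns the quoted -d string into separate arguments
    return ["velveth", directory, str(k)] + shlex.split(data_args)


data = "-fastq -shortPaired -separate ALL_F_paired.fq ALL_R_paired.fq"
cmd = velveth_command("oases_ALL", 23, data)
print(" ".join(cmd))
```

If the printed command fails in the same way when run directly, the problem would seem to lie in the hashing step for that K rather than in the pipeline script itself.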

Many thanks in advance for any input!

RNA-Seq • 1.4k views
8.3 years ago
katie.duryea ▴ 40

UPDATE: I got this to run by changing the step size (-s) to 4. I'm not sure why that worked, so I will leave this up here in case someone else encounters this problem or has input on what is happening.
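
For what it's worth, here is a small sketch of which K values each step size visits. It is plain arithmetic, not the pipeline's own code: the helper `kmer_values` is hypothetical, and whether the pipeline treats the `-M` bound as inclusive is an assumption exposed as the `inclusive` flag.

```python
def kmer_values(k_min, k_max, step, inclusive=True):
    """List the k values from k_min up to k_max in increments of step.

    Assumption: whether k_max itself is included depends on the tool;
    the `inclusive` flag lets you check both cases.
    """
    stop = k_max + 1 if inclusive else k_max
    return list(range(k_min, stop, step))


step2 = kmer_values(21, 51, 2)
step4 = kmer_values(21, 51, 4)

print(step2)        # every odd k starting at 21
print(step4)        # every fourth k starting at 21
print(23 in step4)  # whether the k that crashed is still attempted
```

With a step of 4 the sequence goes from 21 straight to 25, so K=23 is never attempted regardless of how the upper bound is handled. The larger step may therefore have sidestepped the failing K rather than fixed whatever caused the crash.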
