Tool: nanotimeparse - a bash solution for parsing Oxford Nanopore fastq by read generation time
4.5 years ago
raplayer ▴ 10

As the title suggests, nanotimeparse is a new bash tool (its only external dependency is GNU parallel) that splits an Oxford Nanopore fastq file by read generation time.

You can git clone it here: https://github.com/raplayer/nanotimeparse

nanotimeparse takes (-i) an Oxford Nanopore Technologies (ONT) basecalled fastq file and subsets its reads in slices of (-s) minutes over a period of (-p) minutes. The input fastq is written out as 2 sets of n fasta files (n = p/s). In set 1, each file contains all reads generated from the start of the ONT run up to that time slice (cumulative). In set 2, each file contains only the reads newly generated within that time slice. For example, -s 60 and -p 2880 (hourly slices over 48 hours) give n = 48 files per set.
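To make the two output sets concrete, here is a minimal Python sketch of the slicing idea. It is not nanotimeparse's actual bash implementation. It assumes each fastq header carries a `start_time=` field (as ONT basecallers write), that the run start is the earliest read start time, and that slices are half-open intervals; the function names are made up for illustration.

```python
import re
from datetime import datetime

# ONT basecalled fastq headers carry a field like start_time=2019-05-01T12:00:00Z
START_RE = re.compile(r"start_time=(\S+)")


def read_start_time(header):
    """Extract the read start time from a fastq header line."""
    match = START_RE.search(header)
    if match is None:
        raise ValueError("no start_time field in header: " + header)
    # Keep only YYYY-MM-DDTHH:MM:SS so trailing 'Z' or offsets are ignored.
    return datetime.strptime(match.group(1)[:19], "%Y-%m-%dT%H:%M:%S")


def slice_reads(headers, slice_min, period_min):
    """Return (cumulative, incremental), each a list of n = p/s lists of headers.

    Assumptions (illustrative only): the run starts at the earliest read,
    slice k covers minutes [k*s, (k+1)*s), and reads after the period are dropped.
    """
    n = period_min // slice_min
    timed = [(read_start_time(h), h) for h in headers]
    run_start = min(t for t, _ in timed)
    cumulative = [[] for _ in range(n)]
    incremental = [[] for _ in range(n)]
    for t, h in timed:
        elapsed_min = (t - run_start).total_seconds() / 60
        idx = int(elapsed_min // slice_min)
        if idx >= n:
            continue  # generated after the requested period
        incremental[idx].append(h)      # set 2: only new reads in this slice
        for j in range(idx, n):
            cumulative[j].append(h)     # set 1: everything up to each slice
    return cumulative, incremental


if __name__ == "__main__":
    demo = [
        "@r1 runid=x start_time=2019-05-01T12:00:00Z",
        "@r2 runid=x start_time=2019-05-01T12:30:00Z",
        "@r3 runid=x start_time=2019-05-01T13:01:00Z",
    ]
    cum, inc = slice_reads(demo, slice_min=60, period_min=120)
    print([len(x) for x in cum], [len(x) for x in inc])
```

Each inner list corresponds to one output fasta file in the respective set.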

As for runtime: with 10 threads (@2.1GHz), slicing every hour over a 48 hour period, nanotimeparse takes about 30 minutes to parse 1.2M reads (~15GB) into both sets. Memory peaks at roughly (15GB/2)*10 = 75GB (see 'Memory Considerations' in the README). You can lower memory consumption by reducing the number of threads.
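The memory figure above can be turned into a quick back-of-the-envelope calculator. This sketch assumes the (fastq size / 2) * threads relationship from the example generalizes to other inputs, which is an extrapolation from the single data point given here; check the README's 'Memory Considerations' before relying on it. The function names are hypothetical.

```python
def estimate_peak_memory_gb(fastq_size_gb, threads):
    """Rough peak memory, assuming (fastq size / 2) per thread as in the post."""
    return (fastq_size_gb / 2) * threads


def max_threads_for_budget(fastq_size_gb, memory_budget_gb):
    """Largest thread count whose estimated peak fits the budget (at least 1)."""
    per_thread_gb = fastq_size_gb / 2
    return max(1, int(memory_budget_gb // per_thread_gb))


if __name__ == "__main__":
    print(estimate_peak_memory_gb(15, 10))   # the 15GB, 10-thread case from the post
    print(max_threads_for_budget(15, 32))    # threads that fit a 32GB machine
```

Under that assumption, a 15GB fastq on a 32GB machine would be limited to about 4 threads.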

fastq long-reads oxford-nanopore