Question

Tool: nanotimeparse - a bash solution for parsing Oxford Nanopore fastq by read generation time

4.5 years ago

raplayer ▴ 10

As the title suggests, nanotimeparse is a new bash tool (its only external dependency is GNU parallel) that splits an Oxford Nanopore fastq file by read generation time.

You can git clone it here: https://github.com/raplayer/nanotimeparse

nanotimeparse takes an Oxford Nanopore Technologies (ONT) basecalled fastq file (-i) and subsets its reads into slices of (-s) minutes over a period of (-p) minutes. The input fastq is written out as 2 sets of n fasta files, where n = p/s. Set 1 is cumulative: each file contains all reads generated from the start of the ONT run up to that time slice. Set 2 is incremental: each file contains only the reads newly generated since the previous time slice.
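To make the two output sets concrete, here is a rough Python sketch of the slicing logic. It is not the tool's actual bash implementation. It assumes each fastq header carries a `start_time=<ISO 8601 timestamp>` field (as basecalled ONT reads typically do), that time is measured from the earliest read in the file, and that a read landing exactly on a slice boundary goes into the later slice. The function names `parse_start_time` and `slice_reads` are made up for this illustration.

```python
import re
from datetime import datetime

# Assumption: the header contains a field like start_time=2019-01-01T00:05:00Z
START_TIME = re.compile(r"start_time=(\S+)")


def parse_start_time(header):
    """Return the read's start_time as a datetime, or None if the field is absent."""
    match = START_TIME.search(header)
    if match is None:
        return None
    # Replace a trailing 'Z' so fromisoformat also accepts it on older Pythons.
    return datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))


def slice_reads(records, slice_min, period_min):
    """Split (header, sequence) records into n = period_min // slice_min slices.

    Returns (cumulative, incremental):
      cumulative[i]  -> all reads from the start of the run up to slice i (set 1)
      incremental[i] -> only the reads that fall inside slice i (set 2)
    """
    n = period_min // slice_min
    timed = [(parse_start_time(h), h, s) for h, s in records]
    timed = [t for t in timed if t[0] is not None]
    incremental = [[] for _ in range(n)]
    if not timed:
        return [[] for _ in range(n)], incremental

    # Assumption: the run starts at the earliest read's start_time.
    run_start = min(t[0] for t in timed)
    for start, header, seq in timed:
        elapsed_min = (start - run_start).total_seconds() / 60
        idx = int(elapsed_min // slice_min)
        if idx < n:  # reads generated after the period are dropped
            incremental[idx].append((header, seq))

    # Set 1 is the running union of set 2.
    cumulative, seen = [], []
    for chunk in incremental:
        seen = seen + chunk
        cumulative.append(seen)
    return cumulative, incremental


if __name__ == "__main__":
    demo = [
        ("@r1 runid=abc start_time=2019-01-01T00:05:00Z", "ACGT"),
        ("@r2 runid=abc start_time=2019-01-01T01:10:00Z", "GGCC"),
        ("@r3 runid=abc start_time=2019-01-01T02:30:00Z", "TTAA"),
    ]
    cum, inc = slice_reads(demo, slice_min=60, period_min=180)
    print([len(c) for c in cum], [len(i) for i in inc])
```

With hourly slices over 3 hours, each demo read lands in its own slice, so the incremental files hold one read each while the cumulative files grow by one read per slice.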

As for runtime: with 10 threads (at 2.1 GHz), slicing every hour over a 48-hour period, nanotimeparse takes about 30 minutes to parse 1.2M reads (~15 GB) into both sets. Memory peaks at roughly (15 GB / 2) × 10 ≈ 75 GB (see 'Memory Considerations' in the README). You can reduce memory consumption by reducing the number of threads.
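If you want to size a run before launching it, the memory arithmetic above can be wrapped in a couple of lines of Python. This is only a back-of-the-envelope sketch: it generalizes the single example in this post, (input size / 2) × threads, and the helper names are hypothetical. The 'Memory Considerations' section of the README remains the authoritative source.

```python
def estimate_peak_memory_gb(fastq_size_gb, threads):
    """Rule of thumb from the post: (input size / 2) * number of threads."""
    return (fastq_size_gb / 2) * threads


def max_threads_for(fastq_size_gb, memory_budget_gb):
    """Largest thread count whose estimated peak memory fits in the budget."""
    per_thread_gb = fastq_size_gb / 2
    return int(memory_budget_gb // per_thread_gb)


if __name__ == "__main__":
    print(estimate_peak_memory_gb(15, 10))  # the 15 GB, 10-thread case above
    print(max_threads_for(15, 32))          # threads that fit in 32 GB
```

For the 15 GB example, 10 threads gives the 75 GB figure quoted above, and a 32 GB machine would fit 4 threads under the same assumption.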

fastq long-reads oxford-nanopore • 1.0k views

updated 10 months ago by Ram 43k • written 4.5 years ago by raplayer ▴ 10