I would like to announce a set of Python scripts and modules I have written for analysis and processing of long read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. They can be found on GitHub and can be installed using pip and conda.
Collectively they're called NanoPack; installing NanoPack installs all of the scripts at once.
NanoComp: Comparing multiple runs on read length and quality, based on reads (fastq), alignments (bam) or albacore summary files.
NanoQC: Generating plots to investigate nucleotide composition and quality distribution at the end of reads.
NanoFilt: Streaming script for filtering a fastq file based on a minimum length and a minimum quality cut-off. Trimming nucleotides from either end of the reads is also an option.
NanoStat: Quickly create a statistical summary from reads, an alignment or a summary file.
nanoget: Functions for extracting features from reads, alignments and albacore summary data.
nanomath: Functions for mathematical processing and calculating statistics.
nanoplotter: Plotting functions, relying heavily on the seaborn module and, for some plots, also on plotly and bokeh.
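To give a feel for what streaming length and quality filtering with end trimming involves, here is a minimal Python sketch. It is not the NanoFilt code and does not use the NanoPack API: the function names and the parameters trim_start and trim_end are hypothetical, and it assumes four-line fastq records, Phred+33 quality encoding, a mean quality computed by averaging error probabilities, and trimming applied before filtering.

```python
import math
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from four-line fastq input (assumption: no wrapped lines)."""
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    # Pull four consecutive lines per record from the same iterator;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        yield header, seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score, averaged over error probabilities (an assumption)."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(char) - offset) / 10) for char in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(
    records: Iterable[Record],
    min_length: int = 0,
    min_quality: float = 0.0,
    trim_start: int = 0,  # hypothetical name: bases removed from the read start
    trim_end: int = 0,    # hypothetical name: bases removed from the read end
) -> Iterator[Record]:
    """Trim each read, then keep it if it passes both cut-offs."""
    for header, seq, qual in records:
        end = max(len(seq) - trim_end, 0)
        seq, qual = seq[trim_start:end], qual[trim_start:end]
        if seq and len(seq) >= min_length and mean_quality(qual) >= min_quality:
            yield header, seq, qual
```

Because every step is a generator, reads are processed one at a time, so the sketch could be fed from sys.stdin and would never hold the whole file in memory.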
I welcome all feedback, bug reports, suggestions and feature requests!
This set of scripts has been published in Bioinformatics.