Question: Which C++ Libraries Are Best For Dealing With Fastq Files?
Jeremy Leipzig (19k), Philadelphia, PA, wrote 10.2 years ago:

I would like to rewrite some Perl scripts in something faster. I haven't written C++ since the Clinton administration. Granted, I am not married to C++ per se, but I need something that benchmarks well.

Which C++ libraries are people using to deal with NGS data?

fastq next-gen C sequencing • 7.6k views
modified 22 months ago by cartoonist (60) • written 10.2 years ago by Jeremy Leipzig (19k)

Which OS/CPU architecture do you need it for?

written 10.2 years ago by Phis (1.0k)

RHEL5, x86_64.

written 10.2 years ago by Jeremy Leipzig (19k)
Michael Dondrup (47k), Bergen, Norway, wrote 10.2 years ago:

I saw a SeqAn poster at ISMB last year. I have no experience with the library (nor with C++) myself, but they support the FASTQ format and they gave the impression of being quite competent.

SeqAn file formats

written 10.2 years ago by Michael Dondrup (47k)
Daniel Swan (13k), Aberdeen, UK, wrote 10.2 years ago:

I admit I wouldn't dream of doing this myself; I tend to handle FASTQ files with applications other people develop.

However, there is a FASTA/FASTQ C++ parser here:

http://lh3lh3.users.sourceforge.net/parsefastq.shtml which might serve as a base for what you want to do.

It's from Heng Li, who also works on SAMtools, BWA and MAQ.

written 10.2 years ago by Daniel Swan (13k)

Boooh: in http://lh3lh3.users.sourceforge.net/kseq.shtml ANY malloc should be checked against NULL (lines 56, 121, 188 ...) :-(

written 10.2 years ago by Pierre Lindenbaum (128k)

+1 - Highly recommended. That header supports compressed files, too, which speeds up IO-bound processing. One caveat for C++ though - that header wants char* and FILE*, not C++ strings and iostreams, but that's easy enough to manage.
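To give an idea of what bridging C-style file handles and buffers to C++ strings can look like, here is a minimal sketch. It does not use kseq.h and is not that header's API: `FastqRecord`, `read_line` and `read_fastq_record` are hypothetical names made up for illustration, it uses only the standard library, and it assumes plain four-line FASTQ records (sequence and quality each on a single line, no compression).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical record type for illustration; not part of kseq.h.
struct FastqRecord {
    std::string name;  // whole header line after the leading '@'
    std::string seq;   // bases
    std::string qual;  // quality string, same length as seq
};

// Read one line from a C FILE* into a std::string, dropping the trailing
// newline (and a carriage return, if present).
// Returns false only when end of file is hit before anything was read.
bool read_line(std::FILE* fp, std::string& out) {
    out.clear();
    char buf[4096];
    while (std::fgets(buf, sizeof buf, fp)) {
        out += buf;  // copy the C buffer into the C++ string
        if (!out.empty() && out.back() == '\n') {
            out.pop_back();
            if (!out.empty() && out.back() == '\r') out.pop_back();
            return true;
        }
    }
    return !out.empty();  // last line without a trailing newline
}

// Read one four-line FASTQ record.
// Returns false at end of file or when the record is malformed.
bool read_fastq_record(std::FILE* fp, FastqRecord& rec) {
    assert(fp != nullptr);
    std::string header, plus;
    if (!read_line(fp, header)) return false;
    if (header.empty() || header[0] != '@') return false;
    if (!read_line(fp, rec.seq)) return false;
    if (!read_line(fp, plus) || plus.empty() || plus[0] != '+') return false;
    if (!read_line(fp, rec.qual)) return false;
    if (rec.qual.size() != rec.seq.size()) return false;
    rec.name = header.substr(1);
    return true;
}
```

A caller would open the file with std::fopen, loop with `while (read_fastq_record(fp, rec)) { ... }`, and close it with std::fclose. Copying every line into a std::string is convenient but not free, so a real parser would likely reuse its buffers between records.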

written 10.2 years ago by Jonathan Manning (640)
Manuel (390), Germany, wrote 9.7 years ago:

If you work with NGS data and want to try SeqAn as Michael already suggested, have a look at this tutorial for importing read data. Also, their documentation has greatly improved recently.

written 9.7 years ago by Manuel (390)
Luiz Irber (0), United States, wrote 5.6 years ago:

Another option is SeqDB.

written 5.6 years ago by Luiz Irber (0)
cartoonist (60), Germany, wrote 22 months ago:

Since I could not find any C++ library that met my requirements, I rewrote the kseq library (by @lh3) in C++ using templates and called it kseq++. Here I compare its performance with the original kseq and SeqAn:

https://github.com/cartoonist/kseqpp

SeqAn uses its own string class. If you do not use that class in the rest of your code, converting back to std::string is really expensive (3x slower on my workstation).

modified 4 months ago • written 22 months ago by cartoonist (60)