Question: Which C++ Libraries Are Best For Dealing With Fastq Files?
9
Jeremy Leipzig (19k), Philadelphia, PA, wrote 10.9 years ago:

I would like to rewrite some Perl scripts in something faster. I haven't written C++ since the Clinton administration. Granted, I am not married to C++ per se, but I need something that benchmarks well.

Which C++ libraries are people using to deal with NGS data?

fastq next-gen C sequencing • 8.2k views
modified 2.6 years ago by cartoonist (80) • written 10.9 years ago by Jeremy Leipzig (19k)

Which OS/CPU architecture do you need it for?

written 10.9 years ago by Phis (1.1k)

RHEL5 x86_64 ..

written 10.9 years ago by Jeremy Leipzig (19k)
10
Michael Dondrup (48k), Bergen, Norway, wrote 10.9 years ago:

I saw a SeqAn poster at ISMB last year. I have no experience with the library (or with C++) myself, but they support the FASTQ format and they gave the impression of being quite competent.

SeqAn file formats

written 10.9 years ago by Michael Dondrup (48k)
8
User 59 (13k) wrote 10.9 years ago:

I admit I wouldn't dream of doing this myself; I tend to handle FASTQ files with applications other people develop.

However, there is a FASTA/FASTQ C++ parser here:

http://lh3lh3.users.sourceforge.net/parsefastq.shtml which might serve as a base for what you want to do.

It's from Heng Li, who also works on SAMtools, BWA and MAQ.

written 10.9 years ago by User 59 (13k)
2

Boo! In http://lh3lh3.users.sourceforge.net/kseq.shtml, ANY malloc should be checked against NULL (lines 56, 121, 188 ...) :-(

written 10.9 years ago by Pierre Lindenbaum (134k)
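For readers wondering what that check looks like in practice, here is a minimal sketch. It is not code from kseq.h: `checked_malloc` and `checked_strdup` are hypothetical helper names, and exiting on failure is just one possible policy (returning an error code to the caller is another).

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of kseq.h): allocate n bytes or exit with a
// message, so callers never dereference a NULL pointer.
static void* checked_malloc(std::size_t n) {
    void* p = std::malloc(n ? n : 1);  // malloc(0) may legally return NULL
    if (p == NULL) {
        std::fprintf(stderr, "out of memory allocating %zu bytes\n", n);
        std::exit(EXIT_FAILURE);
    }
    return p;
}

// Hypothetical usage: duplicate a C string through the checked allocator.
// The caller owns the result and must release it with std::free.
static char* checked_strdup(const char* s) {
    assert(s != NULL);  // precondition: the caller passes a valid string
    const std::size_t len = std::strlen(s);
    char* copy = static_cast<char*>(checked_malloc(len + 1));
    std::memcpy(copy, s, len + 1);  // copies the terminating '\0' too
    return copy;
}
```

Calling `checked_strdup("GATTACA")` either returns a valid copy or terminates the program with a message, rather than crashing later on a NULL dereference.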
1

+1 - Highly recommended. That header supports compressed files, too, which speeds up IO-bound processing. One caveat for C++ though: that header wants char* and FILE*, not C++ strings and iostreams, but that's easy enough to manage.

written 10.9 years ago by Jonathan Manning (640)
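To illustrate the C-to-C++ bridging mentioned in that comment, here is a minimal sketch that reads through a C `FILE*` but hands back `std::string` fields. It does not use kseq.h at all: `FastqRecord`, `read_line` and `read_fastq` are hypothetical names, and the sketch assumes plain four-line FASTQ records (no wrapped sequences, no compression).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical record type using std::string instead of raw char buffers.
struct FastqRecord {
    std::string name, seq, qual;
};

// Read one line from a C FILE* into a std::string, dropping the line ending.
// Returns false at end of file when nothing was read.
static bool read_line(std::FILE* fp, std::string& out) {
    out.clear();
    char buf[4096];
    bool got_any = false;
    while (std::fgets(buf, sizeof buf, fp) != NULL) {
        got_any = true;
        out += buf;  // lines longer than the buffer arrive in several pieces
        if (!out.empty() && out.back() == '\n') {
            out.pop_back();
            if (!out.empty() && out.back() == '\r') out.pop_back();
            break;
        }
    }
    return got_any;
}

// Read the next four-line FASTQ record; returns false at end of file or on a
// malformed record. Multi-line sequences are NOT handled in this sketch.
static bool read_fastq(std::FILE* fp, FastqRecord& rec) {
    assert(fp != NULL);
    std::string header, plus;
    if (!read_line(fp, header) || header.empty() || header[0] != '@') return false;
    if (!read_line(fp, rec.seq)) return false;
    if (!read_line(fp, plus) || plus.empty() || plus[0] != '+') return false;
    if (!read_line(fp, rec.qual)) return false;
    if (rec.qual.size() != rec.seq.size()) return false;
    rec.name = header.substr(1);  // drop the leading '@', keep any description
    return true;
}
```

A caller would open the file with `std::fopen`, loop with `while (read_fastq(fp, rec))`, and close it with `std::fclose`. The per-line `std::string` appends cost something, so this is meant to show the shape of the wrapper, not to benchmark well.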
3
Manuel (400), Germany, wrote 10.4 years ago:

If you work with NGS data and want to try SeqAn as Michael already suggested, have a look at this tutorial for importing read data. Also, their documentation has greatly improved recently.

written 10.4 years ago by Manuel (400)
0
Luiz Irber (0), United States, wrote 6.3 years ago:

Another option is SeqDB

written 6.3 years ago by Luiz Irber (0)
0
cartoonist (80), Germany, wrote 2.6 years ago:

Since I could not find any C++ library that met my requirements, I rewrote the kseq library (by @lh3) in C++ using templates and called it kseq++. I compared its performance with the original kseq and with SeqAn here:

https://github.com/cartoonist/kseqpp

SeqAn uses its own string class. If you do not use that class, converting back to std::string is really expensive (3x slower on my workstation).

modified 13 months ago • written 2.6 years ago by cartoonist (80)