Hello everyone, I wrote a C library named seqio for reading and writing sequences in FASTA and FASTQ format. It is open source on GitHub: https://github.com/dwpeng/seqio/ . If you need to read or write FASTA/FASTQ files, I strongly recommend you give it a try.
#include "seqio.h"
#include <stdio.h>

int
main(int argc, char* argv[])
{
  if (argc == 1) {
    fprintf(stderr, "Usage: %s <in.fasta>\n", argv[0]);
    return 1;
  }
  // Step 1: set open options
  seqioOpenOptions openOptions = {
    .filename = argv[1],
  };
  // Step 2: open the file
  seqioFile* sf = seqioOpen(&openOptions);
  // Step 3: declare a record pointer; seqioRead allocates it on the first call
  seqioRecord* record = NULL;
  // Step 4: read records one by one
  while ((record = seqioRead(sf, record)) != NULL) {
    // Step 5: do something with the record
    printf("name: %s: %lu\n", record->name->data,
           (unsigned long)record->sequence->length);
    // !!! Do not free record,
    // !!! because it will be freed by seqioRead automatically.
  }
  // Step 6: close the file
  seqioClose(sf);
  return 0;
}
/**
 * @brief open a file
 * @param options open options
 * @return seqioFile* file
 */
seqioFile* seqioOpen(seqioOpenOptions* options);

/**
 * @brief read a record
 * @param file
 * @param record
 * @return seqioRecord* record, or NULL at the end of the file
 */
seqioRecord* seqioRead(seqioFile* file, seqioRecord* record);

/**
 * @brief write a fasta record
 * @param sf
 * @param record
 * @param options
 */
void seqioWriteFasta(seqioFile* sf, seqioRecord* record, seqioWriteOptions* options);

/**
 * @brief close a file
 * @param file
 */
void seqioClose(seqioFile* file);
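To illustrate the calling convention of seqioRead (pass the previous record back in, get NULL at the end), here is a minimal, self-contained sketch of that ownership pattern. It is not seqio's actual implementation: `Record` and `read_next` are hypothetical names, and for simplicity it reads lines from an in-memory string rather than from a file.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, for illustration only (not seqio's struct). */
typedef struct {
  char* data;      /* NUL-terminated contents of the current line */
  size_t length;   /* number of characters in data */
  size_t capacity; /* allocated size of data */
} Record;

/*
 * Read the next line from *cursor into rec, reusing rec's buffer.
 * Pass NULL on the first call. At the end of input the record is
 * freed here and NULL is returned, so the caller never calls free().
 */
Record* read_next(const char** cursor, Record* rec)
{
  if (**cursor == '\0') { /* end of input: release the record */
    if (rec != NULL) {
      free(rec->data);
      free(rec);
    }
    return NULL;
  }
  if (rec == NULL) { /* first call: allocate an empty record */
    rec = calloc(1, sizeof *rec);
    if (rec == NULL) return NULL;
  }
  size_t n = strcspn(*cursor, "\n"); /* length of the current line */
  if (n + 1 > rec->capacity) {       /* grow the buffer only when needed */
    char* grown = realloc(rec->data, n + 1);
    if (grown == NULL) {
      free(rec->data);
      free(rec);
      return NULL;
    }
    rec->data = grown;
    rec->capacity = n + 1;
  }
  memcpy(rec->data, *cursor, n);
  rec->data[n] = '\0';
  rec->length = n;
  *cursor += n;
  if (**cursor == '\n') (*cursor)++; /* skip the line terminator */
  return rec;
}
```

The caller's loop then has the same shape as the seqio example: `while ((rec = read_next(&cur, rec)) != NULL) { ... }`, with no cleanup code after the loop.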
typedef struct {
  char* filename;   // filename
  bool isGzipped;   // it will be detected automatically if mode is seqOpenModeRead
  seqOpenMode mode; // default is seqOpenModeRead
} seqioOpenOptions;

typedef struct {
  size_t lineWidth;    // fasta file line width (default: 0, no wrap)
  bool includeComment; // include comment in fasta record (default: true)
  baseCase baseCase;   // base case (default: original)
} seqioWriteOptions;
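To make the lineWidth option concrete, here is a small sketch of what wrapping a sequence at a fixed width means, with 0 meaning no wrapping. This is only an illustration of the idea, not code from seqio; `wrap_sequence` is a hypothetical helper that writes into a caller-supplied buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy seq into out, ending every line with '\n' and starting a new
 * line after each `width` bases. width == 0 writes the sequence on a
 * single line. Returns the number of bytes written (excluding the NUL),
 * or 0 if out is too small or seq is empty.
 */
size_t wrap_sequence(const char* seq, size_t width, char* out, size_t cap)
{
  size_t len = strlen(seq);
  if (width == 0) width = len; /* no wrap: one line holds everything */
  size_t lines = (len == 0) ? 0 : (len + width - 1) / width;
  if (len + lines + 1 > cap) return 0; /* bases + newlines + NUL */

  size_t written = 0;
  for (size_t i = 0; i < len; i += width) {
    size_t chunk = (len - i < width) ? len - i : width;
    memcpy(out + written, seq + i, chunk);
    written += chunk;
    out[written++] = '\n';
  }
  out[written] = '\0';
  return written;
}
```

For example, wrapping the 10 bases ACGTACGTAC at width 4 gives three lines (ACGT, ACGT, AC), while width 0 gives a single line.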
Do you want to say a little bit about _why_ they should try out this library, rather than something like kseq, kseqpp, etc.?
Here are a few reasons why I recommend using it:
1. No macro definition required
You only need to include the header file to use it; no macro definitions are required, as in the example above.
2. No need to manually release memory
seqioRead releases the record's memory automatically, so you don't need to free records yourself, which helps avoid memory leaks.
3. Intuitive API
Using it is like working with a regular file: you only open, read, and close, without needing to know anything about the internal implementation.
4. Supports reading and writing compressed files
Compressed files can be read and written. When reading, you don't need to worry about whether the file is compressed or uncompressed; just set the options.
5. Usable from both C and C++
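On point 4, for anyone curious how a reader can tell a gzip file from plain text without being told: gzip streams start with the two magic bytes 0x1f 0x8b. The sketch below shows that check. I am not claiming this is how seqio does its detection; `looks_gzipped` and `file_looks_gzipped` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* True if the buffer starts with the gzip magic bytes 0x1f 0x8b. */
bool looks_gzipped(const unsigned char* buf, size_t n)
{
  return n >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Peek at the first two bytes of a file; false if it cannot be opened. */
bool file_looks_gzipped(const char* path)
{
  unsigned char head[2];
  FILE* fp = fopen(path, "rb");
  if (fp == NULL) return false;
  size_t n = fread(head, 1, sizeof head, fp);
  fclose(fp);
  return looks_gzipped(head, n);
}
```

A FASTA file starts with '>' and a FASTQ file with '@', so neither is mistaken for gzip by this check.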
I would recommend providing Python bindings.
Great suggestion, I will try to implement it.
Hi, I have implemented the Python library.