Question: Minimal example of using htslib from conda to build a simple C file?
4 months ago by
Click downvote670 wrote:

This tiny program is enough to make gcc fail when trying to build:

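#include <stdio.h>
#include <stdlib.h>
#include <htslib/sam.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s in.bam\n", argv[0]);
        return EXIT_FAILURE;
    }

    samFile *fp_in = hts_open(argv[1], "r");   // open the BAM file
    bam_hdr_t *bamHdr = sam_hdr_read(fp_in);   // read the header
    bam1_t *aln = bam_init1();                 // allocate an alignment record

    // release everything that was allocated above
    bam_destroy1(aln);
    bam_hdr_destroy(bamHdr);
    sam_close(fp_in);
    return EXIT_SUCCESS;
}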

This is my minimal command for trying to build:

gcc test.c /mnt/work/me/software/anaconda/pkgs/htslib-1.9-hc238db4_4/lib/libhts.a  -I /mnt/work/me/software/anaconda/pkgs/htslib-1.9-hc238db4_4/include/

What happens is:

/mnt/work/me/software/anaconda/pkgs/htslib-1.9-hc238db4_4/lib/libhts.a(hts.o): In function `decompress_peek':
hts.c:(.text+0x14a5): undefined reference to `inflateInit2_'
hts.c:(.text+0x14c6): undefined reference to `inflate'
hts.c:(.text+0x14f9): undefined reference to `inflateEnd'
/mnt/work/me/software/anaconda/pkgs/htslib-1.9-hc238db4_4/lib/libhts.a(cram_io.o): In function `itf8_decode_crc':
cram_io.c:(.text+0x18e0): undefined reference to `libdeflate_crc32'
cram_io.c:(.text+0x1938): undefined reference to `libdeflate_crc32'
cram_io.c:(.text+0x19b6): undefined reference to `libdeflate_crc32'
cram_io.c:(.text+0x1a5a): undefined reference to `libdeflate_crc32'
cram_io.c:(.text+0x1b2d): undefined reference to `libdeflate_crc32'

1) What am I missing to get this program to build?

2) And how do I find out where htslib is installed on the end user's system? I control the install step, so I am free to run arbitrary code to find it.

May I ask what the tool ultimately will be doing?

Getting the start, len, strand and reference. It is for a ChIP-Seq caller :)

written 4 months ago by Click downvote670

Is this a thesis project? There are plenty of ChIP-seq callers out there (MACS, FSeq, Homer, etc.), plus approaches that skip this step entirely and use window-based methods for differential binding analysis, such as csaw. What will make your tool superior to the established tools? It should be able to make good use of replicated and input-controlled experiments. I am asking because tool development takes a long time. MACS, the most prominent peak caller, has been actively developed and maintained over many years. Can you comment on these things? (I really do not want to demotivate or bash you, but it is important to have good answers to these questions, which any reviewer would ask you.)

It is a reimplementation of a very popular piece of software. It is already working well; I am just irked that reading BAM takes 3x as long as reading BED.

written 4 months ago by Click downvote670

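I think you need to link against zlib and libdeflate (the -lz and -ldeflate options) and make sure the linker can find them, e.g. with -L. If they are shared libraries, their directory also has to be in your LD_LIBRARY_PATH at run time.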

4 months ago by
John Marshall (1.5k), Glasgow, Scotland, wrote:

You are moving from building existing packaged tools to writing the build infrastructure for your own tools from scratch. The former is something many bioinformaticians need to do (admittedly less than ideally) when they'd really like to be just using the tools; the latter is a software development task and in a different category.

So the HTSlib etc maintainers go to great lengths to make building HTSlib, SAMtools, etc Just Work, but there is less pressure to provide full instructions and Just-Workingness for people writing their own tools that use HTSlib, as there is a greater expectation that such people know what they're doing.

To get this program to build, you are missing -lz from your link command: the inflate* symbols come from zlib. The libdeflate_crc32 errors need something similar, namely -ldeflate, and a static libhts.a may pull in further libraries beyond those two. Unfortunately one of the skills you are going to have to learn in order to write a tool like this is the skill of debugging build issues like this one. There aren't many shortcuts other than experience, and help and advice from a more experienced programmer or sysadmin at your institute who can look over your shoulder at your screen.
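
As a rough sketch of what a complete link line might look like, the script below asks pkg-config for htslib's flags when it can and otherwise falls back to a hand-written list. Several things here are assumptions rather than facts from this thread: that htslib sits in the active conda environment (CONDA_PREFIX) rather than the pkgs/ cache, that its htslib.pc file is installed there, and that the libraries after -lz and -ldeflate are what this particular build depends on. The script only prints the command so it can be inspected first.

```shell
#!/bin/sh
# Assumption: htslib lives in the active conda environment.
PREFIX="${CONDA_PREFIX:-/usr/local}"
PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export PKG_CONFIG_PATH

if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists htslib; then
    # --static also lists the private dependencies of libhts.
    HTS_CFLAGS=$(pkg-config --cflags htslib)
    HTS_LIBS=$(pkg-config --libs --static htslib)
else
    HTS_CFLAGS="-I$PREFIX/include"
    # -lz and -ldeflate resolve the errors quoted in the question.
    # Everything after them is an assumption about this htslib build.
    HTS_LIBS="$PREFIX/lib/libhts.a -L$PREFIX/lib -lz -ldeflate -lbz2 -llzma -lcurl -lcrypto -lpthread -lm"
fi

# Libraries go after the source file, and libhts.a before its dependencies.
CMD="gcc test.c $HTS_CFLAGS $HTS_LIBS -o test"
echo "$CMD"
```

If the printed command looks right, run it as is. Any library that is still reported as an undefined reference needs its own -l option appended in the same way.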

The way to gain that experience starts with googling those error messages to figure out what you need to add to get this to compile and link. And learning about the different stages (compiling, linking, installing) involved in building a C program.

Thanks, will start to read up on it.

