Samtools Mpileup Directional Coverage
3
1
Entering edit mode
12.2 years ago
User 1874 ▴ 10

Hi Guys, I wonder if there is a quick way to use samtools to get direcitonal coverage, i.e., coverage of forward direction and reverse direction. What I can see is mpileup function may generate what I want. In column 4 of the mpileup output, it's total coverage, and in column 5 the string contains some directional information, such as upper and lower case of ACGT/acgt, but the pattern of this string is way too complicate and no help message available. I searched and find some help, but seems it's outdated. The message says dots and commas in the fifth column, which is not I get. My mpileup output 5th column is something like AAAAAAaaaaa-1naAAAA^LAA<<<>>>{} etc (a fabricated example). What I want is pretty simple, for example, chr1 pos refbase totalcoverage forwardcoverage reverse_coverage. Does anybody know how to extract such information out of mpileup output? Thanks!

samtools mpileup coverage • 3.3k views
ADD COMMENT
0
Entering edit mode
12.2 years ago
Ian 6.0k

I may have misunderstood your requirements, but you could try BEDTools genomeCoverageBed, the -strand parameter allows the calculation of coverage from either the + or - strand.

ADD COMMENT
0
Entering edit mode
12.1 years ago
Swbarnes2 ★ 1.6k

If you want to count across the whole .bam, use samtools flagstat

samtools flagstat -cf 16 -F 4

will count how many mapped reads run in the reverse direction.

samtools flagstat -cF 20

will count how many mapped reads run in the forward direction.

The pileup will work too, periods means forward direction, commas mean reverse. There are a few other chracters that indicate that a read starts or ends at that point, you could ignore those.

ADD COMMENT
0
Entering edit mode
12.1 years ago

If I'm not wrong with the macro bam1_strand the following C program should list all the positions with the forward & reverse count:

#include <stdio.h>
#include "sam.h"

// callback for bam_plbuf_init()
static int pileup_func(uint32_t tid, uint32_t pos, int n, const bam_pileup1_t *pl, void *data)
    {
    samfile_t *in=(samfile_t *)data;
    int count[2]={0,0};
    int i=0;
    for( i=0;i< n;++i)
        {
        if(bam1_strand(pl[i].b))
            {
            count[0]++;
            }
        else
            {
            count[1]++;
            }
        }
    printf("%s\t%d\t%d\t%d\n",
        in->header->target_name[tid],
        (pos+1),
        count[0],
        count[1]
        );

    return 0;
    }

int main(int argc, char *argv[])
    {
    samfile_t *in;
    if (argc == 1)
        {
        fprintf(stderr, "Usage: %s <in.bam>\n",argv[0]);
        return 1;
        }

    in = samopen(argv[1], "rb", 0);
    if (in == 0)
        {
        fprintf(stderr, "Fail to open BAM file %s\n", argv[1]);
        return 1;
        }

    sampileup(in, -1, pileup_func,in);
    samclose(in);
    return 0;
    }

compilation:

gcc -Wall -I /path/samtools/ -L /path/samtools/ input.c -lbam -lz

test:

./a.out samtools-0.1.18/examples/ex1.bam
seq1    1   0   1
seq1    2   0   1
seq1    3   0   2
seq1    4   0   2
seq1    5   0   3
seq1    6   0   4
seq1    7   0   4
seq1    8   0   4
seq1    9   0   5
seq1    10  ....
ADD COMMENT

Login before adding your answer.

Traffic: 3683 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6