Question: Per-base coverage to bedgraph or bigwig
0
gravatar for pyKey
7 weeks ago by
pyKey40
pyKey40 wrote:

Hi,

I am trying to turn my customized per-base counts of a bam file into bigwig format. From a bedgraph seems there is already a tool : bedGraphToBigWig. Is there already a tool which turns a per-base coverage report to bedgraph format?

Thank you,

bam coverage bigwig bedtools • 162 views
ADD COMMENTlink modified 7 weeks ago • written 7 weeks ago by pyKey40
2

a simple awk ? awk '{printf("%s\t%d\t%s\t%s\n",$1,int($2)-1,$2,$3);}'

ADD REPLYlink modified 7 weeks ago • written 7 weeks ago by Pierre Lindenbaum118k

customized per-base counts

Which looks like...?

ADD REPLYlink written 7 weeks ago by ATpoint14k

Hey,

The format is the regular per-base coverage:

I   1   0.5
I   2   1.0
I   3   1.0
I   11  3.5
I   12  3.5
I   13  3.5
I   14  4.0
I   18  5.5
I   19  5.5
I   20  6.0
I   21  6.0

Just that I cannot simply get it from samtools depth nor by bedtools genomecov (which would make my life so easy). Coverage here is calculated by investigating reads and their number of multimaps. Anyways thank you for your help. I guess the awk line would do the work.

Cheers,

ADD REPLYlink modified 7 weeks ago • written 7 weeks ago by pyKey40

See Gautier's comment, which I'm moving to an answer, for the correct format.

ADD REPLYlink written 7 weeks ago by Devon Ryan88k

Thanks Devon Ryan,

Gautier Richard's sample looks like what I get from "bedtools genomecov -bg". That I can easily imitate in my existing file. I am not sure though if it is fully compliant with the definition of bedgraph files from bedtools site:

"Whereas the -d option reports an output line describing the observed coverage at each and every position in the genome, the -bg option instead produces genome-wide coverage output in BEDGRAPH format. This is a much more concise representation since consecutive positions with the same coverage are reported as a single output line describing the start and end coordinate of the interval having the coverage level, followed by the coverage level itself."

ADD REPLYlink modified 7 weeks ago • written 7 weeks ago by pyKey40

What Gautier posted is fully compliant with the format.

ADD REPLYlink written 7 weeks ago by Devon Ryan88k

Exactly! My question is that if I construct a file like this:

I   1   1     0.5
I   2   2     1.0
I   3   3     1.0
I   11  11  3.5
I   12  12  3.5
I   13  13  3.5
I   14  14  4.0
I   18  18  5.5
I   19  18  5.5

out of the previous one, would it be correct, or do I need to merge consecutive regions first?

ADD REPLYlink written 7 weeks ago by pyKey40
3
gravatar for Gautier Richard
7 weeks ago by
MPI IE, Freiburg, Germany
Gautier Richard270 wrote:

Per base bedgraph should be of the following format:

Chr1 0 1 0.846 
Chr1 1 2 0.745 
Chr1 2 3 0.433
Chr1 3 4 0.313

Then you can convert it to bigwig easily using bedGraphToBigWig from UCSCTools.

Having your "custom file" should help us to guide you about how to get the bedgraph format.

EDIT: Following the above comments and what your file looks like, consider this file:

per_base_score.txt

I   1   0.5
I   2   1.0
I   3   1.0

I guess that a simple awk should give you what you are looking for, as suggested by Pierre in the comments:

awk '{print $1"\t"$2-1"\t"$2"\t"$3}' per_base_score.txt > per_base_score.bedgraph
bedGraphToBigWig per_base_score.bedgraph chrom.sizes per_base_score.bw

Be sure to get chrom.sizes from (i.e. for yeast, as I guess with you chromosome names): https://hgdownload-test.gi.ucsc.edu/goldenPath/sacCer3/bigZips/sacCer3.chrom.sizes

If you have a custom genome (i.e. non-model organism), you can extract chrom.sizes from a fasta file following the recommendations there: Get chromosome sizes from fasta file

Be sure that your bedgraph / bigwig chromosome names are matching the ones from your other bed files for downstream analyses (chrI vs I).

ADD COMMENTlink modified 7 weeks ago • written 7 weeks ago by Gautier Richard270
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2312 users visited in the last hour