Question: Bed File Of Reads To Wiggle File Of Numbers Of Features At Each Position
1
gravatar for KCC
6.7 years ago by
KCC3.9k
Cambridge, MA
KCC3.9k wrote:

This is probably really simple, but how do you go from a BED file of genomic features to a wiggle file which contains the number of features at a given position on the chromosome? So, every position in the wiggle file gives the total number of features that overlap with that position in the chromosome.

bed wiggle • 2.3k views
ADD COMMENTlink written 6.7 years ago by KCC3.9k
4
gravatar for brentp
6.7 years ago by
brentp22k
Salt Lake City, UT
brentp22k wrote:

In additions to @David's solution, if your data is large, you can get directly to bigwig format by using

bedtools genomecov (genomeCoverageBed) and

then bedGraphToBigWig

ADD COMMENTlink written 6.7 years ago by brentp22k

By the way... is there also a possibility to distinguish between the strands? With my script it's possible, but absolutely not memory efficient for a whole genome, when working with NGS data... but with genomeCoverageBed it's not doable, is it?

ADD REPLYlink written 6.7 years ago by David Langenberger8.4k

David, genomeCoverageBed has a -strand option so you can specify to check reads only from a single strand. Even if it didn't, one could just use awk to filter to a specific strand, then pipe to bedtools.

ADD REPLYlink written 6.7 years ago by brentp22k

ah ok thx, didn't know that. But anyways, I always use a script, since I also want to normalize for multiple hits... I work with short RNAseq and there it happens a lot. And that is not possible with such tools.

ADD REPLYlink written 6.7 years ago by David Langenberger8.4k
3
gravatar for David Langenberger
6.7 years ago by
Deutschland
David Langenberger8.4k wrote:

I think the simplest way is to write a small perl-script...

Copy this to a file called something like bed2wig.pl:

use strict;
use warnings;

my $file = $ARGV[0];
my %hash = ();

open(FILE, "<$file") || die "cannot open $file\n";
while(<FILE>){
    chomp;
    my ($chr, $start, $end, $id, $score, $strand) = split(/\s+/,$_);
    for(my $i=$start; $i<=$end; $i++){
        $hash{$chr}{$i}++;
    }
}
close(FILE);

foreach my $chr (keys %hash){
    print "variableStep chrom=$chr\n";
    foreach my $i (sort {$a<=>$b} keys %{$hash{$chr}}){
        print "$i\t$hash{$chr}{$i}\n";
    }
}

Now call it like ./bed2wig.pl yourFile.bed > yourFile.wig

It's not very memory efficient for big files (because of the hash), but it should give you an idea of how to do it.

ADD COMMENTlink written 6.7 years ago by David Langenberger8.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1627 users visited in the last hour