Question: Bed File Of Reads To Wiggle File Of Numbers Of Features At Each Position
1
gravatar for KCC
8.1 years ago by
KCC4.0k
Cambridge, MA
KCC4.0k wrote:

This is probably really simple, but how do you go from a BED file of genomic features to a wiggle file which contains the number of features at a given position on the chromosome? So, every position in the wiggle file gives the total number of features that overlap with that position in the chromosome.

bed wiggle • 2.7k views
ADD COMMENTlink written 8.1 years ago by KCC4.0k
4
gravatar for brentp
8.1 years ago by
brentp23k
Salt Lake City, UT
brentp23k wrote:

In additions to @David's solution, if your data is large, you can get directly to bigwig format by using

bedtools genomecov (genomeCoverageBed) and

then bedGraphToBigWig

ADD COMMENTlink written 8.1 years ago by brentp23k

By the way... is there also a possibility to distinguish between the strands? With my script it's possible, but absolutely not memory efficient for a whole genome, when working with NGS data... but with genomeCoverageBed it's not doable, is it?

ADD REPLYlink written 8.1 years ago by David Langenberger9.3k

David, genomeCoverageBed has a -strand option so you can specify to check reads only from a single strand. Even if it didn't, one could just use awk to filter to a specific strand, then pipe to bedtools.

ADD REPLYlink written 8.1 years ago by brentp23k

ah ok thx, didn't know that. But anyways, I always use a script, since I also want to normalize for multiple hits... I work with short RNAseq and there it happens a lot. And that is not possible with such tools.

ADD REPLYlink written 8.1 years ago by David Langenberger9.3k
3
gravatar for David Langenberger
8.1 years ago by
Deutschland
David Langenberger9.3k wrote:

I think the simplest way is to write a small perl-script...

Copy this to a file called something like bed2wig.pl:

use strict;
use warnings;

my $file = $ARGV[0];
my %hash = ();

open(FILE, "<$file") || die "cannot open $file\n";
while(<FILE>){
    chomp;
    my ($chr, $start, $end, $id, $score, $strand) = split(/\s+/,$_);
    for(my $i=$start; $i<=$end; $i++){
        $hash{$chr}{$i}++;
    }
}
close(FILE);

foreach my $chr (keys %hash){
    print "variableStep chrom=$chr\n";
    foreach my $i (sort {$a<=>$b} keys %{$hash{$chr}}){
        print "$i\t$hash{$chr}{$i}\n";
    }
}

Now call it like ./bed2wig.pl yourFile.bed > yourFile.wig

It's not very memory efficient for big files (because of the hash), but it should give you an idea of how to do it.

ADD COMMENTlink written 8.1 years ago by David Langenberger9.3k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1458 users visited in the last hour