Perl Script for Bed file preparation for CpGI, CpG Shore, CpG shelf regions
8.7 years ago
Shicheng Guo ★ 9.4k

Dear Guys,

Can anyone share a Perl/R/Python/Java script for preparing BED files of CpG island (CpGI), CpG shore, and CpG shelf regions?

Thanks

Perl Bed CpG • 3.9k views

I wrote a Perl script that takes each CpG island and extends 2000 bp upstream of its start and 2000 bp downstream of its end to get candidate CpG shore regions (55436 CpG shores). [Note: I removed the chrUn, random, and hap regions from the hg19 genome.] I then used bedtools subtract to remove the CpG shores that overlap CpG islands (48790 remained) and merged the CpG shores that overlap one another, which left 46817 CpG shore regions. You can download this annotation: hg19.cpgshoreExt.txt

The same idea gives the CpG shelf regions. 93634 candidate regions were obtained from the CpG shore file; 44844 remained after removing regions that overlap CpG islands, 41934 remained after removing regions that overlap CpG shores, and 40812 remained after merging overlapping regions within the CpG shelf set. hg19.cpgshelfExt.bed can be downloaded.
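For readers who would rather see the whole flank / subtract / merge sequence in one place, here is a minimal Python sketch of the steps described above. It is not the script used to produce the downloadable files. The function names (`merge`, `subtract`, `flanks`, `shores_and_shelves`) are hypothetical, the intervals are `(start, end)` tuples for a single chromosome, and the sketch assumes the default behaviour of bedtools subtract (trim only the overlapping part) and bedtools merge (join overlapping or touching intervals). The post does not say which bedtools options were actually used, so counts may differ.

```python
FLANK = 2000  # flank width in bp, the value used in the post


def merge(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(intervals, blockers):
    """Trim away the parts of each interval covered by any blocker."""
    blockers = merge(blockers)
    kept = []
    for start, end in intervals:
        pos = start
        for b_start, b_end in blockers:
            if b_end <= pos:
                continue  # blocker lies entirely before the remaining piece
            if b_start >= end:
                break  # blocker lies entirely after the interval
            if b_start > pos:
                kept.append((pos, b_start))
            pos = max(pos, b_end)
        if pos < end:
            kept.append((pos, end))
    return kept


def flanks(intervals, width=FLANK):
    """Return the upstream and downstream flank of every interval.

    Starts are clamped at 0; ends are NOT clamped to chromosome length
    (assumption: the caller handles that if it matters).
    """
    out = []
    for start, end in intervals:
        out.append((max(0, start - width), start))
        out.append((end, end + width))
    return [(s, e) for s, e in out if s < e]


def shores_and_shelves(islands, width=FLANK):
    """Derive shore and shelf intervals for one chromosome."""
    shores = merge(subtract(flanks(islands, width), islands))
    shelves = flanks(shores, width)
    shelves = subtract(shelves, islands)  # drop parts overlapping islands
    shelves = subtract(shelves, shores)   # drop parts overlapping shores
    return shores, merge(shelves)


if __name__ == "__main__":
    # Two made-up islands on one chromosome, for illustration only.
    demo_shores, demo_shelves = shores_and_shelves([(10000, 11000), (13000, 13500)])
    print("shores:", demo_shores)
    print("shelves:", demo_shelves)
```

To apply this genome-wide, group the island BED records by chromosome and call `shores_and_shelves` once per chromosome. For real data, bedtools will be faster and is what the post itself used.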

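#!/usr/bin/perl
use strict;
use warnings;

# Create CpG shore regions (2000 bp flanks) from a CpG island BED file

my $CGI = $ARGV[0] or die "Usage: $0 <CpG island BED file>\n";
print "Reference CGI file is $CGI\n";

# Use the low-precedence "or" so that die actually fires when open fails
open(my $in, '<', $CGI) or die "Cannot open $CGI: $!";
open(my $out, '>', "hg19.cpgShoreExt.sort.txt") or die "Cannot open hg19.cpgShoreExt.sort.txt: $!";

while (my $row = <$in>) {
    chomp $row;
    my @line = split /\t/, $row;
    my $start = $line[1] - 2000;
    $start = 0 if $start < 0;    # BED coordinates cannot be negative
    my $end = $line[2] + 2000;   # not clamped to chromosome length
    my $ID1 = "$line[0]:$start-$line[1]";
    my $ID2 = "$line[0]:$line[2]-$end";

    # Upstream and downstream shore records: chrom, start, end, ID
    my $tmp1 = "$line[0]\t$start\t$line[1]\t$ID1\n";
    my $tmp2 = "$line[0]\t$line[2]\t$end\t$ID2\n";

    print $tmp1;
    print $tmp2;

    print $out $tmp1;
    print $out $tmp2;
}

close $in;
close $out;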

Please elaborate on your requirement, and also give us some details on what you've tried.


Check this solution too,
