Question: bedGraphToBigWig error - end coordinate bigger than chr
0
gravatar for varsha619
11 weeks ago by
varsha61920
varsha61920 wrote:

Does anyone know a way to fix the bedGraphToBigWig error - end coordinate bigger than chr? My input is a bedGraph generated using MACS2. This link suggests using bedClip - https://groups.google.com/forum/embed/#!topic/macs-announcement/gXdf115Xy5Q. But I would like to know if there is a command line option to fix it. Thank you for your help.

bedgraphtobigwig macs2 • 211 views
ADD COMMENTlink modified 11 weeks ago by Alex Reynolds20k • written 11 weeks ago by varsha61920

Fixed it with bedClip, thank you for your help!

ADD REPLYlink written 9 weeks ago by varsha61920
1
gravatar for genecats.ucsc
11 weeks ago by
genecats.ucsc340
genecats.ucsc340 wrote:

bedClip is a command line program to do this:

bedClip input.bed http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes output.bed

You can download bedClip from the directory appropriate to your operating system within our directory of utilities.

If you have questions about running bedClip, feel free to send a question to one of our mailing lists:

  • genome@soe.ucsc.edu for general questions
  • genome-mirror@soe.ucsc.edu for questions involving mirrors or gbibs
  • genome-www@soe.ucsc.edu for questions involving private data

ChrisL from the UCSC Genome Browser

ADD COMMENTlink written 11 weeks ago by genecats.ucsc340
0
gravatar for Pierre Lindenbaum
11 weeks ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum98k wrote:

generate a awk script that will clip your bed records:

mysql --user=genome -N --host=genome-mysql.cse.ucsc.edu -A -D hg19  -e 'select chrom,size from chromInfo '  |\
awk '{printf("($1==\"%s\") {L=%d;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf(\"%s\\t%%d\\t%%d\\n\",B,E);next;}\n",$1,$2,$1);}' > script.awk



$ head  script.awk
($1=="chr1") {L=249250621;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr1\t%d\t%d\n",B,E);next;}
($1=="chr2") {L=243199373;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr2\t%d\t%d\n",B,E);next;}
($1=="chr3") {L=198022430;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr3\t%d\t%d\n",B,E);next;}
($1=="chr4") {L=191154276;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr4\t%d\t%d\n",B,E);next;}
($1=="chr5") {L=180915260;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr5\t%d\t%d\n",B,E);next;}
($1=="chr6") {L=171115067;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr6\t%d\t%d\n",B,E);next;}
($1=="chr7") {L=159138663;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr7\t%d\t%d\n",B,E);next;}
($1=="chrX") {L=155270560;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chrX\t%d\t%d\n",B,E);next;}
($1=="chr8") {L=146364022;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr8\t%d\t%d\n",B,E);next;}
($1=="chr9") {L=141213431;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr9\t%d\t%d\n",B,E);next;}

then use this awk script :

awk -f  script.awk input.bed
ADD COMMENTlink written 11 weeks ago by Pierre Lindenbaum98k
0
gravatar for Alex Reynolds
11 weeks ago by
Alex Reynolds20k
Seattle, WA USA
Alex Reynolds20k wrote:

To solve this problem more generically, make a BED file and use that as a mask with BEDOPS bedops --element-of:

$ fetchChromSizes hg38 | awk '{ print $1"\t0\t"$2; }" | sort-bed - > hg38.bed
$ bedops --element-of 1 in.bedGraph hg38.bed > masked.in.bedGraph

Then convert the masked bedGraph file to Wiggle format.

But mainly I'd be concerned about having signal get generated in regions that don't or shouldn't exist. That might point to a potential data problem or code smell, somewhere.

ADD COMMENTlink modified 11 weeks ago • written 11 weeks ago by Alex Reynolds20k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1718 users visited in the last hour