Question: bedGraphToBigWig error - end coordinate bigger than chr
4 months ago, varsha619 wrote:

Does anyone know a way to fix the bedGraphToBigWig error - end coordinate bigger than chr? My input is a bedGraph generated using MACS2. This link suggests using bedClip - https://groups.google.com/forum/embed/#!topic/macs-announcement/gXdf115Xy5Q. But I would like to know if there is a command line option to fix it. Thank you for your help.


Fixed it with bedClip, thank you for your help!

4 months ago, genecats.ucsc wrote:

bedClip is a command line program to do this:

bedClip input.bed http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes output.bed

You can download bedClip from the directory appropriate to your operating system within our directory of utilities.
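
For context, the clipping step usually sits between sorting and conversion. The following is only a sketch of that sequence: the function name clip_and_convert and all file names are hypothetical, it assumes bedClip and bedGraphToBigWig are on your PATH, and it assumes the chromosome sizes are in a local two-column file (for example, one written by fetchChromSizes). The sort step is added here because bedGraphToBigWig expects sorted input.

```shell
# Hypothetical wrapper; the name and all file paths are placeholders.
# Usage: clip_and_convert in.bedGraph chrom.sizes out.bw
clip_and_convert() {
    in_bg=$1; sizes=$2; out_bw=$3
    tmp_dir=$(mktemp -d) || return 1
    # bedGraphToBigWig expects input sorted by chromosome, then by start.
    LC_ALL=C sort -k1,1 -k2,2n "$in_bg" > "$tmp_dir/sorted.bedGraph" &&
    # Remove records that refer to off-chromosome locations.
    bedClip "$tmp_dir/sorted.bedGraph" "$sizes" "$tmp_dir/clipped.bedGraph" &&
    # Convert the cleaned bedGraph to bigWig.
    bedGraphToBigWig "$tmp_dir/clipped.bedGraph" "$sizes" "$out_bw"
    rc=$?
    rm -rf "$tmp_dir"
    return "$rc"
}
```

It could be called as: clip_and_convert treat_pileup.bdg hg38.chrom.sizes treat_pileup.bw (again, placeholder file names).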

If you have questions about running bedClip, feel free to send a question to one of our mailing lists:

  • genome@soe.ucsc.edu for general questions
  • genome-mirror@soe.ucsc.edu for questions involving mirrors or gbibs
  • genome-www@soe.ucsc.edu for questions involving private data

ChrisL from the UCSC Genome Browser

4 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Generate an awk script that will clip your BED records (note that the generated script prints only the chromosome, start, and end columns):

mysql --user=genome -N --host=genome-mysql.cse.ucsc.edu -A -D hg19  -e 'select chrom,size from chromInfo '  |\
awk '{printf("($1==\"%s\") {L=%d;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf(\"%s\\t%%d\\t%%d\\n\",B,E);next;}\n",$1,$2,$1);}' > script.awk



$ head  script.awk
($1=="chr1") {L=249250621;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr1\t%d\t%d\n",B,E);next;}
($1=="chr2") {L=243199373;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr2\t%d\t%d\n",B,E);next;}
($1=="chr3") {L=198022430;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr3\t%d\t%d\n",B,E);next;}
($1=="chr4") {L=191154276;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr4\t%d\t%d\n",B,E);next;}
($1=="chr5") {L=180915260;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr5\t%d\t%d\n",B,E);next;}
($1=="chr6") {L=171115067;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr6\t%d\t%d\n",B,E);next;}
($1=="chr7") {L=159138663;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr7\t%d\t%d\n",B,E);next;}
($1=="chrX") {L=155270560;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chrX\t%d\t%d\n",B,E);next;}
($1=="chr8") {L=146364022;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr8\t%d\t%d\n",B,E);next;}
($1=="chr9") {L=141213431;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr9\t%d\t%d\n",B,E);next;}

Then use this awk script:

awk -f  script.awk input.bed
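
If you would rather not generate a script per assembly, the same clipping can be done in one pass. This is a minimal sketch, not the method above: the function name clip_bedgraph is hypothetical, it assumes a local tab-separated chromosome sizes file (name and size per line), and it truncates overhanging records to the chromosome length. Unlike a three-column printf, it leaves every column after the third untouched, so a bedGraph signal value is preserved.

```shell
# clip_bedgraph CHROM_SIZES BEDGRAPH   (hypothetical helper name)
# Clips each record to [0, chromosome size] and keeps all other columns.
clip_bedgraph() {
    awk 'BEGIN { FS = OFS = "\t" }
         NR == FNR { size[$1] = $2; next }   # first file: chromosome sizes
         !($1 in size) { next }              # drop unknown chromosomes and track lines
         {
             s = $2 + 0; e = $3 + 0; L = size[$1] + 0
             if (s < 0) s = 0                # clip the start at zero
             if (e > L) e = L                # clip the end at the chromosome length
             if (s >= e) next                # nothing left after clipping
             $2 = s; $3 = e
             print
         }' "$1" "$2"
}
```

For example: clip_bedgraph hg19.chrom.sizes input.bedGraph > clipped.bedGraph (placeholder file names).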

4 months ago, Alex Reynolds (Seattle, WA USA) wrote:

To solve this problem more generically, make a BED file and use that as a mask with BEDOPS bedops --element-of:

$ fetchChromSizes hg38 | awk '{ print $1"\t0\t"$2; }' | sort-bed - > hg38.bed
$ bedops --element-of 1 in.bedGraph hg38.bed > masked.in.bedGraph

Then convert the masked bedGraph file to bigWig format with bedGraphToBigWig.

But mainly I'd be concerned about having signal get generated in regions that don't or shouldn't exist. That might point to a potential data problem or code smell, somewhere.
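
One way to look into that before masking anything is to list the offending records and see how far they overhang. The following is a rough sketch under assumptions: the function name report_out_of_bounds is hypothetical, and it expects a local tab-separated chromosome sizes file as its first argument.

```shell
# report_out_of_bounds CHROM_SIZES BEDGRAPH   (hypothetical helper name)
# Prints each suspicious record, prefixed with the reason it was flagged.
report_out_of_bounds() {
    awk 'BEGIN { FS = OFS = "\t" }
         NR == FNR { size[$1] = $2 + 0; next }   # first file: chromosome sizes
         !($1 in size) { print "unknown_chrom", $0; next }
         ($2 + 0) < 0 || ($3 + 0) > size[$1] { print "out_of_bounds", $0 }' "$1" "$2"
}
```

If only a handful of records at chromosome ends are reported, clipping is probably harmless; many flagged records, or unknown chromosome names, may suggest the data were generated against a different assembly than the sizes file.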
