Question: Edit score column in bed file
0
gravatar for fr
17 months ago by
fr130
fr130 wrote:

I have a number of chr regions that I'd like to edit in a bed file. How would you do this in bedtools? My appologies, I feel that I am just lacking the correct term to search for it and get a good answer.

For instance, I currently have a list of chromosome regions which I'd like to set as NA or 0 in a bedgraph file. The list of regions is currently in no specific format (have it in python for now).

Let's say my bedgraph file looks like this:

track name="myfile"  yLineMark="0.0" alwaysZero=on maxHeightPixels=100:75:11 visibility=full viewLimits=-1:1 autoScale=on type=bedGraph
chr1    3000000 3050000 -0.87
chr1    3050000 3100000 -0.87
chr1    3100000 3150000 -0.84
chr1    3150000 3200000 -0.86
chr1    3200000 3250000 -0.89

and for some regions, which I have processed before, I want to convert the scores to NA or 0. For instance, chr1_3050000_31000000 and chr1_3150000_3200000. The end result would be something that you could load as a track but would show those regions as null or blanks (hence my suggestion for NA or 0):

track name="myfile"  yLineMark="0.0" alwaysZero=on maxHeightPixels=100:75:11 visibility=full viewLimits=-1:1 autoScale=on type=bedGraph
chr1    3000000 3050000 -0.87
chr1    3050000 3100000 NA
chr1    3100000 3150000 -0.84
chr1    3150000 3200000 NA
chr1    3200000 3250000 -0.89

Thanks a lot in advance

bed bedtools • 791 views
ADD COMMENTlink modified 17 months ago by Alex Reynolds30k • written 17 months ago by fr130
1

Hello rf ,

please give us an example of input bed file, the regions file and your desired output. Thus it is much clearer for us what you are trying to do.

Thanks.

fin swimmer

ADD REPLYlink written 17 months ago by finswimmer13k

True @finswimmer, I missed that. I've made a few edits, hopefully it will be clearer.

ADD REPLYlink written 17 months ago by fr130

Why set to NA, maybe just exclude, like set difference, something like grep -f file1 file2 ?

ADD REPLYlink written 17 months ago by zx87549.6k
2
gravatar for tshtatland
17 months ago by
tshtatland60
United States
tshtatland60 wrote:

My preferred choice is to write the regions as a bed file from your code (python, etc). Then use standard bioinformatics tools, here bedtools intersect with option -loj (Left outer join. Report features in A with and without overlaps. See: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html#loj-left-outer-join-report-features-in-a-with-and-without-overlaps. Follow up with a little bit of perl, a language commonly used for simple, quick in-line text processing (or python, if you want). Note that I stripped the header for clarity - feel free to add it if needed:

% cat a.bed
chr1    3000000 3050000 -0.87
chr1    3050000 3100000 -0.87
chr1    3100000 3150000 -0.84
chr1    3150000 3200000 -0.86
chr1    3200000 3250000 -0.89
# Your regions with NAs, dumped as bed:
% cat b.bed
chr1    3050000 3100000 NA
chr1    3150000 3200000 NA
% bedtools intersect -a a.bed -b b.bed -loj | \
    perl -F'\t' -lane '
print join "\t", @F[0,1,2],
                 ( $F[-1] eq q{.} ? $F[3] : $F[-1] ); '
chr1    3000000 3050000 -0.87
chr1    3050000 3100000 NA
chr1    3100000 3150000 -0.84
chr1    3150000 3200000 NA
chr1    3200000 3250000 -0.89
ADD COMMENTlink written 17 months ago by tshtatland60
1
gravatar for Alex Reynolds
17 months ago by
Alex Reynolds30k
Seattle, WA USA
Alex Reynolds30k wrote:

The file all-regions.bg contains all regions and scores in sort-bed-sorted bedgraph format.

The file regions-you-want-to-remove.bed3 contains regions-to-be-removed in sort-bed-sorted BED3 format (just the genomic intervals, no scores). These regions should entirely match those in all-regions.bg, but lack score data.

One-liner pipeline:

$ bedops -n 100% all-regions.bg regions-you-want-to-remove.bed3 | bedops -u - <(awk -vFS="\t" -vOFS="\t" '{ print $0, "NA" }' regions-you-want-to-remove.bed3) > answer.bg

The file answer.bg is a sort-bed-sorted bedgraph file containing all regions and a mix of numerical and NA scores.

ADD COMMENTlink written 17 months ago by Alex Reynolds30k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 982 users visited in the last hour