Error using bedtools closestBed
9.6 years ago
chxu02 ▴ 10

a.bed looks like this:

chr1 631977 631979 -2.777777778
chr1 631994 631996 31.5625
......

b.bed looks like this:

chr1 11873 11874 DDX11L1 0 +
chr1 17435 17436 MIR6859-3 0 -
......

After I ran:

closestBed -a a.bed -b b.bed -D b > c.bed

c.bed looks like this:

chr1 631977 631979 -2.777777778 . -1 -1 . -1 . -1
chr1 631994 631996 31.5625 . -1 -1 . -1 . -1
......

Both a.bed and b.bed were formatted using Galaxy before running. Help!


Thanks guys. The spaces were inappropriately added by awk:

cat b.txt | awk '{if($3 == "-") {print $2, "\t", $5-1, "\t", $5, "\t", $3} else {print $2, "\t", $4, "\t", $4+1, "\t", $3} }' > b.bed

There were no spaces in b.txt. Any suggestions?


Can we have a snapshot of the b.txt file? You can edit your answer and add it there.


Don't use commas in the awk print statement. Each comma inserts the output field separator (a space by default), so your script writes both spaces and tabs, but you need only tabs:

cat b.txt | awk '{if($3 == "-") {print $2 "\t" $5-1 "\t" $5 "\t" $3} else {print $2 "\t" $4 "\t" $4+1 "\t" $3} }' > b.bed
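
To make the difference visible, here is a minimal sketch using a one-line stand-in for b.txt. The file name demo_b.txt and the variable names are made up for illustration. It compares the comma version with an alternative that sets OFS to a tab, so the commas themselves produce tabs:

```shell
# One-line stand-in for b.txt (hypothetical demo file).
printf 'NM_000299 chr1 + 201283451 201332993 PKP1\n' > demo_b.txt

# Commas insert OFS (a space by default) around every "\t".
with_commas=$(awk '{print $2, "\t", $4, "\t", $4+1, "\t", $3}' demo_b.txt)

# With OFS set to a tab, the commas emit tabs and nothing else.
with_ofs=$(awk 'BEGIN{OFS="\t"} {print $2, $4, $4+1, $3}' demo_b.txt)

# Print both results with tabs shown as "\t" (sed's l command).
printf '%s\n' "$with_commas" | sed -n l
printf '%s\n' "$with_ofs" | sed -n l
```

In the first line of output every tab is surrounded by spaces; in the second the fields are separated by tabs only.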
9.6 years ago

It gave me this result when I ran your example:

chr1    631977  631979  -2.777777778    chr1    17435   17436   MIR6859-3       0       -       -614542
chr1    631994  631996  31.5625 chr1    17435   17436   MIR6859-3       0       -       -614559

Check that the files are tab-delimited; something is wrong with the formatting. It should work.
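
One way to run that check is to split each file on tabs only and count the columns. This is a rough sketch with two made-up stand-in files (demo_tab.bed and demo_space.bed are hypothetical names); a count of 1 means the line contains no tabs at all:

```shell
# Small stand-in files (hypothetical): one tab-delimited, one space-delimited.
printf 'chr1\t631977\t631979\t-2.777777778\n' > demo_tab.bed
printf 'chr1 11873 11874 DDX11L1 0 +\n' > demo_space.bed

# Split on tabs only and report the number of columns on the first line.
tab_cols=$(awk -F'\t' 'NR==1 {print NF}' demo_tab.bed)
space_cols=$(awk -F'\t' 'NR==1 {print NF}' demo_space.bed)

echo "demo_tab.bed: $tab_cols column(s); demo_space.bed: $space_cols column(s)"
```

On real files, replace the demo names with a.bed and b.bed.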

9.6 years ago
TriS ★ 4.7k

Try 3 things:

1) Instead of:

closestBed -a a.bed -b b.bed -D b > c.bed

use

closestBed -a a.bed -b b.bed -D "b" > c.bed

2) Check that your file is formatted properly (as marina suggested):

head a.bed
head b.bed

If you don't get something like what you posted above, then run:

cat a.bed | tr '\r' '\n' > a_v2.bed  # replaces each carriage return with a newline (for CR-only line endings; for Windows CRLF files use tr -d '\r' instead)

or, if it's not tab-delimited:

cat a.bed | tr ' ' '\t'  # replace ' ' with your actual separator, e.g. ',' (comma)

3) Sort the file

sort -k1,1 -k2,2n a.bed > a_sorted.bed
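
For step 2, head alone will not show carriage returns. A minimal sketch for detecting and removing them, assuming the file has Windows (CRLF) line endings; demo_crlf.bed and demo_clean.bed are made-up names:

```shell
# Stand-in file with Windows (CRLF) line endings (hypothetical demo data).
printf 'chr1\t631977\t631979\t-2.777777778\r\nchr1\t631994\t631996\t31.5625\r\n' > demo_crlf.bed

# Count the lines that end in a carriage return.
cr_lines=$(awk '/\r$/ {n++} END {print n+0}' demo_crlf.bed)

# Delete the carriage returns rather than turning them into extra newlines.
tr -d '\r' < demo_crlf.bed > demo_clean.bed

# Re-check: no carriage returns left, and the line count is unchanged.
cr_after=$(awk '/\r$/ {n++} END {print n+0}' demo_clean.bed)
line_count=$(wc -l < demo_clean.bed | tr -d ' ')

echo "before: $cr_lines line(s) with CR; after: $cr_after; lines kept: $line_count"
```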

--->> EDIT <<---

I just copy/pasted your input, which looks space-delimited, and converted it to tab-delimited:

cat a.bed | tr ' ' '\t' > a2.bed
cat b.bed | tr ' ' '\t' > b2.bed

then used closestBed:

closestBed -a a2.bed -b b2.bed -D "b"

which gave me

chr1    631977    631979    -2.777777778    chr1    17435    17436    MIR6859-3    0    -    -614542
chr1    631994    631996    31.5625    chr1    17435    17436    MIR6859-3    0    -    -614559
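
A caveat on the tr conversion: it turns every single space into a tab, so two spaces in a row leave an empty column. A sketch of an awk alternative that treats any run of whitespace as one separator (file names are hypothetical; it assumes no field contains a space of its own):

```shell
# Stand-in input with a double space between the first two fields (hypothetical).
printf 'chr1  11873 11874 DDX11L1 0 +\n' > demo_spaces.bed

# tr maps each space to a tab, so the double space produces an empty column.
tr_cols=$(tr ' ' '\t' < demo_spaces.bed | awk -F'\t' '{print NF}')

# awk splits on runs of whitespace; assigning $1=$1 rebuilds the line with OFS.
awk 'BEGIN{OFS="\t"} {$1=$1; print}' demo_spaces.bed > demo_tabs.bed
awk_cols=$(awk -F'\t' '{print NF}' demo_tabs.bed)

echo "tr: $tr_cols columns; awk: $awk_cols columns"
```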
9.6 years ago
chxu02 ▴ 10

Thanks. It's solved.

The b.txt looks like

NM_001276352 chr1 - 67092175 67134971 C1orf141
NM_000299 chr1 + 201283451 201332993 PKP1
......

When I ran

cat b.txt | awk '{if($3 == "-") {print $2 "\t" $5-1 "\t" $5 "\t" "\t" "\t" $3} else {print $2 "\t" $4 "\t" $4+1 "\t" "\t" "\t" $3} }' > b.bed

The result was properly printed. But when I ran

cat b.txt | awk '{if($3 == "-") {print $2 "\t" $5-1 "\t" $5 "\t" $6 "\t" $1 "\t" $3} else {print $2 "\t" $4 "\t" $4+1 "\t" $6 "\t" $1 "\t" $3} }' > b.bed

the last two columns were pushed onto a second line for each original row:

chr1 67134970 67134971 C1orf141
NM_001276352 -
chr1 201283451 201283452 PKP1
NM_000299 +
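
One possible explanation, which this thread does not confirm, is that b.txt has Windows line endings. The carriage return then stays attached to the last field ($6), so anything printed after $6 shows up on a new line in viewers that treat a carriage return as a line break. The first command never prints $6, which would fit with its output looking fine. A sketch under that assumption, using a made-up CRLF copy of the two rows above:

```shell
# Hypothetical CRLF version of the two b.txt rows shown above.
printf 'NM_001276352 chr1 - 67092175 67134971 C1orf141\r\nNM_000299 chr1 + 201283451 201332993 PKP1\r\n' > demo_b_crlf.txt

# Strip a trailing carriage return first, then print with tab as OFS.
awk 'BEGIN{OFS="\t"}
     { sub(/\r$/, "") }   # changing $0 makes awk split the fields again
     $3 == "-" { print $2, $5-1, $5, $6, $1, $3; next }
               { print $2, $4, $4+1, $6, $1, $3 }' demo_b_crlf.txt > demo_b.bed

# Keep each output row in a variable so it can be inspected.
first_row=$(sed -n 1p demo_b.bed)
second_row=$(sed -n 2p demo_b.bed)
printf '%s\n%s\n' "$first_row" "$second_row"
```

If the real b.txt has no carriage returns, the sub() call changes nothing and the cause lies elsewhere.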