Question: [Resolved] Extract Top score peaks from a .bed
2.1 years ago by
giroudpaul50
European Union
giroudpaul50 wrote:

Hello,

I am having trouble extracting the top-scoring peaks from my BED file.

I have something like this:

chr1    4483445    4484123    MACS_peak_1    59.19
chr1    4533929    4534670    MACS_peak_2    291.66
chr1    4561472    4562345    MACS_peak_3    492.41
chr1    4661528    4662199    MACS_peak_4    72.40
chr1    4695864    4697136    MACS_peak_5    204.10
chr1    4745953    4746562    MACS_peak_6    420.59

I would like to extract only the peaks whose score (5th column) is above a threshold (e.g. 100).

I have tried bedtools, but it can only sort the peaks by chromosome and then by score, and I cannot find how to extract just the lines with a score above 100.

I managed to extract them in Unix using sort then tail, but afterwards, when I write the output and try to work with the file, I run into a problem because the 5th column is written as a decimal number rather than as an integer.

Is there a better way to do this?

Thank you

2.1 years ago by
giroudpaul50
European Union
giroudpaul50 wrote:

Well, this just worked, thank you for the ultra quick answer.

I wanted to restrict my BED file before submitting it to GREAT, and it told me it would not accept the file because the 5th column was not an integer, so that is the part I did not understand.

Anyway, thank you

Edit: actually, it still doesn't work with GREAT...

Error details:
Line X: The fifth field (score) must be an integer (not 291.66). Input data: 'chr1 4533929 4534670 MACS_peak_2 291.66'


Then you'll need to round:

awk 'BEGIN{OFS="\t"}{if($5>100) {$5 = int($5); print $0}}' input.bed > output.bed

This always rounds down (int() truncates, and the scores here are positive), but you could round differently if needed.
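
If rounding to the nearest integer is preferred over truncation, one option is to add 0.5 before calling int(). This is only a minimal sketch: it assumes all scores are non-negative (as in the sample above), and the file names and inline sample data are there purely for illustration.

```shell
# Sample input copied from the question (tab-separated).
printf 'chr1\t4483445\t4484123\tMACS_peak_1\t59.19\n'  >  input.bed
printf 'chr1\t4533929\t4534670\tMACS_peak_2\t291.66\n' >> input.bed
printf 'chr1\t4561472\t4562345\tMACS_peak_3\t492.41\n' >> input.bed

# Keep peaks scoring above 100 and round the score to the nearest integer.
# int() truncates, so adding 0.5 first rounds non-negative values half up.
awk 'BEGIN{OFS="\t"} $5 > 100 {$5 = int($5 + 0.5); print}' input.bed > output.bed
```

With this sample, 291.66 becomes 292 and 492.41 becomes 492, and the peak scoring 59.19 is dropped.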

written 2.1 years ago by Devon Ryan

This works just fine, thank you!

written 2.1 years ago by giroudpaul
2.1 years ago by
Devon Ryan73k
Freiburg, Germany
Devon Ryan73k wrote:
awk '{if($5>100) print $0}' input.bed > output.bed

The fifth column isn't an integer, so I'm not sure why you're surprised that it's not automagically being made into one.


For brevity, you can also do:

$ awk '$5 > 100' input.bed > output.bed
written 2.1 years ago by Alex Reynolds
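
If the goal is the N best peaks rather than everything above a cutoff, the sort-based approach mentioned in the question can be combined with head. The following is a rough sketch, not a definitive recipe: the variable names min and n, the intermediate file names, and the inline sample data are all assumptions made for illustration.

```shell
# Sample input copied from the question (tab-separated).
printf 'chr1\t4483445\t4484123\tMACS_peak_1\t59.19\n'  >  input.bed
printf 'chr1\t4533929\t4534670\tMACS_peak_2\t291.66\n' >> input.bed
printf 'chr1\t4561472\t4562345\tMACS_peak_3\t492.41\n' >> input.bed
printf 'chr1\t4661528\t4662199\tMACS_peak_4\t72.40\n'  >> input.bed
printf 'chr1\t4695864\t4697136\tMACS_peak_5\t204.10\n' >> input.bed
printf 'chr1\t4745953\t4746562\tMACS_peak_6\t420.59\n' >> input.bed

min=100   # score threshold (hypothetical variable name)
n=2       # number of top peaks to keep (hypothetical variable name)

# Filter by threshold; -v passes the shell value into awk as a variable.
awk -v min="$min" '$5 > min' input.bed > above.bed

# Sort numerically on column 5 only, highest first, and keep the n best.
# LC_ALL=C makes sure "." is treated as the decimal separator.
LC_ALL=C sort -k5,5nr above.bed | head -n "$n" > top.bed
```

Sorting by score discards the coordinate order, so a tool that expects a position-sorted BED file may need the result re-sorted afterwards, for example with sort -k1,1 -k2,2n.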