Counting Features In A Bed File
11.5 years ago
k.nirmalraman ★ 1.1k

I have a file in the following BED format

Chr1 1022071 1022105 +
Chr1 1022071 1022105 +
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -

I am trying to get the count of each distinct feature in this file.

mergeBed -i R5_chr.bed -n -s -d 0 > Output/R5_chr_counts.bed

I am only interested in how many times each feature occurs; I do not want features merged across any number of base pairs. The output should look like this:

Chr1 1022071 1022105 2 +
Chr1 1022072 1022106 4 -

Any suggestions on how to achieve this using bedtools or in bash or awk? Thanks in advance!

11.5 years ago

Based on the example you've given, this should work:

sort R5_chr.bed | uniq -c | awk '{ print $2,$3,$4,$1,$5}' > Output/R5_chr_counts.bed

Giving this output:

Chr1 1022071 1022105 2 +
Chr1 1022072 1022106 4 -

If the BED file is already sorted, so that identical lines sit next to each other, you can omit the initial sort command:

uniq -c R5_chr.bed | awk '{ print $2,$3,$4,$1,$5}' > Output/R5_chr_counts.bed
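
As an alternative sketch, the same counts can be collected in a single awk pass, without sorting first. This is not part of the answer above, just one way it could be done. It assumes the four whitespace-separated columns shown in the question; the temporary file and its contents are made-up sample data. Because it keys on the fields rather than the whole line, stray spaces or trailing whitespace do not split one feature into two, and the output is tab-delimited.

```shell
#!/bin/sh
# Hypothetical sample input mirroring the six records in the question
# (whitespace-separated: chrom, start, end, strand).
input=$(mktemp)
cat > "$input" <<'EOF'
Chr1 1022071 1022105 +
Chr1 1022071 1022105 +
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -
Chr1 1022072 1022106 -
EOF

# Count identical (chrom, start, end, strand) records in one pass.
# awk's for-in order is unspecified, so sort the result afterwards.
counts=$(awk '
    BEGIN { OFS = "\t" }
    { key = $1 "\t" $2 "\t" $3 "\t" $4; n[key]++ }
    END {
        for (key in n) {
            split(key, f, "\t")
            # Same column order as above: chrom, start, end, count, strand
            print f[1], f[2], f[3], n[key], f[4]
        }
    }' "$input" | sort -k1,1 -k2,2n)

printf '%s\n' "$counts"
rm -f "$input"
```

For a real file, replace "$input" with R5_chr.bed and redirect the output as in the commands above.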

Thank you very much!! This worked perfectly for my needs :)

11.5 years ago
zx8754 11k
sort <file> | uniq --count

Find duplicate lines in a file and count how many time each line was duplicated: http://stackoverflow.com/questions/6712437/find-duplicate-lines-in-a-file-and-count-how-many-time-each-line-was-duplicated
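
A small sketch of what this pipeline produces, using made-up sample lines in place of <file>. It uses -c, the short form of --count. The count comes first on each line, padded with leading blanks whose width differs between uniq implementations, so the sketch strips that padding before printing.

```shell
#!/bin/sh
# Hypothetical unsorted sample lines standing in for <file>.
raw=$(printf '%s\n' \
    'Chr1 1022072 1022106 -' \
    'Chr1 1022071 1022105 +' \
    'Chr1 1022072 1022106 -' \
    'Chr1 1022071 1022105 +' \
    'Chr1 1022072 1022106 -' \
    'Chr1 1022072 1022106 -' |
    sort | uniq -c)

# Remove the leading padding that uniq -c puts in front of each count.
tidy=$(printf '%s\n' "$raw" | sed 's/^[[:space:]]*//')
printf '%s\n' "$tidy"
```

Each output line is the count followed by the original line, which is why the accepted answer moves $1 into the fourth column with awk.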
