Question: How to merge rows with the same coordinates in a table?
3 months ago by star (140), Netherlands
star wrote:

I would like to know whether there is a way to merge rows that share the same coordinates in a table.

Table:

chr1    155944562   155945214   fantom_neuron   GSM1554667   9.84447    
chr1    155944562   155945214   fantom_neuron   GSM1554672   7.43630    
chr1    155944562   155945214   fantom_neuron   GSM1554678   32.77627
chr1    155945743   155946196   fantom_neuron   BAMPE        3.87072    
chr1    155945743   155946196   fantom_neuron   GSM1554666  18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939

Expected table:

chr1    155944562   155945214   fantom_neuron   GSM1554667,GSM1554672,GSM1554678     9.84447,7.43630,32.77627   
chr1    155945743   155946196   fantom_neuron   BAMPE,GSM1554666    3.87072,18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939

Bedtools, most probably.

written 3 months ago by WouterDeCoster (39k)
3 months ago by Friederike (4.2k), United States
Friederike wrote:

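Agreeing with Wouter, check out bedtools merge. In this table the sample IDs and scores to collapse are in columns 5 and 6, and column 4 can be kept once with distinct:

bedtools merge -i foo.bed -c 4,5,6 -o distinct,collapse,collapse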
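
Note that bedtools merge combines overlapping intervals as well as identical ones, so the rows at 155945743-155946196 and 155945746-155946202 in the question would end up in a single merged interval. If only identical coordinates should be grouped, a minimal awk sketch along these lines may help. `collapse_identical` is a hypothetical helper name (not a bedtools command), and the sketch assumes whitespace-separated input with the sample ID in column 5 and the score in column 6:

```shell
#!/bin/sh
# Hypothetical helper: collapse columns 5 and 6 for rows whose first four
# columns are identical. Overlapping but non-identical intervals stay on
# separate rows. Reads the files given as arguments, or stdin if none.
collapse_identical() {
    awk 'BEGIN { OFS = "\t" }
    {
        key = $1 OFS $2 OFS $3 OFS $4
        if (!(key in ids)) {            # first time this interval is seen
            order[++n] = key            # remember input order
            ids[key]  = $5
            vals[key] = $6
        } else {                        # append to the existing group
            ids[key]  = ids[key] "," $5
            vals[key] = vals[key] "," $6
        }
    }
    END {
        # print groups in the order they first appeared, tab-separated
        for (i = 1; i <= n; i++)
            print order[i], ids[order[i]], vals[order[i]]
    }' "$@"
}

# Example using the rows from the question.
result=$(collapse_identical <<'EOF'
chr1 155944562 155945214 fantom_neuron GSM1554667 9.84447
chr1 155944562 155945214 fantom_neuron GSM1554672 7.43630
chr1 155944562 155945214 fantom_neuron GSM1554678 32.77627
chr1 155945743 155946196 fantom_neuron BAMPE 3.87072
chr1 155945743 155946196 fantom_neuron GSM1554666 18.14939
chr1 155945746 155946202 fantom_neuron GSM1554655 1.14939
EOF
)
printf '%s\n' "$result"
```

On a real file this would be called as `collapse_identical foo.bed`. Another route that may be worth trying is `bedtools groupby -i foo.bed -g 1,2,3,4 -c 5,6 -o collapse,collapse`, which groups on exact column values and expects the input to be sorted by the grouping columns.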
3 months ago by zx8754 (7.3k), London
zx8754 wrote:

Using R and the data.table package:

library(data.table)

# use fread for your data
# mydata <- fread("myFile.bed")

# example data
mydata <- fread("
chr1    155944562   155945214   fantom_neuron   GSM1554667   9.84447    
chr1    155944562   155945214   fantom_neuron   GSM1554672   7.43630    
chr1    155944562   155945214   fantom_neuron   GSM1554678   32.77627
chr1    155945743   155946196   fantom_neuron   BAMPE        3.87072    
chr1    155945743   155946196   fantom_neuron   GSM1554666  18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939
")

# group by the first four columns and collapse columns 5 and 6 into comma-separated strings
mydata[, lapply(.SD, toString), .SDcols = c(5:6), by = list(V1, V2, V3, V4) ]
#      V1        V2        V3            V4                                 V5                        V6
# 1: chr1 155944562 155945214 fantom_neuron GSM1554667, GSM1554672, GSM1554678 9.84447, 7.4363, 32.77627
# 2: chr1 155945743 155946196 fantom_neuron                  BAMPE, GSM1554666         3.87072, 18.14939
# 3: chr1 155945746 155946202 fantom_neuron                         GSM1554655                   1.14939

or data.table::foverlaps()

modified 3 months ago • written 3 months ago by Friederike (4.2k)

Hmm, foverlaps is for when we have two BED files to merge. If you are thinking of merging the file with itself, feel free to post that as an answer.

written 3 months ago by zx8754 (7.3k)

Yes, you're right! I ignored that detail :)

written 3 months ago by Friederike (4.2k)