Question: Merge the same coordinate in a table?
star wrote:

I would like to know whether there is a way to merge rows that share the same coordinates in a table.

Table:

chr1    155944562   155945214   fantom_neuron   GSM1554667   9.84447    
chr1    155944562   155945214   fantom_neuron   GSM1554672   7.43630    
chr1    155944562   155945214   fantom_neuron   GSM1554678   32.77627
chr1    155945743   155946196   fantom_neuron   BAMPE        3.87072    
chr1    155945743   155946196   fantom_neuron   GSM1554666  18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939

Expected table:

chr1    155944562   155945214   fantom_neuron   GSM1554667,GSM1554672,GSM1554678     9.84447,7.43630,32.77627   
chr1    155945743   155946196   fantom_neuron   BAMPE,GSM1554666    3.87072,18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939

Bedtools, most probably.

written 9 months ago by WouterDeCoster
Friederike wrote:

Agreeing with Wouter, check out bedtools merge:

bedtools merge -i foo.bed -c 4,5,6 -o distinct,collapse,collapse
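
One caveat: bedtools merge combines overlapping intervals, not only identical ones, so in the example the rows at 155945743-155946196 and 155945746-155946202 would end up in a single merged interval. If only rows with exactly the same coordinates should be collapsed, grouping on the first four columns is closer to the expected table. Below is a minimal sketch with plain awk, assuming whitespace-separated input with the six columns from the question; the file names example.bed and merged.bed are made up for illustration.

```shell
#!/bin/sh
# Hypothetical input file reproducing the rows from the question.
cat > example.bed <<'EOF'
chr1 155944562 155945214 fantom_neuron GSM1554667 9.84447
chr1 155944562 155945214 fantom_neuron GSM1554672 7.43630
chr1 155944562 155945214 fantom_neuron GSM1554678 32.77627
chr1 155945743 155946196 fantom_neuron BAMPE 3.87072
chr1 155945743 155946196 fantom_neuron GSM1554666 18.14939
chr1 155945746 155946202 fantom_neuron GSM1554655 1.14939
EOF

# Group rows on exact chrom/start/end/name (columns 1-4) and
# join columns 5 and 6 with commas, keeping first-seen group order.
awk 'BEGIN { OFS = "\t" }
{
    key = $1 OFS $2 OFS $3 OFS $4
    if (!(key in ids)) {
        order[++n] = key          # remember the order in which groups appear
        ids[key]  = $5
        vals[key] = $6
    } else {
        ids[key]  = ids[key] "," $5
        vals[key] = vals[key] "," $6
    }
}
END {
    for (i = 1; i <= n; i++)
        print order[i], ids[order[i]], vals[order[i]]
}' example.bed > merged.bed

cat merged.bed
```

This writes one tab-separated row per distinct coordinate set, so the two nearly identical intervals stay separate. If bedtools is available, something like `bedtools groupby -i foo.bed -g 1,2,3,4 -c 5,6 -o collapse,collapse` should give a similar result, though it expects the input to be sorted by the grouping columns.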
zx8754 wrote:

Using R and the data.table package:

library(data.table)

# use fread for your data
# mydata <- fread("myFile.bed")

# example data
mydata <- fread("
chr1    155944562   155945214   fantom_neuron   GSM1554667   9.84447    
chr1    155944562   155945214   fantom_neuron   GSM1554672   7.43630    
chr1    155944562   155945214   fantom_neuron   GSM1554678   32.77627
chr1    155945743   155946196   fantom_neuron   BAMPE        3.87072    
chr1    155945743   155946196   fantom_neuron   GSM1554666  18.14939
chr1    155945746   155946202   fantom_neuron   GSM1554655  1.14939
")

# then group by the first four columns and collapse columns 5 and 6 with toString
mydata[, lapply(.SD, toString), .SDcols = c(5:6), by = list(V1, V2, V3, V4) ]
#      V1        V2        V3            V4                                 V5                        V6
# 1: chr1 155944562 155945214 fantom_neuron GSM1554667, GSM1554672, GSM1554678 9.84447, 7.4363, 32.77627
# 2: chr1 155945743 155946196 fantom_neuron                  BAMPE, GSM1554666         3.87072, 18.14939
# 3: chr1 155945746 155946202 fantom_neuron                         GSM1554655                   1.14939

or data.table::foverlaps()

written 9 months ago by Friederike

Hmm, foverlaps is for when we have two BED tables to merge. If you are thinking of overlapping the table with itself, feel free to post that as an answer.

written 9 months ago by zx8754

Yes, you're right! I ignored that detail :)

written 9 months ago by Friederike