Question

How To Write Data In A GRanges Object To A BED File

11

11.2 years ago

Ram ▴ 190

Can anybody suggest how to write a GRanges object list to a BED file?

Thanks a lot.

bed • 60k views

ADD COMMENT • link updated 2.5 years ago by Johan Zicola ▴ 70 • written 11.2 years ago by Ram ▴ 190

4

Assuming below that the object gr is your GRanges object:

library(rtracklayer)
export.bed(gr,con='granges.bed')
## or in the case of a GRangesList
export.bed(unlist(gr.list),con='granges.bed')

ADD REPLY • link 2.9 years ago by benformatics 4.1k

0

Since this thread is pretty old, I think GRanges has been updated. For me, I just use

df=data.frame(gr@unlistData)

which gives a data frame:

seqnames       start       end       width       strand
1       chr4       448829       448831     3      +

Your data frame may have more columns; my GRanges doesn't have anything in it other than these columns.

ADD REPLY • link 4.5 years ago by christophersugai • 0
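
The `@unlistData` slot used above belongs to a (compressed) GRangesList rather than to a plain GRanges, so it will not be found on every object. A minimal sketch of the same idea through accessor functions, assuming a small made-up GRangesList (the names `grl`, `a` and `b` are illustrative only):

```r
library(GenomicRanges)

# Hypothetical two-element GRangesList, for illustration only
grl <- GRangesList(
  a = GRanges("chr4", IRanges(start = 448829, end = 448831), strand = "+"),
  b = GRanges("chr4", IRanges(start = 500000, end = 500010), strand = "-")
)

# Flatten to a single GRanges; use.names = FALSE keeps the list names
# from being carried over as row names
flat <- unlist(grl, use.names = FALSE)

# Columns: seqnames, start, end, width, strand (plus any metadata columns)
df <- as.data.frame(flat)
df
```

The starts in `df` are still 1-based here, so they would need the usual shift before being written out as BED.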

Ram · Answer 1 · 2013-12-19

34

11.2 years ago

Devon Ryan 105k

Given a GRanges object:

gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
  ranges = IRanges(1:10, end = 7:16, names = head(letters, 10)),
  strand = Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)))

You can simply:

df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,
  ends=end(gr),
  names=c(rep(".", length(gr))),
  scores=c(rep(".", length(gr))),
  strands=strand(gr))

write.table(df, file="foo.bed", quote=F, sep="\t", row.names=F, col.names=F)

to write that to foo.bed. The only trick is remembering that BED uses 0-based, half-open coordinates, which is why 1 is subtracted from the starts but not from the ends. If you have a GRangesList rather than a GRanges object, just use unlist(gr) in place of gr (things should still be in the same order).

ADD COMMENT • link updated 4.0 years ago by Ram 44k • written 11.2 years ago by Devon Ryan 105k
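
The coordinate shift in the answer above can be checked with plain arithmetic. A small base-R sketch with made-up numbers, assuming a range that covers bases 11 through 20:

```r
# GRanges convention: 1-based, both ends inclusive
gr_start <- 11L
gr_end <- 20L
gr_width <- gr_end - gr_start + 1L

# BED convention: 0-based start, end not included
bed_start <- gr_start - 1L
bed_end <- gr_end
bed_width <- bed_end - bed_start

# Both conventions describe the same number of bases
stopifnot(gr_width == bed_width)
```

Only the start moves; the end keeps the same number because it changes from inclusive to exclusive.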

0

Thank you so much for your reply! I will try it.

ADD REPLY • link 11.2 years ago by Ram ▴ 190

0

Dear dpryan,

I have a GRanges similar to the one mentioned above:

GRanges with 2515 ranges and 6 metadata columns:
       seqnames               ranges strand   |             Conc
          <Rle>            <IRanges>  <Rle>   |        <numeric>
   851     chrI [15059848, 15071787]      *   | 16.4150832178115
  1412   chrIII [  249517,   252803]      *   | 9.93391864180872
  3416     chrX [  108921,   114715]      *   | 14.5870600573661
  2224    chrIV [  851252,   855627]      *   | 10.5743489064907
  1604   chrIII [ 4431526,  4439773]      *   | 11.0537011405054
   ...      ...                  ...    ... ...              ...
  2453    chrIV [ 7011494,  7013670]      *   | 9.54973169373811
  2897     chrV [ 1743061,  1744363]      *   | 8.42611771396342
  3075     chrV [ 8460316,  8461383]      *   | 8.19169221695555
  2163   chrIII [13529231, 13531151]      *   | 9.38284126048039
  2655    chrIV [11863005, 11864250]      *   | 8.41453042457874

But I am finding it difficult to convert this to a BED file. Can you suggest anything?

Thanks a lot for your help!

ADD REPLY • link updated 4.0 years ago by Ram 44k • written 11.1 years ago by Ram ▴ 190

1

The same basic process should work. If you want the Conc values to be stored in the scores column of the BED file, then just replace the scores=... line above with something like scores=elementMetadata(gr)$Conc, (assuming the GRanges is named gr).

ADD REPLY • link 11.1 years ago by Devon Ryan 105k

0

Thanks a lot for your reply. The problem is that there are 2515 ranges, but the result shows only 10 ranges. How is that possible? Thanks

ADD REPLY • link 11.1 years ago by Ram ▴ 190

2

The following works for me:

$ cat foo.txt

name    start    stop    Conc
chrI    15059848    15071787    16.4150832178115
chrIII    249517    252803    9.93391864180872
chrX    108921    114715    14.5870600573661
chrIV    851252    855627    10.5743489064907
chrIII    4431526    4439773    11.0537011405054
chrIV    7011494    7013670    9.54973169373811
chrV    1743061    1744363    8.42611771396342
chrV    8460316    8461383    8.19169221695555
chrIII    13529231    13531151    9.38284126048039
chrIV    11863005    11864250    8.41453042457874
chrIV    11863006    11864251    9.41453042457874
chrIV    11863007    11864252    7.41453042457874

And then in R:

library(GenomicRanges)
d <- read.delim("foo.txt", header=T)
gr <- GRanges(seqnames=Rle(d$name),
    ranges = IRanges(d$start, end=d$stop),
    strand = Rle(strand(c(rep("*", length(d$name))))),
    Conc = d$Conc)

df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,
  ends=end(gr),
  names=c(rep(".", length(gr))),
  scores=elementMetadata(gr)$Conc,
  strands=strand(gr))

write.table(df, file="foo.bed", quote=F, sep="\t", row.names=F, col.names=F)

You might have to convert the "*" strands to "."; I don't recall off-hand what the BED format requires there.

ADD REPLY • link updated 4.0 years ago by Ram 44k • written 11.1 years ago by Devon Ryan 105k
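
On the strand question: the UCSC BED description lists "+", "-" and "." (no strand) as the allowed strand values, so "*" is worth converting. A minimal sketch of that step, assuming a two-range GRanges built from the coordinates above (the "+" strand and the temporary output path are made up for illustration):

```r
library(GenomicRanges)

# Two ranges from the example above; the second strand is set to "+"
# purely for illustration
gr <- GRanges(
  seqnames = c("chrI", "chrIII"),
  ranges = IRanges(start = c(15059848, 249517), end = c(15071787, 252803)),
  strand = c("*", "+"),
  Conc = c(16.4150832178115, 9.93391864180872)
)

# GRanges marks "no strand" as "*", BED marks it as "."
strands <- as.character(strand(gr))
strands[strands == "*"] <- "."

df <- data.frame(
  seqnames = as.character(seqnames(gr)),
  starts = start(gr) - 1L,   # shift to 0-based starts
  ends = end(gr),
  names = ".",               # placeholder name, recycled for every row
  scores = mcols(gr)$Conc,
  strands = strands
)

bed_file <- tempfile(fileext = ".bed")  # temporary path, for illustration
write.table(df, file = bed_file, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)
```

Converting the strand to a character vector first avoids assigning a value that is not one of the existing strand levels.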

0

Today I needed this, googled it and got this hit. What a useful site :-)

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 10.7 years ago by Istvan Albert 102k

3

Ditto! Especially when you need to be able to do this asap because someone's conference is over the weekend and they want some additional graphs!!

Thank you Istvan and Devon for tirelessly helping everybody out!

ADD REPLY • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by Anna S ▴ 520

0

Hi, this is an old thread, but I have a naive question about it that I couldn't find answered anywhere.

If one takes input from UCSC and does:

makeGRangesFromDataFrame(UCSC_table,seqnames.field ="chr",start.field="Start",end.field="End", ignore.strand=T,starts.in.df.are.0based=TRUE,keep.extra.columns=TRUE)->UCSC_table_GR

If one uses starts.in.df.are.0based=TRUE, is it still necessary to use

starts=start(gr)-1

as explained in your comment?

Thank you

ADD REPLY • link 4.5 years ago by docdot • 0
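
As far as the GenomicRanges documentation describes it, `starts.in.df.are.0based=TRUE` converts the starts to 1-based on the way in, so the GRanges itself is still 1-based and the subtraction would still be needed when writing BED. A minimal sketch, assuming a hypothetical one-row table with the same column names as in the question:

```r
library(GenomicRanges)

# Hypothetical one-row UCSC-style table with a 0-based start
UCSC_table <- data.frame(chr = "chr1", Start = 99L, End = 200L)

gr <- makeGRangesFromDataFrame(UCSC_table,
  seqnames.field = "chr", start.field = "Start", end.field = "End",
  ignore.strand = TRUE, starts.in.df.are.0based = TRUE)

start(gr)        # 1-based start held inside the GRanges
start(gr) - 1L   # 0-based start to write back to a BED file
```

The round trip returns the original start, which suggests the two conversions are meant to be used as a pair.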

0

Great tip. I don't know if it is only me, but I found that this GRanges-to-data-frame conversion wrote round numbers such as 1000000 in scientific notation (1e+06), which causes trouble in a BED file. My workaround was to force non-scientific notation when getting the starts and ends variables:

df <- data.frame(seqnames=seqnames(gr), starts=format(start(gr)-1, scientific=F, trim=T), ends=format(end(gr), scientific=F, trim=T))

ADD REPLY • link 2.5 years ago by Johan Zicola ▴ 70
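
One thing to watch with `format()`: by default it pads all values to a common width, which can put leading spaces into the BED fields. A small base-R sketch, with made-up coordinates, of the difference that `trim = TRUE` makes:

```r
starts <- c(5, 999999, 1000000)  # made-up coordinates

# Default: every string is padded to the width of the longest one
padded <- format(starts, scientific = FALSE)

# trim = TRUE drops the padding, leaving plain digit strings
trimmed <- format(starts, scientific = FALSE, trim = TRUE)

nchar(padded)
nchar(trimmed)
```

Comparing the two `nchar()` results shows whether padding crept in before the table is written.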

score 13 · Answer 2 · 2013-12-20

13

11.2 years ago

vj ▴ 520

You can try the rtracklayer package. It gives you options to export in various formats, including the BED format.

ADD COMMENT • link 11.2 years ago by vj ▴ 520

0

In particular the export() function is what the OP is looking for.

ADD REPLY • link 8.4 years ago by Giovanni M Dall'Olio 28k
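
A minimal sketch of that `export()` route, assuming a small made-up GRanges and a temporary output path. It also reads the file back with `import()` to check that the coordinate conversion is handled in both directions:

```r
library(GenomicRanges)
library(rtracklayer)

# Made-up ranges, for illustration only
gr <- GRanges("chr1",
              IRanges(start = c(11, 101), end = c(20, 150)),
              strand = c("+", "-"))

bed_path <- tempfile(fileext = ".bed")  # temporary path, for illustration

# Write as BED; the format is given explicitly rather than inferred
export(gr, bed_path, format = "bed")

# Inspect the raw file, then read it back into a GRanges
bed_lines <- readLines(bed_path)
back <- import(bed_path, format = "bed")
```

With this route there is no manual `start(gr)-1` step, since the package takes care of the 0-based starts when writing and reading.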

0

In case anyone is wondering: given that gr is your GRanges object, you can produce a BED file this easily:

library(rtracklayer)
export.bed(gr,con='granges.bed')

ADD REPLY • link 2.9 years ago by benformatics 4.1k

0

The file='granges.bed' parameter is probably not valid; use con instead:

library(rtracklayer)
export.bed(gr, con = 'granges.bed' )

As of 9th March, 2022

ADD REPLY • link 3.0 years ago by Ahsan • 0

Ram · Answer 3 · 2015-09-18

0

9.4 years ago

Zhilong Jia ★ 2.2k

Get a DataFrame object with mcols(gr) and then write it out.

ref: The GenomicRanges vignette

ADD COMMENT • link updated 5.1 years ago by Ram 44k • written 9.4 years ago by Zhilong Jia ★ 2.2k

0

mcols() gives you the extra (meta) columns, but not the coordinates, which are what's really needed for a BED file. In fact, the minimal BED file representation of a GRanges object doesn't require any of those columns.

ADD REPLY • link 9.4 years ago by Devon Ryan 105k

0

You're right. The result of `mcols()` actually gives me what I need, rather than a BED file. I forgot the author's question. Thank you.

ADD REPLY • link 9.4 years ago by Zhilong Jia ★ 2.2k

0

You may need to subtract 1 from the start coordinates.

ADD REPLY • link 8.4 years ago by Giovanni M Dall'Olio 28k

score 0 · Answer 4 · 2016-09-14

If you only have one metadata column and you would like to keep it, this modification of Devon Ryan's answer works:

df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,
  ends=end(gr),
  names=c(rep(".", length(gr))),
  scores=elementMetadata(gr)[,1],
  strands=strand(gr))

For my data this gives:

  seqnames   starts     ends names scores strands
1     chrY 10515750 10515760     .      1       *
2     chrY 10519610 10519620     .      1       *
3     chrY 10534770 10534780     .      1       *
4     chrY 10540160 10540170     .      1       *
5     chrY 10554860 10554870     .      1       *
6     chrY 10560630 10560640     .      1       *

rpolicastro · Answer 5 · 2021-02-01

0

4.0 years ago

cjgunase ▴ 50

This worked for me.

BiocManager::install("Repitools")

library('Repitools')

df <- annoGR2DF(gr)

ADD COMMENT • link updated 4.0 years ago by rpolicastro 13k • written 4.0 years ago by cjgunase ▴ 50