Question: How To Write Data In A GRanges Object To A BED File
8
7.2 years ago by
Ram160
Germany
Ram160 wrote:

Can anybody suggest how to write a GRanges object (or a list of them) to a BED file?

Thanks a lot.

bed • 36k views
ADD COMMENTlink modified 4 weeks ago by cjgunase30 • written 7.2 years ago by Ram160
3
  1. Open whatever internet search engine you prefer (e.g., google);
  2. Enter: Granges to bed file;
  3. Profit.
ADD REPLYlink modified 7.2 years ago • written 7.2 years ago by PoGibas4.9k
6

Your link currently gives this thread as the first hit. Profit indeed.

ADD REPLYlink modified 2.4 years ago • written 2.4 years ago by eric.kern13210

If it were not for Devon you would have just created an infinite loop.

ADD REPLYlink written 9 months ago by lcordeiro30
2

Assuming below that the object gr is your GRanges object:

library(rtracklayer)
# `con` is the destination argument of rtracklayer's export functions
export.bed(gr, con='granges.bed')
## or in the case of a GRangesList
export.bed(unlist(gr.list), con='granges.bed')
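
One thing worth knowing about the rtracklayer route is that it does the 1-based to 0-based shift for you, so no manual start - 1 is needed. A minimal sketch to check this, assuming a toy one-range GRanges (gr_toy and the temporary file are made up for illustration):

```r
library(GenomicRanges)
library(rtracklayer)

# Toy range covering bases 11..20 (1-based, closed) on the + strand
gr_toy <- GRanges("chr1", IRanges(start = 11, end = 20), strand = "+")

# Write to a temporary BED file
bed_path <- tempfile(fileext = ".bed")
export(gr_toy, con = bed_path, format = "bed")

# Read the raw text back, skipping any track header line if one is present
bed_lines <- readLines(bed_path)
bed_lines <- bed_lines[!grepl("^track", bed_lines)]
bed_fields <- strsplit(bed_lines[1], "\t")[[1]]
bed_fields[2:3]   # start written 0-based ("10"), end unchanged ("20")

# Importing converts back to 1-based coordinates
gr_back <- import(bed_path, format = "bed")
start(gr_back)    # 11
```

So with export() and import() the coordinates round-trip without any arithmetic on your side.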
ADD REPLYlink modified 6 months ago • written 6 months ago by benformatics2.1k

Since this thread is pretty old, I think GRanges has been updated since then. For me, I just use

df=data.frame(gr@unlistData)

which gives a data frame:

seqnames       start       end       width       strand
1       chr4       448829       448831     3      +

Your data frame may have more columns; my GRanges doesn't have anything in it other than these columns.

ADD REPLYlink modified 6 months ago • written 6 months ago by christophersugai0

@unlistData is a slot of GRangesList objects (not a command) and it holds a GRanges object. What you show there is also not BED format: in BED, strand info goes in column 6, and width is (at least by definition) not part of it, although you can of course use additional columns in any BED file. If you simply take df[,1:3] and subtract 1 from the start column, that would then be a minimal BED though.
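
A minimal sketch of that BED3 idea, using unlist() rather than reaching into the slot directly. The GRangesList here (grl_toy) and its coordinates are invented for illustration:

```r
library(GenomicRanges)

# Toy GRangesList with two single-range elements
grl_toy <- GRangesList(
  a = GRanges("chr4", IRanges(start = 448829, end = 448831), strand = "+"),
  b = GRanges("chr5", IRanges(start = 100, end = 200), strand = "-")
)

# unlist() flattens the list into one GRanges
gr_flat <- unlist(grl_toy, use.names = FALSE)

# Minimal BED3: chromosome, 0-based start, end
bed3 <- data.frame(
  chrom = as.character(seqnames(gr_flat)),
  start = start(gr_flat) - 1L,
  end   = end(gr_flat)
)
bed3
```

The resulting data frame can be written out with write.table(..., sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE).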

ADD REPLYlink modified 6 months ago • written 6 months ago by ATpoint46k
25
7.2 years ago by
Devon Ryan98k
Freiburg, Germany
Devon Ryan98k wrote:

Given a GRanges object:

library(GenomicRanges)

gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
  ranges = IRanges(1:10, end = 7:16, names = head(letters, 10)),
  strand = Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)))

You can simply:

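df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,           # BED starts are 0-based
  ends=end(gr),                 # BED ends are exclusive, so the 1-based end is unchanged
  names=rep(".", length(gr)),   # placeholder name
  scores=rep(".", length(gr)),  # placeholder score
  strands=strand(gr))

# Tab-separated, no quotes, no row or column names
write.table(df, file="foo.bed", quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)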

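to write that to foo.bed. The only trick is remembering that BED uses 0-based, half-open coordinates while GRanges is 1-based and closed, which is why 1 is subtracted from the start but not from the end. If you have a GRangesList rather than a GRanges object, just use unlist(gr) in place of gr (things should still be in the same order).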
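
A quick way to convince yourself that the conversion is right is to check that interval lengths are preserved. A small sketch, assuming a made-up two-range object (gr_demo is not from the question):

```r
library(GenomicRanges)

# Two toy ranges: chr1:1-7 and chr2:101-150 (1-based, closed)
gr_demo <- GRanges(c("chr1", "chr2"),
                   IRanges(start = c(1, 101), end = c(7, 150)))

bed_start <- start(gr_demo) - 1L  # 0-based, inclusive
bed_end   <- end(gr_demo)         # exclusive end, numerically equal to the 1-based end

# In BED, length is end - start; it must match width() of the GRanges
stopifnot(all(bed_end - bed_start == width(gr_demo)))
```

If you accidentally subtract 1 from both start and end, every interval silently shifts by one base while keeping its length, so this check alone will not catch that; comparing bed_start to start(gr_demo) - 1 does.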

ADD COMMENTlink modified 4 weeks ago by Ram32k • written 7.2 years ago by Devon Ryan98k

Thank you so much for your reply! I will try it.

ADD REPLYlink written 7.2 years ago by Ram160

Dear dpryan,

I have a similar kind of GRanges to the one mentioned above:

GRanges with 2515 ranges and 6 metadata columns:
       seqnames               ranges strand   |             Conc
          <Rle>            <IRanges>  <Rle>   |        <numeric>
   851     chrI [15059848, 15071787]      *   | 16.4150832178115
  1412   chrIII [  249517,   252803]      *   | 9.93391864180872
  3416     chrX [  108921,   114715]      *   | 14.5870600573661
  2224    chrIV [  851252,   855627]      *   | 10.5743489064907
  1604   chrIII [ 4431526,  4439773]      *   | 11.0537011405054
   ...      ...                  ...    ... ...              ...
  2453    chrIV [ 7011494,  7013670]      *   | 9.54973169373811
  2897     chrV [ 1743061,  1744363]      *   | 8.42611771396342
  3075     chrV [ 8460316,  8461383]      *   | 8.19169221695555
  2163   chrIII [13529231, 13531151]      *   | 9.38284126048039
  2655    chrIV [11863005, 11864250]      *   | 8.41453042457874

But with this I am finding it difficult to convert to a BED file. Can you suggest anything for this?

Thanks a lot for your help!

ADD REPLYlink modified 4 weeks ago by Ram32k • written 7.2 years ago by Ram160
1

The same basic process should work. If you want the Conc values to be stored in the scores column of the BED file, then just replace the scores=... line above with something like scores=elementMetadata(gr)$Conc, (assuming the GRanges is named gr).
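
For what it's worth, the same substitution can be written with mcols(), which is an equivalent accessor for the metadata columns. A sketch, assuming a toy GRanges with a Conc column (coordinates borrowed from the post above, Conc values rounded; gr_conc and bed_df are illustrative names). The last line confirms that there is one BED row per range:

```r
library(GenomicRanges)

# Toy object with a numeric metadata column called Conc
gr_conc <- GRanges(c("chrI", "chrIII", "chrX"),
                   IRanges(start = c(15059848, 249517, 108921),
                           end   = c(15071787, 252803, 114715)),
                   Conc = c(16.4, 9.9, 14.6))

bed_df <- data.frame(
  chrom  = as.character(seqnames(gr_conc)),
  start  = start(gr_conc) - 1L,          # 0-based start
  end    = end(gr_conc),
  name   = ".",                          # recycled for every row
  score  = mcols(gr_conc)$Conc,          # metadata column used as the score
  strand = as.character(strand(gr_conc))
)

# Sanity check: the data frame has exactly one row per range
stopifnot(nrow(bed_df) == length(gr_conc))
```

Note that the UCSC description of BED defines the score as a value between 0 and 1000; many tools accept arbitrary numbers there, but not all of them do.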

ADD REPLYlink modified 7.2 years ago • written 7.2 years ago by Devon Ryan98k

Thanks a lot for your reply. But the problem is that there are 2515 ranges and the result only shows 10 ranges. How is that possible? Thanks

ADD REPLYlink written 7.2 years ago by Ram160
2

The following works for me:

$ cat foo.txt

name    start    stop    Conc
chrI    15059848    15071787    16.4150832178115
chrIII    249517    252803    9.93391864180872
chrX    108921    114715    14.5870600573661
chrIV    851252    855627    10.5743489064907
chrIII    4431526    4439773    11.0537011405054
chrIV    7011494    7013670    9.54973169373811
chrV    1743061    1744363    8.42611771396342
chrV    8460316    8461383    8.19169221695555
chrIII    13529231    13531151    9.38284126048039
chrIV    11863005    11864250    8.41453042457874
chrIV    11863006    11864251    9.41453042457874
chrIV    11863007    11864252    7.41453042457874

And then in R:

library(GenomicRanges)
d <- read.delim("foo.txt", header=TRUE)
gr <- GRanges(seqnames=Rle(d$name),
  ranges=IRanges(d$start, end=d$stop),
  strand=Rle(strand(rep("*", length(d$name)))),
  Conc=d$Conc)

df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,
  ends=end(gr),
  names=rep(".", length(gr)),
  scores=elementMetadata(gr)$Conc,
  strands=strand(gr))
write.table(df, file="foo.bed", quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)

You might have to convert the "*" strands to ".", I don't recall off-hand what the BED format requires there.
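
For reference, the UCSC description of BED lists "+", "-" and "." (no strand) as the allowed strand values, so converting "*" is the safe choice. A minimal sketch of that conversion, assuming a toy GRanges (gr_str is a made-up name):

```r
library(GenomicRanges)

# Three toy ranges, one per strand value
gr_str <- GRanges("chr1",
                  IRanges(start = c(5, 50, 500), width = 10),
                  strand = c("+", "-", "*"))

# strand() returns a factor-Rle; convert to plain character before editing
bed_strand <- as.character(strand(gr_str))
bed_strand[bed_strand == "*"] <- "."
bed_strand
```

The resulting vector can be used directly as the strands column of the data frame.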

ADD REPLYlink modified 4 weeks ago by Ram32k • written 7.2 years ago by Devon Ryan98k

Today I needed this, googled it and got this hit. What a useful site :-)

ADD REPLYlink modified 14 months ago by Ram32k • written 6.8 years ago by Istvan Albert ♦♦ 86k
3

Ditto!  Especially when you need to be able to do this asap because someone's conference is over the weekend and they want some additional graphs!!

Thank you Istvan and Devon for tirelessly helping everybody out!

ADD REPLYlink written 5.9 years ago by Anna S500

Hi, it is an old thread, but I have a naive question about it that I couldn't find answered anywhere.

If one uses an input from UCSC and does:

makeGRangesFromDataFrame(UCSC_table,seqnames.field ="chr",start.field="Start",end.field="End", ignore.strand=T,starts.in.df.are.0based=TRUE,keep.extra.columns=TRUE)->UCSC_table_GR

If one uses starts.in.df.are.0based=TRUE , is it still necessary to use

starts=start(gr)-1

as explained in your comment?

Thank you

ADD REPLYlink written 6 months ago by docdot0
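
On the question above about starts.in.df.are.0based: as far as I understand it, the answer is yes, the subtraction is still needed when writing by hand. That argument only tells makeGRangesFromDataFrame() that the input starts are 0-based, so it adds 1 while building the object; the GRanges itself is always 1-based. A small sketch, assuming a one-row table shaped like the one in the question (ucsc_toy is invented):

```r
library(GenomicRanges)

# Toy UCSC-style table: 0-based start, exclusive end
ucsc_toy <- data.frame(chr = "chr1", Start = 10, End = 20)

gr_ucsc <- makeGRangesFromDataFrame(ucsc_toy,
                                    seqnames.field = "chr",
                                    start.field = "Start",
                                    end.field = "End",
                                    ignore.strand = TRUE,
                                    starts.in.df.are.0based = TRUE)

start(gr_ucsc)      # 11: stored 1-based inside the GRanges
end(gr_ucsc)        # 20: unchanged

# Going back to BED therefore still needs the shift
bed_start <- start(gr_ucsc) - 1L   # 10, matching the original table
```

If the file is written with rtracklayer's export() instead of write.table(), that shift is handled for you.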
9
7.2 years ago by
vj450
UK
vj450 wrote:

You can try the rtracklayer package. It gives you options to export in various formats including the bed format.

ADD COMMENTlink written 7.2 years ago by vj450

In particular the export() function is what the OP is looking for.

ADD REPLYlink written 4.5 years ago by Giovanni M Dall'Olio27k

If anyone is wondering: given gr, your GRanges object, you can produce a BED file this easily:

library(rtracklayer)
# `con` is the destination argument of rtracklayer's export functions
export.bed(gr, con='granges.bed')
ADD REPLYlink written 7 months ago by benformatics2.1k
0
5.5 years ago by
Zhilong Jia1.6k
London
Zhilong Jia1.6k wrote:

Get a DataFrame object with mcols(gr) and then write it out.

ref: The GenomicRanges vignette

ADD COMMENTlink modified 14 months ago by Ram32k • written 5.5 years ago by Zhilong Jia1.6k

mcols() gives you the extra (meta) columns, but not the coordinates, which are what's really needed for a BED file. In fact, the minimal BED file representation of a GRanges object doesn't require any of those columns.
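
To make the difference concrete, here is a small sketch comparing mcols() with as.data.frame() on the whole object, which returns the coordinates followed by the metadata columns. The object gr_meta and its cov column are invented for illustration:

```r
library(GenomicRanges)

# Toy GRanges with one metadata column
gr_meta <- GRanges("chrY",
                   IRanges(start = c(10515751, 10519611), width = 10),
                   cov = c(1, 1))

# Metadata only: no seqnames, start or end here
meta_df <- as.data.frame(mcols(gr_meta))

# Whole object: seqnames, start, end, width, strand, then metadata
full_df <- as.data.frame(gr_meta)

# BED-style columns built from the full data frame
bed_df <- data.frame(chrom = full_df$seqnames,
                     start = full_df$start - 1L,  # 0-based start
                     end   = full_df$end,
                     name  = ".",
                     score = full_df$cov)
bed_df
```

Note that full_df still carries 1-based starts and a width column, so it is not BED as it stands.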

ADD REPLYlink written 5.5 years ago by Devon Ryan98k

You're right. The result of `mcols()` actually gives me what I need, rather than a BED file. I forgot the author's question. Thank you.

ADD REPLYlink written 5.5 years ago by Zhilong Jia1.6k

You may need to subtract 1 from the start coordinates.

ADD REPLYlink written 4.5 years ago by Giovanni M Dall'Olio27k
0
4.5 years ago by
endrebak850
github.com/endrebak
endrebak850 wrote:

If you only have one metadata column and you would like to keep it, this modification of Devon Ryan's answer works:

df <- data.frame(seqnames=seqnames(gr),
  starts=start(gr)-1,
  ends=end(gr),
  names=c(rep(".", length(gr))),
  scores=elementMetadata(gr)[,1],
  strands=strand(gr))

For my data this gives:

  seqnames   starts     ends names scores strands
1     chrY 10515750 10515760     .      1       *
2     chrY 10519610 10519620     .      1       *
3     chrY 10534770 10534780     .      1       *
4     chrY 10540160 10540170     .      1       *
5     chrY 10554860 10554870     .      1       *
6     chrY 10560630 10560640     .      1       *
ADD COMMENTlink written 4.5 years ago by endrebak850
0
4 weeks ago by
cjgunase30
United States
cjgunase30 wrote:

This worked for me.

BiocManager::install("Repitools")

library('Repitools')

df <- annoGR2DF(gr)
ADD COMMENTlink modified 4 weeks ago by rpolicastro4.0k • written 4 weeks ago by cjgunase30