Question: Error using makeGRangesFromDataFrame in an R workflow for ATAC-seq
8 weeks ago by
London
Ethidium Bromide0 wrote:

I am new to R and have recently been working through an ATAC-seq workflow in R.

In Section 5.1, where I call the makeGRangesFromDataFrame function, I get the following error:

> atac_esca.gr <- makeGRangesFromDataFrame(atac_esca,keep.extra.columns = T)
Error in makeGRangesFromDataFrame(atac_esca, keep.extra.columns = T) : could not find function "makeGRangesFromDataFrame"

I had previously installed the GenomicRanges package, so I tried again. No joy. I then attached the package with require(GenomicRanges), and now I receive the following error when I repeat the same call:

> atac_esca.gr <- makeGRangesFromDataFrame(atac_esca,keep.extra.columns = T)
Error in inherits(object, class2) : 'what' must be a character vector

> dput(head(atac_esca,10))
structure(list(seqnames = c("chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1", "chr1", "chr1"), start = c(1290095, 1291115, 
1291753, 1440824, 1630188, 2030218, 2184484, 2185113, 2185905, 
2186860), end = c(1290596, 1291616, 1292254, 1441325, 1630689, 
2030719, 2184985, 2185614, 2186406, 2187361), name = c("ESCA_107", 
"ESCA_108", "ESCA_109", "ESCA_160", "ESCA_179", "ESCA_341", "ESCA_539", 
"ESCA_540", "ESCA_541", "ESCA_542"), score = c(2.46437811814343, 
2.58792851900195, 7.57996223017863, 4.46727398384637, 20.6213237496952, 
15.5725811237533, 19.2854359599157, 11.1907656456091, 22.2148888990001, 
22.8844119795596), annotation = c("3' UTR", "3' UTR", "3' UTR", 
"3' UTR", "3' UTR", "3' UTR", "3' UTR", "3' UTR", "3' UTR", "3' UTR"
), percentGC = c(0.676646706586826, 0.702594810379242, 0.63872255489022, 
0.658682634730539, 0.728542914171657, 0.6187624750499, 0.578842315369261, 
0.604790419161677, 0.63872255489022, 0.439121756487026), percentAT = c(0.323353293413174, 
0.297405189620758, 0.36127744510978, 0.341317365269461, 0.271457085828343, 
0.3812375249501, 0.421157684630739, 0.395209580838323, 0.36127744510978, 
0.560878243512974)), spec = structure(list(cols = list(seqnames = structure(list(), class = c("collector_character", 
"collector")), start = structure(list(), class = c("collector_double", 
"collector")), end = structure(list(), class = c("collector_double", 
"collector")), name = structure(list(), class = c("collector_character", 
"collector")), score = structure(list(), class = c("collector_double", 
"collector")), annotation = structure(list(), class = c("collector_character", 
"collector")), percentGC = structure(list(), class = c("collector_double", 
"collector")), percentAT = structure(list(), class = c("collector_double", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1), class = "col_spec"), row.names = c(NA, 
10L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))

> head (atac_esca)
  seqnames   start     end     name     score annotation percentGC percentAT
1     chr1 1290095 1290596 ESCA_107  2.464378     3' UTR 0.6766467 0.3233533
2     chr1 1291115 1291616 ESCA_108  2.587929     3' UTR 0.7025948 0.2974052
3     chr1 1291753 1292254 ESCA_109  7.579962     3' UTR 0.6387226 0.3612774
4     chr1 1440824 1441325 ESCA_160  4.467274     3' UTR 0.6586826 0.3413174
5     chr1 1630188 1630689 ESCA_179 20.621324     3' UTR 0.7285429 0.2714571
6     chr1 2030218 2030719 ESCA_341 15.572581     3' UTR 0.6187625 0.3812375

> packageVersion("GenomicRanges")
[1] ‘1.38.0’

> class(atac_esca)
[1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"

Can anyone tell me what I am doing wrong here?

[Using R Studio 1.2.5033 - R 3.6.3 - Windows 10]

Thanks in advance,

R.

genomicranges R
8 weeks ago by
ATpoint35k
Germany
ATpoint35k wrote:

Can you post the output of packageVersion("GenomicRanges"), head(atac_esca) and class(atac_esca)? I have never seen this error message, and I use this function almost every day. It could be Windows-related, or your data frame may not be a plain data.frame but a different class, such as one of the dplyr formats. I also suggest never using T/F and always writing TRUE/FALSE to avoid misinterpretation. And yes, you have to load the package before you can call functions from it: GenomicRanges is not a base package, so you need to load it in every new session.
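The two general points in this reply can be illustrated with a short base-R sketch. It is only an illustration, not part of the original workflow, and the package name notARealPackage is a made-up placeholder chosen so that loading is guaranteed to fail.

```r
# T and F are ordinary variables that can be reassigned;
# TRUE and FALSE are reserved words and cannot be.
T <- 0                       # legal, and silently changes what T means
flag_short <- isTRUE(T)      # FALSE, because T is now the number 0
flag_long  <- isTRUE(TRUE)   # TRUE, regardless of any reassignment
rm(T)                        # drop our copy so T resolves to base::T again

# require() returns FALSE (with a warning) when a package cannot be attached,
# whereas library() stops with an error, which is harder to overlook.
# "notARealPackage" is a hypothetical name used only for illustration.
attached <- suppressWarnings(
  require("notARealPackage", character.only = TRUE, quietly = TRUE)
)
lib_result <- tryCatch(
  {
    library("notARealPackage", character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)
```

In a script, calling library(GenomicRanges) at the top of every session makes a missing or unattached package fail early and visibly.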


Thanks for your reply. I have edited the outputs into the question.

written 8 weeks ago by Ethidium Bromide

Can you try again with makeGRangesFromDataFrame(data.frame(atac_esca), keep.extra.columns = TRUE)?

Looks like this is not a plain data frame but a tibble: besides "data.frame", its class vector also contains "spec_tbl_df", "tbl_df" and "tbl".
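A minimal sketch of the suggested conversion follows. The object peaks_tbl is a made-up three-row stand-in for atac_esca, built with structure() so that the example does not depend on readr or tibble. It uses as.data.frame() rather than the data.frame() call above; for this input either should give a plain data frame. The GRanges step only runs if GenomicRanges is installed.

```r
# Hypothetical stand-in for atac_esca, carrying the same class vector
# that class(atac_esca) reports in the question.
peaks_tbl <- structure(
  list(
    seqnames = c("chr1", "chr1", "chr1"),
    start    = c(1290095, 1291115, 1291753),
    end      = c(1290596, 1291616, 1292254),
    name     = c("ESCA_107", "ESCA_108", "ESCA_109")
  ),
  row.names = c(NA, -3L),
  class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame")
)

# as.data.frame() drops the extra tibble classes and leaves a plain data.frame.
peaks_df <- as.data.frame(peaks_tbl)
class(peaks_df)

# Only attempt the GRanges conversion if GenomicRanges is installed.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  peaks_gr <- GenomicRanges::makeGRangesFromDataFrame(
    peaks_df,
    keep.extra.columns = TRUE
  )
  print(peaks_gr)
}
```

Checking class() before and after the conversion is a quick way to confirm that the object handed to makeGRangesFromDataFrame is a plain data frame.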

modified 8 weeks ago • written 8 weeks ago by ATpoint

Thanks!! Well. This certainly produced a different result:

> makeGRangesFromDataFrame(data.frame(atac_esca), keep.extra.columns = TRUE)
GRanges object with 127055 ranges and 5 metadata columns:
           seqnames            ranges strand |        name            score  annotation         percentGC
              <Rle>         <IRanges>  <Rle> | <character>        <numeric> <character>         <numeric>
       [1]     chr1   1290095-1290596      * |    ESCA_107 2.46437811814343      3' UTR 0.676646706586826
       [2]     chr1   1291115-1291616      * |    ESCA_108 2.58792851900195      3' UTR 0.702594810379242
       [3]     chr1   1291753-1292254      * |    ESCA_109 7.57996223017863      3' UTR  0.63872255489022
       [4]     chr1   1440824-1441325      * |    ESCA_160 4.46727398384637      3' UTR 0.658682634730539
       [5]     chr1   1630188-1630689      * |    ESCA_179 20.6213237496952      3' UTR 0.728542914171657
       ...      ...               ...    ... .         ...              ...         ...               ...
  [127051]     chrY 12905438-12905939      * | ESCA_126988 3.60381207684008    Promoter 0.510978043912176
  [127052]     chrY 13479810-13480311      * | ESCA_126990  8.2894261489451    Promoter 0.522954091816367
  [127053]     chrY 19077189-19077690      * | ESCA_126994 2.11891320489898    Promoter 0.620758483033932
  [127054]     chrY 19744492-19744993      * | ESCA_126997 3.99321245552217    Promoter 0.483033932135729
  [127055]     chrY 20575377-20575878      * | ESCA_127001 10.5701261925874    Promoter 0.512974051896208
                   percentAT
                   <numeric>
       [1] 0.323353293413174
       [2] 0.297405189620758
       [3]  0.36127744510978
       [4] 0.341317365269461
       [5] 0.271457085828343
       ...               ...
  [127051] 0.489021956087824
  [127052] 0.477045908183633
  [127053] 0.379241516966068
  [127054] 0.516966067864271
  [127055] 0.487025948103792
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths
modified 8 weeks ago • written 8 weeks ago by Ethidium Bromide

Is that the issue solved, then, EtBr?

modified 8 weeks ago • written 8 weeks ago by Kevin Blighe

Yes, it is. How can I mark this solved for future reference?

modified 8 weeks ago • written 8 weeks ago by Ethidium Bromide

I have moved this thread to an answer, so feel free to accept it.

written 8 weeks ago by Kevin Blighe

Glad to hear it worked.

written 8 weeks ago by ATpoint