Problem using GenomicRanges package
3.0 years ago

Hello,

I am trying to build a GRanges object with the GenomicRanges package after reading a .csv file into RStudio. I believe the code looks at methylation positions across chromosomes. It was provided to me and is as follows:

islandHMM <- read.csv(paste0("C:/Users/Tristanv/Documents/RStudio/model-based-cpg-islands-hg19-chr17.txt"), header=FALSE, sep="\t", stringsAsFactors=FALSE)
islandData <- GRanges(seqnames=Rle(islandHMM[,1]), ranges=IRanges(start=islandHMM[,2], end=islandHMM[,3]), strand=Rle(strand(rep("*",nrow(islandHMM)))))

The csv file reads correctly and looks like the attached (first six rows shown).

[Image: first six rows of the data frame]

The second part of the code concerning GRanges gives an error:

Error in .normargSEW0(start, "start") : 'start' must be a numeric vector (or NULL)

I am not sure how to go about fixing this, as I am fairly new to R.

Thanks in advance for any help.


Please provide the output of dput(head(islandHMM, 20)) so that others have something to reproduce the problem with and write code against.


That gives the following output (this is after the first entry in columns V2 to V7 has been turned into a double, so the first in each column is now NA):

structure(list(V1 = c("chr", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10"), V2 = c(NA, 93098, 94002, 94527, 119652, 122133, 180265, 180865, 242994, 313778, 319183, 321809, 323315, 323625, 324003, 327172, 334493, 346533, 348952, 357295), V3 = c(NA, 93818, 94165, 95302, 120193, 122621, 180720, 182549, 243152, 313905, 319290, 321998, 323465, 323770, 324088, 327614, 334801, 346829, 349176, 357450), V4 = c(NA, 721, 164, 776, 542, 489, 456, 1685, 159, 128, 108, 190, 151, 146, 86, 443, 309, 297, 225, 156), V5 = c(NA, 32, 12, 65, 53, 51, 32, 230, 10, 6, 8, 10, 11, 11, 7, 24, 14, 30, 16, 12), V6 = c(NA, 403, 97, 538, 369, 339, 256, 1263, 74, 64, 68, 113, 92, 87, 60, 248, 171, 210, 105, 106), V7 = c(NA, 0.559, 0.591, 0.693, 0.681, 0.693, 0.561, 0.75, 0.465, 0.5, 0.63, 0.595, 0.609, 0.596, 0.698, 0.56, 0.553, 0.707, 0.467, 0.679), V8 = c("obsExp", "0.572", "0.841", "0.702", "0.866", "0.88", "0.893", "0.984", "1.193", "0.769", "0.75", "0.596", "0.791", "0.991", "0.672", "0.692", "0.594", "0.833", "1.326", "0.667" )), row.names = c(NA, 20L), class = "data.frame")

3.0 years ago
bernatgel ★ 3.4k

Hi Tristan,

The problem is that you tell read.csv that your file has no header (header=FALSE), but it does have one. The header line is therefore read as the first row of data, and because it contains text, every column is read as character. To solve that, simply change header=FALSE to header=TRUE and everything should work.

Another option would be to use toGRanges from the regioneR package:

library(regioneR)
islandHMM <- toGRanges("C:/Users/Tristanv/Documents/RStudio/model-based-cpg-islands-hg19-chr17.txt")

and that should do it too.
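For anyone who wants to see the header effect in isolation, here is a minimal sketch in base R. It assumes a small tab-separated file written to a temporary location. The three coordinate rows are copied from the dput() output earlier in the thread, but the column names "start" and "end" are placeholders chosen for illustration, since the real header names are not shown. The GRanges step only runs if GenomicRanges is installed.

```r
# Write a tiny tab-separated file that has a header row.
# NOTE: "start" and "end" are hypothetical column names, not the real header.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "chr\tstart\tend",
  "chr10\t93098\t93818",
  "chr10\t94002\t94165",
  "chr10\t94527\t95302"
), tmp)

# header = FALSE: the header line becomes row 1, so every column is character
no_header <- read.csv(tmp, header = FALSE, sep = "\t", stringsAsFactors = FALSE)
classes_no_header <- sapply(no_header, class)

# header = TRUE: the header line supplies the column names, coordinates are numeric
with_header <- read.csv(tmp, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
classes_with_header <- sapply(with_header, class)

print(classes_no_header)
print(classes_with_header)

# Optional: build the GRanges object from the correctly typed columns.
# Strand is left out here, so it takes the constructor's default of "*".
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  island_data <- GenomicRanges::GRanges(
    seqnames = with_header$chr,
    ranges = IRanges::IRanges(start = with_header$start, end = with_header$end)
  )
  print(island_data)
}
```

Checking sapply(islandHMM, class) right after read.csv is a quick way to confirm that the start and end columns are numeric before passing them to IRanges.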


Thanks, that seems to have fixed it :)
