Question: reading a bed file in R
A (3.9k) wrote, 5.1 years ago:

Hi,

I have a BED file like the one below:

FLP1    252    1523
RAF1    3271    3816
REP1    1887    3008
REP2    5308    6198
I tried to read the file in R with:

bed <- as.data.frame(read.table("file.bed",header = FALSE, sep="\t",stringsAsFactors=FALSE))
Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  EOF within quoted string
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  number of items read is not a multiple of the number of columns

But when I read the file as CSV, the warnings no longer appear.

How can I read this file as a table, please?

Thank you

spleonard1 (40, United States) wrote, 5.1 years ago:

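Maybe you have a stray quote character (' or ") hidden in there somewhere. By default read.table() treats both as quoting characters, which would explain the "EOF within quoted string" warning. Try adding

quote = ""

to your initial read.table() call, like: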

bed <- as.data.frame(read.table("file.bed",header = FALSE, sep="\t",stringsAsFactors=FALSE, quote=""))
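
To see why quote = "" can help, here is a minimal sketch. It assumes the culprit is a stray apostrophe in one of the names; the temporary file and the name RAF1' are made up for illustration and are not taken from the original file.bed.

```r
# Hypothetical demo file: a BED-like table where one name contains an apostrophe
path <- tempfile(fileext = ".bed")
writeLines(c("FLP1\t252\t1523",
             "RAF1'\t3271\t3816",
             "REP1\t1887\t3008",
             "REP2\t5308\t6198"), path)

# Default quoting (quote = "\"'"): the apostrophe opens a quoted string that
# never closes. Capture the first warning or error message instead of printing it.
default_read <- tryCatch(
  read.table(path, header = FALSE, sep = "\t", stringsAsFactors = FALSE),
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)

# quote = "" switches quote handling off, so lines are split on tabs only
bed <- read.table(path, header = FALSE, sep = "\t",
                  stringsAsFactors = FALSE, quote = "")

# Line numbers that contain a quote character, to locate the offending entries
quoted_lines <- grep("[\"']", readLines(path))
```

Note that read.csv() uses quote = "\"" by default, so an apostrophe is not special there, which may be why reading the file as CSV gave no warning.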

Actually, as.data.frame() is not needed: read.table() will already return a data frame.

> bed <- read.table("file.bed",header = FALSE, sep="\t",stringsAsFactors=FALSE, quote="")

> bed
    V1   V2   V3
1 FLP1  252 1523
2 RAF1 3271 3816
3 REP1 1887 3008
4 REP2 5308 6198

> as.data.frame(bed)
    V1   V2   V3
1 FLP1  252 1523
2 RAF1 3271 3816
3 REP1 1887 3008
4 REP2 5308 6198

> identical(bed, as.data.frame(bed))
[1] TRUE
Reply written 5.1 years ago by Charles Plessy (2.7k)

thank you very much

Reply written 5.1 years ago by A (3.9k)

Thank you so much.

Your suggestion worked and my file was read correctly.

Reply written 5.1 years ago by A (3.9k)
Giovanni M Dall'Olio (27k, London, UK) wrote, 5.1 years ago:

The most versatile option is to use the import function from rtracklayer:

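> # biocLite() has been retired; Bioconductor packages are now installed through BiocManager
> if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
> BiocManager::install("rtracklayer")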
> library(rtracklayer)
> import("example.txt", format="bed")
GRanges object with 9 ranges and 4 metadata columns:
      seqnames                 ranges strand |        name     score
         <Rle>              <IRanges>  <Rle> | <character> <numeric>
  [1]     chr7 [127471197, 127472363]      + |        Pos1         0
  [2]     chr7 [127472364, 127473530]      + |        Pos2         0
  [3]     chr7 [127473531, 127474697]      + |        Pos3         0
  [4]     chr7 [127474698, 127475864]      + |        Pos4         0
  [5]     chr7 [127475865, 127477031]      - |        Neg1         0
  [6]     chr7 [127477032, 127478198]      - |        Neg2         0
  [7]     chr7 [127478199, 127479365]      - |        Neg3         0
  [8]     chr7 [127479366, 127480532]      + |        Pos5         0
  [9]     chr7 [127480533, 127481699]      - |        Neg4         0

The import() function can be used to read most common bioinformatics formats: bed, gtf, wig, bigwig, etc.


thank you

import("footprint.txt", format="bed")
Error: could not find function "import"
Reply written 5.1 years ago by A (3.9k)

Sorry, you have to load the rtracklayer package first, with library(rtracklayer).

Reply written 5.1 years ago by Giovanni M Dall'Olio (27k)

thank you Giovanni

Reply written 5.1 years ago by A (3.9k)

thanks again,

import("footprint.txt", format="bed")
Error in open.connection(con, open) : cannot open the connection
In addition: Warning message:
In open.connection(con, open) :
  cannot open file 'footprint.txt': No such file or directory
Reply written 5.1 years ago by A (3.9k)
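
The "cannot open file" message above usually means R is looking in a different folder than the one holding footprint.txt. Below is a minimal base-R sketch for checking this before calling import(). check_input() is a hypothetical helper name, not part of rtracklayer, and the temporary file only stands in for the real one.

```r
# Hypothetical helper: fail early with a readable message if the file is missing
check_input <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory is ", getwd(), ")")
  }
  normalizePath(path)  # return the absolute path to hand to the reader function
}

# Demo on a temporary stand-in file
demo_path <- tempfile(fileext = ".txt")
writeLines("chr7\t127471196\t127472363", demo_path)
full_path <- check_input(demo_path)

# A missing file now gives a clearer error (caught here so the script continues)
missing_msg <- tryCatch(
  check_input(file.path(tempdir(), "no_such_file.txt")),
  error = function(e) conditionMessage(e)
)
```

With the real file, something like import(check_input("footprint.txt"), format="bed") would then either work or report where R was looking. In the second case, calling setwd() or passing the full path should fix it.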

Hi,

I am trying to use the TEQC package and am getting the following error:

sample <- get.targets("C:/sample.bed", chrcol = 1, startcol = 2, endcol = 3, zerobased = TRUE, skip = 1, header = FALSE)

Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
  solving row 560885: negative widths are not allowed

So I tried import() from rtracklayer (https://support.bioconductor.org/p/38219/), but the import fails with an error:

sample <- import("C:/sample.bed", format="bed")

Error: logical subscript contains NAs

My BED file contains scaffold names in addition to chromosome numbers in the first column, so I think this may be the reason.

However, read.table() works fine.

Please advise how I can import a BED file that includes scaffold names in the first column and use it with the TEQC package.

Reply written 3.7 years ago by bioinfo8120
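
The "negative widths are not allowed" error suggests that at least one row (row 560885 in the message) has an end coordinate smaller than its start, or a missing value. Since read.table() works on the file, one way to look for such rows is sketched below. The small data frame is invented stand-in data, and the column names chr, start and end are assumptions for illustration, not TEQC requirements.

```r
# Invented stand-in for the first three BED columns (not the real sample.bed)
bed <- data.frame(
  chr   = c("chr1", "scaffold_12", "chr2"),
  start = c(100L, 500L, 900L),
  end   = c(200L, 450L, NA),
  stringsAsFactors = FALSE
)

# A row is usable only if both coordinates are present and end is not before start
ok <- !is.na(bed$start) & !is.na(bed$end) & bed$end >= bed$start

bad_rows <- which(!ok)  # row numbers to inspect in the original file
suspect  <- bed[!ok, ]  # the offending rows themselves
clean    <- bed[ok, ]   # rows that can safely be turned into ranges
```

On the real file, bed would come from read.table() with the first three columns renamed. This check only covers the coordinates; whether the scaffold names are also a problem for import() is a separate question.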

Sorry, I am not good at R and always run into errors, and this forum is my only source of help.

Reply written 3.7 years ago by A (3.9k)