Question: Error Import File With "Import" Of Rtracklayer
Kavitha wrote (7.6 years ago):

Hi, this is related to the question found here (http://biostar.stackexchange.com/questions/3414?sort=newest#sort-top). I am trying to import a GFF file, which contains a set of chromosomal positions for all mouse chromosomes, using import("file.gff"). When I use the rtracklayer command

genes = import("file.gff")

names(genes) returns: "chr1" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chrX" "chrY"

How do I make sure that the import doesn't change the chromosomal order? I want the file imported into the dataRange genes in the order

"chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chrX" "chrY"

Thanks in advance.

R next-gen chip-seq sequencing

The names are sorted in lexical order; however, the order of chromosome names is irrelevant for further analysis, e.g. overlap computation in GenomicRanges/IRanges. Put the other way around: if the order in which the names appear makes any difference to your analysis, then there is possibly something wrong with your analysis.
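To illustrate why the stored order should not matter, here is a minimal sketch in base R. It uses a plain named list as a stand-in for the imported object (an assumption: the coordinates are made up, and whether the object returned by import() supports exactly the same indexing depends on its class and on the rtracklayer version). It shows that lookup by name is independent of position, and that a display order can be imposed by indexing with the names in the desired order.

```r
# Hypothetical stand-in for the imported object: a named list with one
# element per chromosome, stored in the lexical order that names() reports.
genes <- list(chr1 = c(100, 200), chr10 = c(50, 75), chr2 = c(300, 400))

# Lookup by name does not depend on where the element sits in the list.
shuffled <- genes[c("chr2", "chr1", "chr10")]
same_chr2 <- identical(genes$chr2, shuffled$chr2)
print(same_chr2)

# If a particular order is wanted for display, index by the names in that order.
wanted <- c("chr1", "chr2", "chr10")
genes_ordered <- genes[wanted]
print(names(genes_ordered))
```

Any per-chromosome computation that selects elements by name, as in genes$chr2, gives the same result before and after such a reordering.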

Reply written 7.6 years ago by Michael Dondrup
Neilfws (Sydney, Australia) wrote (7.6 years ago):

I don't think the order in which the data are imported has any bearing on your problem.

First, I assume that rtracklayer imports entries in the order in which they appear in the GFF file.

Second, I think you are confused by the output of names(genes). In fact, the chromosomes are sorted in that output. However, the sorting is by the ASCII value of the characters in the strings, not by chromosome number; hence "chr10" comes after "chr1" but before "chr11". This is quite a normal way to sort strings and tells you nothing about how the data are stored.
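The difference between the two orderings can be reproduced without rtracklayer. The following is a small sketch in base R only; the variable names are illustrative, and sort(..., method = "radix") is used because it compares character strings in the C locale, i.e. by byte value.

```r
# Chromosome names in the order the question asks for.
wanted <- c(paste0("chr", 1:19), "chrX", "chrY")

# Plain string sort: "chr10" lands between "chr1" and "chr11",
# and "chr2" only appears after "chr19".
lexical <- sort(wanted, method = "radix")
print(lexical)

# Natural order: numeric chromosomes sorted by number, then the others.
suffix <- sub("^chr", "", lexical)
is_num <- grepl("^[0-9]+$", suffix)
natural <- c(
  lexical[is_num][order(as.integer(suffix[is_num]))],
  sort(lexical[!is_num], method = "radix")
)
print(natural)
```

The first vector matches the names(genes) output in the question; the second matches the order the poster expected.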


Hi, well, I thought so too at first. But then, to check whether this is what was happening, I extracted, say, genes$chr2 from the .gff file and the reads I had from the BAM file for chr2, and computed their overlaps. It gave me the correct overlap.

Reply written 7.6 years ago by Kavitha

I checked whether the GFF file was imported in that order (using the command spaces(genes)), and it was. The file.gff itself has the correct order.

Reply written 7.6 years ago by Kavitha

OK, so we're agreed that any issues you have with rtracklayer don't relate to chromosome order? I don't really understand your problem.

Reply written 7.6 years ago by Neilfws