Error Importing File With "import" Of rtracklayer
10.3 years ago
ska007 • 0

Hi, this is related to the question found here (http://biostar.stackexchange.com/questions/3414?sort=newest#sort-top). I am trying to import a GFF file using import("file.gff"); the file contains a set of chromosomal positions for all chromosomes in mouse. When I use the rtracklayer command below, the names come back in this order:

library(rtracklayer)
genes = import("file.gff")
names(genes)
# "chr1" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chrX" "chrY"


How do I make sure that the import doesn't change the chromosomal order? I want the chromosomes imported into the dataRange genes in the order

"chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chrX" "chrY"


chip-seq next-gen sequencing r • 2.4k views

The names are sorted in lexical order; however, the order of chromosome names is irrelevant for further analysis, e.g. overlap computation in GenomicRanges/IRanges. Conversely, if the order in which the names appear makes any difference to your analysis, then there is possibly something wrong with your analysis.

10.3 years ago
Neilfws 49k

I don't think the order in which the data are imported has any bearing on your problem.

First, I assume that rtracklayer imports entries in the order in which they appear in the GFF file.

Second, I think you are confused by the output of names(genes). In fact, the chromosomes are sorted in that output. However, the sorting is by the ASCII value of the characters in the strings, not by chromosome number - hence "chr10" comes after "chr1" but before "chr11". This is a quite normal way to sort strings and tells you nothing about how the data are stored.
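To illustrate the sorting point, here is a minimal base-R sketch (no rtracklayer needed). It sorts the mouse chromosome names as plain strings, then reorders them numerically. The helper `natural_chrom_order` is a hypothetical name made up for this example, not a function from rtracklayer or any Bioconductor package, and it assumes every name starts with "chr".

```r
# Chromosome names in the order the poster wants (mouse: chr1..chr19, chrX, chrY).
chroms <- c(paste0("chr", 1:19), "chrX", "chrY")

# Plain string sort; method = "radix" compares characters in the C locale,
# i.e. by byte value, so "chr10" lands between "chr1" and "chr11".
lexical <- sort(chroms, method = "radix")
print(lexical)

# Hypothetical helper (not part of rtracklayer): returns the permutation that
# puts numbered chromosomes first in numeric order, then the rest (X, Y)
# in alphabetical order.
natural_chrom_order <- function(x) {
  label <- sub("^chr", "", x)                  # drop the "chr" prefix
  num <- suppressWarnings(as.integer(label))   # NA for non-numeric labels
  order(is.na(num), num, label, method = "radix")
}

natural <- lexical[natural_chrom_order(lexical)]
print(natural)
```

If the imported object can be subset by name like a list, something along the lines of `genes[natural_chrom_order(names(genes))]` may reorder it for display, but that is an assumption about the object type returned by import() and is untested here.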


Hi, well, I thought so too at first. But then, to check whether this is what was happening, I extracted, say, genes$chr2 from the .gff file and the reads I had from the bam file for chr2, and I computed their overlaps. It gave me the correct overlap.


I checked whether the gff file was imported in that order (using the command spaces(genes)), and it was. The file.gff itself has the correct order.


OK, so we're agreed that any issues you have with rtracklayer don't relate to chromosome order? I don't really understand your problem.