Interpreting liftOver output to extract orthologous exons between distantly related species.
0
0
Entering edit mode
5.1 years ago
fmerkal ▴ 60

Hello everyone,

I am trying to use RNAseq data to analyze differential exon usage between two distantly related species (mouse vs chick). The problem I have encountered has to do with extracting orthologous exons between mouse and chick. When comparing the splicing profiles between these two species, I want to be sure that I am only comparing orthologous exons since the exon composition of, even, orthologous genes can differ.

I tried using the liftOver tool, in R, through the package rtracklayer to achieve this. I essentially used the galGal6 (chicken genome) BED and galGal6ToMm10.over.chain files, from UCSC, as the input for the liftOver function. The code I used is written below.

library(rtracklayer)
library(GenomicRanges)
library(readr)
gg6 <- read_delim("gg6.bed", "\t", escape_double = FALSE, col_names = FALSE, trim_ws = TRUE)
colnames(gg6) [1:12] <-c("chrom","start","end","name","score","strand","thickStart","thickEnd","itemRgb","blockCount","blockSizes","blockStarts")
grObject<-GRanges(as.data.frame(gg6))
galGal6ToMm10_over <- import.chain("galGal6ToMm10.over.chain")
results <- as.data.frame(liftOver(grObject, galGal6ToMm10_over))

I am having trouble interpreting the results output. I can see that the gene ids are repeated many times. At first I thought that they correspond to the exons of that gene but after looking closer it seems like they are repeated hundreds of times. I understand that for every input element there might be multiple ranges mapped but I am not sure how to know exactly which exons are orthologous.

The results output looks like

    group   group_name   seqnames   start   end   width   strand   name   score   thickStart   thickEnd   itemRgb   blockCount   blockSizes   blockStarts
1   NA   chr1   38170035   38170046   12   -   XM_015277791.2   0   133973566   134293920   0   21   1.018222e+55 0,11476,11830,14835,17031,19160,21996,33945,36262,39713,42259,50915,80028,95797,126086,131334,137738,180905,277249,282605,330361,

Have you guys ever dealt with liftOver files? Any help would be greatly appreciated :)

rna-seq R genome • 964 views
ADD COMMENT

Login before adding your answer.

Traffic: 2863 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6