Question: Interpreting liftOver output to extract orthologous exons between distantly related species.
0
gravatar for fmerkal
19 months ago by
fmerkal30
USA/Lowell/University of Massachusetts Lowell
fmerkal30 wrote:

Hello everyone,

I am trying to use RNAseq data to analyze differential exon usage between two distantly related species (mouse vs chick). The problem I have encountered has to do with extracting orthologous exons between mouse and chick. When comparing the splicing profiles between these two species, I want to be sure that I am only comparing orthologous exons since the exon composition of, even, orthologous genes can differ.

I tried using the liftOver tool, in R, through the package rtracklayer to achieve this. I essentially used the galGal6 (chicken genome) BED and galGal6ToMm10.over.chain files, from UCSC, as the input for the liftOver function. The code I used is written below.

library(rtracklayer)
library(GenomicRanges)
library(readr)
gg6 <- read_delim("gg6.bed", "\t", escape_double = FALSE, col_names = FALSE, trim_ws = TRUE)
colnames(gg6) [1:12] <-c("chrom","start","end","name","score","strand","thickStart","thickEnd","itemRgb","blockCount","blockSizes","blockStarts")
grObject<-GRanges(as.data.frame(gg6))
galGal6ToMm10_over <- import.chain("galGal6ToMm10.over.chain")
results <- as.data.frame(liftOver(grObject, galGal6ToMm10_over))

I am having trouble interpreting the results output. I can see that the gene ids are repeated many times. At first I thought that they correspond to the exons of that gene but after looking closer it seems like they are repeated hundreds of times. I understand that for every input element there might be multiple ranges mapped but I am not sure how to know exactly which exons are orthologous.

The results output looks like

    group   group_name   seqnames   start   end   width   strand   name   score   thickStart   thickEnd   itemRgb   blockCount   blockSizes   blockStarts
1   NA   chr1   38170035   38170046   12   -   XM_015277791.2   0   133973566   134293920   0   21   1.018222e+55 0,11476,11830,14835,17031,19160,21996,33945,36262,39713,42259,50915,80028,95797,126086,131334,137738,180905,277249,282605,330361,

Have you guys ever dealt with liftOver files? Any help would be greatly appreciated :)

rna-seq R genome • 360 views
ADD COMMENTlink modified 15 months ago • written 19 months ago by fmerkal30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1732 users visited in the last hour