Is my use of mapfromtranscripts correct in GenomicFeatures package? converting cDNA coordinates to genomic coordinates with GenomicFeatures
Entering edit mode
14 months ago
anon • 0

Hello, I am trying to convert cDNA coordinates to 3'UTR coordinates using the mapFromTranscripts function from the GenomicFeatures package.

I've defined my ranges and transcript ID:

txcoords_test<- IRanges(start=c(2306),end=c(2329), width=c(24), 
gr_testcoords<-GRanges(seqnames="ENST00000370115", ranges=txcoords_test, strand = "*")

My all_utr3 is a GRanges object from threeUTRsByTranscript (txdb) modified to contain the tx_id, tx_name and gene_id.

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene #hg19 txdb works as well as has apparently been updated to h38

#try on transcripts 
txdb_transcripts<- transcripts(txdb)
#remove the versions and make names 
mcols(txdb_transcripts)$tx_name<- gsub('.{2}$', '', txdb_transcripts$tx_name) 

#use mapfrom on transcripts
mapFromTranscripts(gr_testcoords, txdb_transcripts)#we get the genomic coords for the cds

#coords for the 3'UTRs
ucsc_utr3<- threeUTRsByTranscript(txdb)

#remove the versions at the end of tx_id and #add names to all_utr3
all_utr3<- unlist(ucsc_utr3, use.names=FALSE)

mcols(all_utr3)$tx_id<-rep(as.integer(names(ucsc_utr3)), lengths(ucsc_utr3))

#all_utr3 doesnt have the transcript information
#add the tx_id, tx_name metadata column
tx2gene<- mcols(transcripts(txdb, columns=c("tx_id","tx_name","gene_id")))

#add tx_name and to all_utr3 by matching tx_id
m<- match(mcols(all_utr3)$tx_id, tx2gene$tx_id)
mcols(all_utr3)<- cbind(mcols(all_utr3),tx2gene[m,-1L, drop=FALSE])

#modify the transcript versions (no version since my transcript names dont have any)
mcols(all_utr3)$tx_name<- gsub('.{2}$', '', all_utr3$tx_name) 

#set the names to the tx names

When I run the mapFromTranscripts, I get back an empty GRanges object.

mapFromTranscripts(gr_testcoords, all_utr3)

However, on subsetting all_utr3 for my transcript ID, the 3'UTR coordinates are present. I'm not sure what I'm missing that will allow my coordinates to map onto the 3'UTR coordinates in all_utr3.

subset(all_utr3, names(all_utr3)=="ENST00000370115")

What I'm expecting for ENST00000370115 is a range within the 3'UTR: 100738869-100738892. (from BioMart, the full 3'UTR is between 100738284-100739038). I'm not sure what I'm missing that will allow my cDNA coordinates to map onto the 3'UTR coordinates in all_utr3.

bioconductor genomicfeatures r rna22 • 325 views

Login before adding your answer.

Traffic: 1664 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6