Homology of 3' UTRs
4.0 years ago

I am looking to identify the mouse homologs of human UTRs and measure the sequence conservation between the two.

Ensembl lists the homologous protein relationships, but not the transcripts.

The best I can think of at the moment is:

  1. Identify protein homology pairs.
  2. Identify the mouse and human transcripts encoding each of the pairs.
  3. Extract the UTR sequences.
  4. Do pairwise alignment to assess the match.

Can anyone think of a better idea? This sounds like a lot of work to do genome-wide.
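
For step 4, a minimal sketch of a pairwise global alignment (Needleman-Wunsch) that reports percent identity, written in base R so it needs no extra packages. The function name `percent_identity` and the match/mismatch/gap scores are illustrative assumptions, not tuned values, and the quadratic loop is only practical for short sequences; for genome-wide work a dedicated aligner would be the more sensible choice.

```r
# Global (Needleman-Wunsch) alignment of two nucleotide strings.
# Returns percent identity over all aligned columns, gaps included.
# The scoring values are illustrative assumptions.
percent_identity <- function(a, b, match = 1, mismatch = -1, gap = -2) {
  x <- strsplit(toupper(a), "")[[1]]
  y <- strsplit(toupper(b), "")[[1]]
  n <- length(x)
  m <- length(y)

  # Score matrix; the first row and column hold pure-gap prefixes.
  S <- matrix(0, n + 1, m + 1)
  S[, 1] <- gap * (0:n)
  S[1, ] <- gap * (0:m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      diag_score <- S[i, j] + (if (x[i] == y[j]) match else mismatch)
      S[i + 1, j + 1] <- max(diag_score, S[i, j + 1] + gap, S[i + 1, j] + gap)
    }
  }

  # Trace back from the bottom-right corner, counting aligned columns
  # and the columns where both sequences carry the same base.
  i <- n
  j <- m
  cols <- 0
  same <- 0
  while (i > 0 || j > 0) {
    if (i > 0 && j > 0 &&
        S[i + 1, j + 1] == S[i, j] + (if (x[i] == y[j]) match else mismatch)) {
      same <- same + (x[i] == y[j])
      i <- i - 1
      j <- j - 1
    } else if (i > 0 && S[i + 1, j + 1] == S[i, j + 1] + gap) {
      i <- i - 1   # gap in the second sequence
    } else {
      j <- j - 1   # gap in the first sequence
    }
    cols <- cols + 1
  }
  100 * same / cols
}

identical_pair <- percent_identity("ACGT", "ACGT")  # 100
one_mismatch   <- percent_identity("ACGT", "ACGA")  # 75
```

Identity is computed over alignment columns here, so gaps lower the value; dividing by the shorter sequence length instead is an equally common convention.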

UTR homology • 1.0k views

Complete list of human/mouse homologs is available from MGI.


These are the protein homologs, not the transcript homologs? So to get the homology of the UTRs, I'd need to do what I outlined above.


There are nucleotide and protein RefSeq IDs for both mouse and human for each gene (they probably don't cover every transcript isoform out there, but they are a good start).


Stupid me! I was only looking at one line at a time, and only seeing RefSeq nucleotide IDs for one species!

4.0 years ago
ATpoint 82k

Wouldn't it be easiest to simply pull a list of homolog transcripts between mouse and human from Ensembl?

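library(biomaRt)

## Map mouse Ensembl transcript IDs to their linked human transcript IDs
Mouse2Human <- function(MouseTx){

  ## Connect to the human and mouse Ensembl gene datasets
  human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")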

  txMouse2Human = getLDS(attributes = c("ensembl_transcript_id"), 
                         filters = "ensembl_transcript_id", 
                         values = MouseTx , 
                         mart = mouse, 
                         attributesL = c("ensembl_transcript_id"), 
                         martL = human, 
                         uniqueRows = TRUE)

  colnames(txMouse2Human) <- c("Mouse_Tx", "Human_Tx")

  return(txMouse2Human) 

}

## Collect all mouse transcript IDs from Ensembl
musmusculus_tx <- getBM(attributes = c("ensembl_transcript_id"),  
                        mart = useMart("ensembl", dataset = "mmusculus_gene_ensembl"))

Mouse2HumanTable <- Mouse2Human(MouseTx = musmusculus_tx$ensembl_transcript_id)

This should get you:

> head(Mouse2HumanTable)
            Mouse_Tx        Human_Tx
1 ENSMUST00000082405 ENST00000361739
2 ENSMUST00000110020 ENST00000555699
3 ENSMUST00000110020 ENST00000334869
4 ENSMUST00000110020 ENST00000555169
5 ENSMUST00000110020 ENST00000557434
6 ENSMUST00000110020 ENST00000393218

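From this you could then extract the UTRs of the respective transcripts from the GTF files. Note that one mouse transcript can be paired with several human transcripts, as rows 2 to 6 show. Hope I understood you correctly.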
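
A rough sketch of that GTF filtering step in base R. It assumes Ensembl-style GTF lines, where 3' UTR rows carry the feature type `three_prime_utr` and the attribute column contains a `transcript_id "..."` entry. The function name `extract_3utr` and the toy lines are made up for illustration; a real run would read the lines from the downloaded GTF files with `readLines()`.

```r
# Pull 3' UTR intervals for selected transcripts out of GTF-formatted lines.
# Assumes Ensembl-style GTF: feature "three_prime_utr", and a
# transcript_id "..." entry in the ninth (attribute) column.
extract_3utr <- function(gtf_lines, tx_ids) {
  gtf_lines <- gtf_lines[!grepl("^#", gtf_lines)]   # drop header comments
  gtf <- read.delim(
    text = gtf_lines, header = FALSE, quote = "", comment.char = "",
    col.names = c("seqname", "source", "feature", "start", "end",
                  "score", "strand", "frame", "attribute"),
    colClasses = c("character", "character", "character", "integer",
                   "integer", "character", "character", "character",
                   "character")
  )
  utr <- gtf[gtf$feature == "three_prime_utr", ]
  # Extract the transcript ID from the attribute string
  utr$transcript_id <- sub('.*transcript_id "([^"]+)".*', "\\1", utr$attribute)
  utr <- utr[utr$transcript_id %in% tx_ids,
             c("transcript_id", "seqname", "start", "end", "strand")]
  rownames(utr) <- NULL
  utr
}

# Toy input with invented IDs and coordinates, for illustration only
toy_gtf <- c(
  "#!genome-build toy",
  paste("1", "toy", "exon", "100", "500", ".", "+", ".",
        'gene_id "G1"; transcript_id "TX_A";', sep = "\t"),
  paste("1", "toy", "three_prime_utr", "401", "500", ".", "+", ".",
        'gene_id "G1"; transcript_id "TX_A";', sep = "\t"),
  paste("2", "toy", "three_prime_utr", "900", "950", ".", "-", ".",
        'gene_id "G2"; transcript_id "TX_B";', sep = "\t")
)
utr <- extract_3utr(toy_gtf, "TX_A")
```

A UTR that spans several exons appears as several rows for the same transcript, so the intervals would still need to be joined in transcript order before fetching the sequence.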


I looked at this on the BioMart website, and although I selected the transcript ID it gave me protein IDs. I guess biomaRt works better.
