Get Gene families from Mus musculus
4.2 years ago
JulianC ▴ 30

Hi! I need a table of all gene families in Mus musculus, together with the genes that belong to each family. For human this is very simple with the HGNC database (https://biomart.genenames.org/martform/#!/default/HGNC?datasets=hgnc_family_mart). For mouse I found the MGI database, but I cannot work out from it the full set of gene families and their member genes. Do you have a solution for that? Thank you!

gene families • 1.2k views
4.2 years ago

Not sure if this is what you need (to which gene families are you referring?), but you can generate an annotation table in R:

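library(biomaRt)

# Connect to the Ensembl gene mart (US East mirror) and select the mouse dataset
mart <- useMart('ENSEMBL_MART_ENSEMBL', host = 'useast.ensembl.org')
mart <- useDataset('mmusculus_gene_ensembl', mart)

# Retrieve gene identifiers plus family annotation for all mouse genes
annotLookup <- getBM(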
  mart = mart,
  attributes = c(
    'mgi_symbol',
    'ensembl_gene_id',
    'entrezgene_id',
    'gene_biotype',
    'family',
    'family_description',
    'superfamily',
    'wikigene_description'))

head(annotLookup)
  mgi_symbol    ensembl_gene_id entrezgene_id   gene_biotype         family
1     mt-Nd1 ENSMUSG00000064341         17716 protein_coding  PTHR11432_SF3
2     mt-Nd2 ENSMUSG00000064345         17717 protein_coding PTHR22773_SF41
3     mt-Co1 ENSMUSG00000064351         17708 protein_coding      PTHR10422
4     mt-Co2 ENSMUSG00000064354         17709 protein_coding      PTHR22888
5     mt-Co2 ENSMUSG00000064354         17709 protein_coding      PTHR22888
6    mt-Atp8 ENSMUSG00000064356         17706 protein_coding      PTHR13722
                                                              family_description
1 NADH UBIQUINONE OXIDOREDUCTASE CHAIN 1 EC_7.1.1.2 NADH DEHYDROGENASE SUBUNIT 1
2 NADH UBIQUINONE OXIDOREDUCTASE CHAIN 2 EC_7.1.1.2 NADH DEHYDROGENASE SUBUNIT 2
3   CYTOCHROME C OXIDASE SUBUNIT 1 EC_1.9.3.1 CYTOCHROME C OXIDASE POLYPEPTIDE I
4             CYTOCHROME C OXIDASE SUBUNIT 2 CYTOCHROME C OXIDASE POLYPEPTIDE II
5             CYTOCHROME C OXIDASE SUBUNIT 2 CYTOCHROME C OXIDASE POLYPEPTIDE II
6                                                                 ATP SYNTHASE 8
  superfamily            wikigene_description
1                NADH dehydrogenase subunit 1
2                NADH dehydrogenase subunit 2
3    SSF81442  cytochrome c oxidase subunit I
4    SSF49503 cytochrome c oxidase subunit II
5    SSF81464 cytochrome c oxidase subunit II
6                   ATP synthase F0 subunit 8

Then, write it out and date-stamp the filename:

write.table(
  annotLookup,
  paste0('Mouse_', gsub("-", "_", as.character(Sys.Date())), '.tsv'),
  sep = '\t',
  row.names = FALSE,
  quote = FALSE)
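
If what you want is one row per family with its member genes, the per-gene table can be collapsed in base R. This is only a sketch: the toy rows are copied from the head() output above so it runs without a connection, and the names `pairs`, `genesByFamily` and `n_genes` are my own. On the real table you would skip the toy data frame and use the full annotLookup.

```r
# Toy rows copied from the head(annotLookup) output above
annotLookup <- data.frame(
  mgi_symbol = c('mt-Nd1', 'mt-Nd2', 'mt-Co1', 'mt-Co2', 'mt-Co2', 'mt-Atp8'),
  family = c('PTHR11432_SF3', 'PTHR22773_SF41', 'PTHR10422',
             'PTHR22888', 'PTHR22888', 'PTHR13722'),
  stringsAsFactors = FALSE)

# Keep genes that have a family, then drop repeated gene/family pairs
# (rows repeat when a gene has several superfamily entries)
hasFamily <- !is.na(annotLookup$family) & annotLookup$family != ''
pairs <- unique(annotLookup[hasFamily, c('family', 'mgi_symbol')])

# One row per family with a comma-separated list of member genes
genesByFamily <- aggregate(
  mgi_symbol ~ family,
  data = pairs,
  FUN = function(x) paste(sort(x), collapse = ','))

# Number of member genes per family
genesByFamily$n_genes <- lengths(
  strsplit(genesByFamily$mgi_symbol, ',', fixed = TRUE))

print(genesByFamily)
```

nrow(genesByFamily) then gives the number of distinct families, and the same write.table() call as above will save it.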

Kevin

Thank you Kevin! I am just looking for all the mouse gene families so that I can run some analyses on them. I will try your suggestion!

Okay, let me know if this solution helps you.

Hi Kevin, I have been trying to find gene families for my analysis, but it seems BioMart no longer offers family as an attribute. Do you know anything about this change?
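
One way to check which attributes a given mart currently exposes is to search the table returned by biomaRt's listAttributes(). The sketch below is an assumption-laden illustration: `find_attributes` is a hypothetical helper, and the small `attrs` data frame is a stand-in with made-up descriptions so the example runs offline.

```r
# Hypothetical helper: rows of an attribute table whose name or
# description matches a pattern (case-insensitive)
find_attributes <- function(attrs, pattern) {
  hit <- grepl(pattern, attrs$name, ignore.case = TRUE) |
    grepl(pattern, attrs$description, ignore.case = TRUE)
  attrs[hit, , drop = FALSE]
}

# Stand-in for listAttributes(mart); the descriptions are invented
attrs <- data.frame(
  name = c('mgi_symbol', 'ensembl_gene_id', 'family', 'family_description'),
  description = c('MGI symbol', 'Gene stable ID', 'Family ID', 'Family description'),
  stringsAsFactors = FALSE)

familyAttrs <- find_attributes(attrs, 'family')
print(familyAttrs)

# With a live connection, the equivalent call would be:
# find_attributes(listAttributes(mart), 'family')
```

If the live call returns no rows, the attribute is not available in that release. biomaRt can also connect to archived Ensembl releases (see listEnsemblArchives() and the version argument of useEnsembl()), though I have not checked which releases still carry the family attributes.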