List of genes that are uniquely found in human (not in mouse)
16 months ago
cwwong13 ▴ 40

I am looking for a list of genes that are found only in humans. However, the closest article I could find dates back to 2002 (https://www.nature.com/articles/nature01262). The authors suggest there are 118 such genes but do not provide the list.

Is there an updated analysis of this, and where can I find the list of genes?

Thanks!

orthologue genome gene • 658 views
16 months ago
ATpoint 81k

This kind of comparative analysis is not my field, but "naively", if you use existing resources and simple filtering, you can use biomaRt with Ensembl, which provides homology tables between mouse and human. If you retrieve all human genes and the human-to-mouse homology table, the human-only genes would be those human genes that are not found in the homology table, right? Again, this is a naive solution outside my field and I do not guarantee the integrity of the results, but it might be a start.

library(biomaRt)

human_mart <- biomaRt::useEnsembl("genes", dataset="hsapiens_gene_ensembl", version=100)
mouse_mart <- biomaRt::useEnsembl("genes", dataset="mmusculus_gene_ensembl", version=100)

# All human genes
all_human <- biomaRt::getBM(attributes=c("ensembl_gene_id", "hgnc_symbol", "gene_biotype"),
                            mart=human_mart)

# All homologs for human in mouse
human2mouse_homologs <-
  biomaRt::getLDS(attributes=c("ensembl_gene_id", "hgnc_symbol"), 
                  attributesL=c("ensembl_gene_id", "mgi_symbol"), 
                  mart=human_mart, 
                  martL=mouse_mart,
                  uniqueRows=TRUE)

colnames(human2mouse_homologs) <- c("human_id", "human_name", "mouse_id", "mouse_name")

# Filter the ones that are present in human but not in the homolog table = only present in human
only_human <- 
  all_human[!all_human$ensembl_gene_id %in% human2mouse_homologs$human_id,]

head(only_human)
#>   ensembl_gene_id hgnc_symbol gene_biotype
#> 1 ENSG00000210049       MT-TF      Mt_tRNA
#> 2 ENSG00000211459     MT-RNR1      Mt_rRNA
#> 3 ENSG00000210077       MT-TV      Mt_tRNA
#> 4 ENSG00000210082     MT-RNR2      Mt_rRNA
#> 5 ENSG00000209082      MT-TL1      Mt_tRNA
#> 7 ENSG00000210100       MT-TI      Mt_tRNA

table(only_human$gene_biotype)
#> 
#>                          IG_C_gene                    IG_C_pseudogene 
#>                                 18                                 11 
#>                          IG_D_gene                          IG_J_gene 
#>                                 64                                 24 
#>                    IG_J_pseudogene                      IG_pseudogene 
#>                                  6                                  1 
#>                          IG_V_gene                    IG_V_pseudogene 
#>                                153                                290 
#>                             lncRNA                              miRNA 
#>                              17957                               1699 
#>                           misc_RNA                            Mt_rRNA 
#>                               2186                                  2 
#>                            Mt_tRNA             polymorphic_pseudogene 
#>                                 22                                 42 
#>               processed_pseudogene                     protein_coding 
#>                              10830                               4843 
#>                         pseudogene                           ribozyme 
#>                                 40                                  5 
#>                               rRNA                    rRNA_pseudogene 
#>                                 55                                517 
#>                             scaRNA                              scRNA 
#>                                 31                                  1 
#>                             snoRNA                              snRNA 
#>                                561                               1453 
#>                               sRNA                                TEC 
#>                                  6                               1118 
#>                          TR_C_gene                          TR_D_gene 
#>                                  5                                  5 
#>                          TR_J_gene                    TR_J_pseudogene 
#>                                 93                                  4 
#>                          TR_V_gene                    TR_V_pseudogene 
#>                                110                                 46 
#>   transcribed_processed_pseudogene     transcribed_unitary_pseudogene 
#>                                562                                142 
#> transcribed_unprocessed_pseudogene    translated_processed_pseudogene 
#>                               1097                                  2 
#>  translated_unprocessed_pseudogene                 unitary_pseudogene 
#>                                  1                                104 
#>             unprocessed_pseudogene                           vaultRNA 
#>                               3362                                  1
Created on 2022-11-29 with reprex v2.0.2
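Since most of the entries in the table above are non-coding RNAs and pseudogenes, a natural next step is to restrict the result to one biotype. The sketch below wraps the same `%in%` filter in a small function with an extra biotype condition. `human_only_genes()` is a hypothetical helper name, and the rows with `TOY` identifiers are made up so the example runs without an Ensembl connection.

```r
# Hypothetical helper: keep genes that are absent from the homolog table
# and whose biotype is one of `biotypes`.
human_only_genes <- function(all_genes, homolog_ids, biotypes = "protein_coding") {
  no_homolog <- !all_genes$ensembl_gene_id %in% homolog_ids
  wanted     <- all_genes$gene_biotype %in% biotypes
  all_genes[no_homolog & wanted, , drop = FALSE]
}

# Toy input with the same columns as `all_human`.
# The first two rows come from the head() output above; the TOY rows are made up.
toy_genes <- data.frame(
  ensembl_gene_id = c("ENSG00000210049", "ENSG00000211459", "TOY0001", "TOY0002"),
  hgnc_symbol     = c("MT-TF", "MT-RNR1", "TOYA", "TOYB"),
  gene_biotype    = c("Mt_tRNA", "Mt_rRNA", "protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# Pretend only TOY0002 has a mouse homolog.
toy_homolog_ids <- c("TOY0002")

toy_result <- human_only_genes(toy_genes, toy_homolog_ids)
print(toy_result)
```

With the real objects from the reprex, the call would be `human_only_genes(all_human, human2mouse_homologs$human_id)`, which should return the rows counted under `protein_coding` in the table above.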
16 months ago
GenoMax 141k

There is also a list of human/mouse genes that the Jackson Laboratory makes available here. You can parse this file to find the differences:

DB Class Key    Common Organism Name    NCBI Taxon ID   Symbol  EntrezGene ID   Mouse MGI ID    HGNC ID OMIM Gene ID    Genetic Location
42532301        mouse, laboratory       10090   Klb     83379   MGI:1932466                     Chr5 33.64 cM   Chr5:65505657-6554135>
42532301        human   9606    KLB     152831          HGNC:15527      OMIM:611135     Chr4 p14        Chr4:39406853-39451533(+)    >
42532302        mouse, laboratory       10090   Oxr1    170719  MGI:2179326                     Chr15 16.06 cM  Chr15:41310878-417244>
42532302        human   9606    OXR1    55074           HGNC:15822      OMIM:605609     Chr8 q23.1      Chr8:106270178-106752694(+)  >
42532303        mouse, laboratory       10090   Cma2    545055  MGI:88426                       Chr14 28.19 cM  Chr14:56188437-562114>
42532304        mouse, laboratory       10090   Mcpt9   17232   MGI:1194491                     Chr14 28.19 cM  Chr14:56264321-562679>
42532305        mouse, laboratory       10090   Mcptl   17233   MGI:102792                      Chr14 syntenic   

I see three genes here that are not in human (or so it seems: Cma2, Mcpt9 and Mcptl), but I guess you are looking for the opposite, so I am not sure whether there are any.
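A minimal sketch of that parsing step, assuming the file is tab-delimited, has the header shown above, and that rows sharing a `DB Class Key` belong to the same homology class. `species_specific()` is a hypothetical helper name, and the inline text holds only three of the columns for the rows in the excerpt.

```r
# Rows copied from the excerpt above, reduced to three columns, tab-separated.
excerpt <- paste(
  "DB Class Key\tCommon Organism Name\tSymbol",
  "42532301\tmouse, laboratory\tKlb",
  "42532301\thuman\tKLB",
  "42532302\tmouse, laboratory\tOxr1",
  "42532302\thuman\tOXR1",
  "42532303\tmouse, laboratory\tCma2",
  "42532304\tmouse, laboratory\tMcpt9",
  "42532305\tmouse, laboratory\tMcptl",
  sep = "\n"
)

# check.names = FALSE keeps the column names with spaces as they are in the file.
homology <- read.delim(text = excerpt, check.names = FALSE,
                       stringsAsFactors = FALSE)

# Hypothetical helper: symbols from organism `present` whose homology class
# has no member from organism `absent`.
species_specific <- function(tab, present, absent) {
  key      <- tab[["DB Class Key"]]
  organism <- tab[["Common Organism Name"]]
  only_keys <- setdiff(unique(key[organism == present]),
                       unique(key[organism == absent]))
  tab[key %in% only_keys & organism == present, "Symbol"]
}

mouse_only <- species_specific(homology, "mouse, laboratory", "human")
human_only <- species_specific(homology, "human", "mouse, laboratory")
print(mouse_only)
print(human_only)
```

On this excerpt `mouse_only` contains Cma2, Mcpt9 and Mcptl and `human_only` is empty. To run it on the full download, replace the `text = excerpt` argument with the path to the file.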
