Question: How To Get A List Of All The Genes On The Human Chromosome Y
5
Fred Fleche (Paris, France) wrote, 8.8 years ago:

A colleague of mine asked for a list of all the genes on human chromosome Y.

I produced it from our internal gene annotation database, but I was wondering what your favorite or easiest way to perform this task would be using public resources like UCSC, Ensembl or BioMart (coding solutions or clicking through web interfaces).

The information needed for the listing is the Gene Symbol and the Entrez Gene ID.

The accepted response will be the one with the highest ranking on Friday at noon (French time).

9
Nathan Harmston (London) wrote, 8.8 years ago:

I'd use BioMart. It's incredibly easy, and since you only need a very specific form of data, you can do it with just point and click, no coding required.

Took me less than 2 minutes to get the data.

2

Yes, I was about to post the same thing: use BioMart on the Ensembl web page, choose Ensembl Genes and Homo sapiens, apply a filter for Chromosome Y and click on Results. You're done.

Reply written 8.8 years ago by Michael Schubert
2

The other very cool thing about BioMart is that you can use the URL button to store the query and give it to the requester. They can then load it and re-run it themselves, look at other choices, try again with other regions, etc. It is somewhat like a "session" at UCSC.

Reply written 8.8 years ago by Mary
1

From Attributes, remember to check EntrezGene ID and HGNC symbol in the External box. If only all data were this easy to access.
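
For anyone who prefers a script to clicking, the same choices (dataset, chromosome filter, two attributes) can probably be written as a BioMart XML query. This is only a minimal sketch: the `martservice` endpoint below is an assumption to verify, `build_query_xml` is a made-up helper name, and the attribute names are the ones used elsewhere in this thread (newer Ensembl releases may name them differently, for example `entrezgene_id`). Nothing is sent over the network; the code only builds and prints the URL.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint of the Ensembl BioMart web service; verify before relying on it.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_query_xml(dataset, filters, attributes):
    """Return a BioMart XML query string for a single dataset (hypothetical helper)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# Same choices as the point-and-click recipe: human genes, chromosome Y,
# HGNC symbol plus Entrez Gene ID.
xml_query = build_query_xml(
    "hsapiens_gene_ensembl",
    {"chromosome_name": "Y"},
    ["hgnc_symbol", "entrezgene"],
)
url = MARTSERVICE + "?" + urlencode({"query": xml_query})
print(url)
```

The printed URL can be pasted into a browser or fetched with any HTTP client. Keeping the query as text also makes it easy to hand to the requester, much like the URL button.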

Reply written 8.8 years ago by Nathan Harmston
1

well in that case ....

BioMart query

Reply written 8.8 years ago by Nathan Harmston

didn't realise that...........

BioMart query

Reply written 8.8 years ago by Nathan Harmston

I don't know if you checked the result, but BioMart displays TSPY1 and TSPY2 with the same Gene ID, 64591, and it is not the only such case.
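
A quick way to see how widespread this is would be to group the exported rows by Gene ID and list the IDs that carry more than one symbol. The sketch below assumes a tab-separated export with header columns named `hgnc_symbol` and `entrezgene` (those column names and the function name are assumptions, adjust them to your file). The three example rows are the TSPY1/TSPY2 case mentioned above plus SRY for contrast.

```python
import csv
import io
from collections import defaultdict


def symbols_sharing_an_id(tsv_text):
    """Map each Entrez Gene ID to its symbols, keeping only IDs with more than one symbol."""
    by_id = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        gene_id = row["entrezgene"].strip()
        symbol = row["hgnc_symbol"].strip()
        # Skip rows where either field is empty.
        if gene_id and symbol:
            by_id[gene_id].add(symbol)
    return {gid: sorted(syms) for gid, syms in by_id.items() if len(syms) > 1}


example = (
    "hgnc_symbol\tentrezgene\n"
    "TSPY1\t64591\n"
    "TSPY2\t64591\n"
    "SRY\t6736\n"
)
print(symbols_sharing_an_id(example))  # {'64591': ['TSPY1', 'TSPY2']}
```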

Reply written 8.8 years ago by Fred Fleche

Fair enough, I didn't check the data. I don't know why that's the case. I love bioinformatics databases ^^

Reply written 8.8 years ago by Nathan Harmston
7
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 8.8 years ago:

Using the UCSC public MySQL server and the tables knownGene, knownToLocusLink and refLink:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19  -e '
select distinct
L.name,
L.locusLinkId
from
knownGene as G,
knownToLocusLink as K2L,
refLink as L
where G.name=K2L.name and
K2L.value=L.locusLinkId and
G.chrom="chrY" '

+-------------+-------------+
| name        | locusLinkId |
+-------------+-------------+
| PLCXD1      |       55344 | 
| GTPBP6      |        8225 | 
| NCRNA00107  |      283981 | 
| PPP2R3B     |       28227 | 
| SHOX        |        6473 | 
| CRLF2       |       64109 | 
| CSF2RA      |        1438 | 
| IL3RA       |        3563 | 
| SLC25A6     |         293 | 
| ASMTL-AS    |       80161 | 
| ASMTL       |        8623 | 
| P2RY8       |      286530 | 
| AKAP17A     |        8227 | 
| ASMT        |         438 | 
| ZBED1       |        9189 | 
| DHRSX       |      207063 | 
| CD99        |        4267 | 
| XGPY2       |   100132596 | 
| SRY         |        6736 | 
| RPS4Y1      |        6192 | 
| ZFY         |        7544 | 
| TGIF2LY     |       90655 | 
| PCDH11Y     |       83259 | 
| TTTY23      |      252955 | 
| TSPY2       |       64591 | 
| TTTY1B      |   100101116 | 
| TTTY2       |       60439 | 
| TTTY21      |      252953 | 
| TTTY7       |      246122 | 
| TTTY8       |       84673 | 
| AMELY       |         266 | 
| TBL1Y       |       90665 | 
| PRKY        |        5616 | 
| TTTY16      |      252948 | 
| TTTY12      |       83867 | 
| TTTY18      |      252950 | 
| TTTY19      |      252952 | 
| TTTY11      |       83866 | 
| RBMY1A3P    |      286557 | 
| TTTY20      |      252951 | 
| TSPY4       |      728395 | 
| FAM197Y2P   |      252946 | 
| TSPY3       |      728137 | 
| TSPY1       |        7258 | 
| RBMY3AP     |       64593 | 
| TTTY22      |      252954 | 
| TTTY15      |       64595 | 
| USP9Y       |        8287 | 
| DDX3Y       |        8653 | 
| UTY         |        7404 | 
| TMSB4Y      |        9087 | 
| VCY         |        9084 | 
| NLGN4Y      |       22829 | 
| FAM41AY1    |      340618 | 
| NCRNA00230B |      401629 | 
| XKRY2       |      353515 | 
| CDY2B       |      203611 | 
| XKRY        |        9082 | 
| HSFY2       |      159119 | 
| TTTY9B      |      425057 | 
| NCRNA00185  |       55410 | 
| CD24        |   100133941 | 
| TTTY14      |       83869 | 
| BCORP1      |      286554 | 
| CYorf15A    |      246126 | 
| CYorf15B    |       84663 | 
| KDM5D       |        8284 | 
| TTTY10      |      246119 | 
| EIF1AY      |        9086 | 
| RPS4Y2      |      140032 | 
| RBMY2EP     |      159125 | 
| RBMY1A1     |        5940 | 
| TTTY13      |       83868 | 
| RBMY1B      |      378948 | 
| PRY2        |      442862 | 
| TTTY6       |       84672 | 
| TTTY6B      |      441543 | 
| RBMY1E      |      378950 | 
| RBMY1J      |      378951 | 
| TTTY5       |       83863 | 
| RBMY2FP     |      159162 | 
| RBMY1F      |      159163 | 
| TTTY17B     |      474151 | 
| TTTY4C      |      474150 | 
| BPY2        |        9083 | 
| DAZ1        |        1617 | 
| DAZ2        |       57055 | 
| DAZ3        |       57054 | 
| TTTY3B      |      474148 | 
| CDY1        |        9085 | 
| CDY1B       |      253175 | 
| CSPG4P2Y    |       84664 | 
| GOLGA2P3Y   |      401634 | 
| TTTY17A     |      252949 | 
| DAZ4        |       57135 | 
| SPRY3       |       10251 | 
| VAMP7       |        6845 | 
| IL9R        |        3581 | 
+-------------+-------------+
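
For readers without access to the UCSC server, the shape of this three-table join can be tried offline. The following is a minimal sketch using an in-memory SQLite database, not the UCSC schema itself: the tables are cut down to the columns the query touches plus an accession column, and the knownGene IDs, accessions and the non-chrY row are made-up placeholders. Only the SRY and ZFY symbol/ID pairs come from the output above. It also illustrates why `distinct` helps when refLink holds several rows per gene.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE knownGene (name TEXT, chrom TEXT);
CREATE TABLE knownToLocusLink (name TEXT, value INTEGER);
CREATE TABLE refLink (name TEXT, mrnaAcc TEXT, locusLinkId INTEGER);
""")

# Placeholder transcript IDs (kg_a, kg_b, kg_c) stand in for real knownGene names.
con.executemany("INSERT INTO knownGene VALUES (?, ?)",
                [("kg_a", "chrY"), ("kg_b", "chrY"), ("kg_c", "chr1")])
con.executemany("INSERT INTO knownToLocusLink VALUES (?, ?)",
                [("kg_a", 6736), ("kg_b", 7544), ("kg_c", 0)])
# ZFY is given two placeholder accessions so the join returns a repeated pair.
con.executemany("INSERT INTO refLink VALUES (?, ?, ?)",
                [("SRY", "acc_1", 6736), ("ZFY", "acc_2", 7544),
                 ("ZFY", "acc_3", 7544), ("PLACEHOLDER", "acc_4", 0)])

rows = con.execute("""
SELECT DISTINCT L.name, L.locusLinkId
FROM knownGene AS G
JOIN knownToLocusLink AS K2L ON G.name = K2L.name
JOIN refLink AS L ON K2L.value = L.locusLinkId
WHERE G.chrom = 'chrY'
ORDER BY L.name
""").fetchall()
print(rows)  # [('SRY', 6736), ('ZFY', 7544)]
```

The explicit JOIN syntax is equivalent to the comma-separated table list with conditions in the WHERE clause used in the original query.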

I chose Pierre's response because I am more confident in the result returned by his query. BioMart displays TSPY1 and TSPY2 with the same Gene ID, 64591, and it is not the only such case.

Reply written 8.8 years ago by Fred Fleche
7
Mary (Boston MA area) wrote, 8.8 years ago:

The UCSC Table Browser would be my query tool of choice. In fact, at ASHG and in our workshops I usually describe the Table Browser as the way to get "lists of things": all the SNPs in your gene of interest, all the genes in a region, etc.

Note: I'll use the previous assembly for this query (hg18) because I haven't moved over to the new one for most things yet. I find a lot of what I need is still not there.

Choices by row:

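  • Clade / genome / assembly: Mammal / Human / Mar. 2006 (hg18)
  • Group / track: Genes and Gene Prediction Tracks / UCSC Genes
  • Table: knownGene
  • Region: select the "position" radio button and enter chrY. Click "lookup" and it will paste the range in.
  • Output format: selected fields from primary and related tables. Then click the "get output" button.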

On the next page, choose the IDs you want. I selected chrom, txStart and txEnd from the knownGene table. From the hg18.kgXref fields I added kgID, swissprotID, genesymbol, refseqID and description (because they always want the description even though they don't say so). From the hg18.gbCdnaInfo fields I added acc and gi to get GenBank/EMBL accession IDs.

I'm sure I've overdone the IDs, but I like to use them as a sort of internal QC check for myself. They are easily deleted in the Excel document that will come next (yes, I know y'all hate Excel, but it's what they want).

I've saved it as a session. I think it will store the choices. You can try to load this session and see:

From here you can click the navigation for table browser, and just move to "get output" to see my choices.

4
Neilfws (Sydney, Australia) wrote, 8.8 years ago:

As others have answered, both UCSC tables and BioMart make this very simple. Since there are no coding solutions yet, here is one that uses BioMart via the R/Bioconductor biomaRt library:

library(biomaRt)
mart <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
results <- getBM(attributes = c("chromosome_name", "entrezgene", "hgnc_symbol"),
           filters = "chromosome_name", values = "Y", mart = mart)
# count genes
dim(results)
# [1] 120   3
# list first few rows
head(results)
# chromosome_name entrezgene hgnc_symbol
# 1               Y       6736         SRY
# 2               Y       6192      RPS4Y1
# 3               Y         NA      RPS4Y1
# 4               Y       7544         ZFY
# 5               Y         NA         ZFY
# 6               Y      90655     TGIF2LY
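
Note that the first rows already show the same symbol twice, once with an Entrez ID and once with NA. If only symbol/ID pairs are wanted, the result needs a small clean-up step. A minimal sketch of that step in Python, using the six rows shown above with NA written as `None` (the function name is made up for illustration):

```python
def drop_missing_and_duplicates(rows):
    """Keep one (symbol, entrez_id) pair per gene, skipping rows with no Entrez ID."""
    seen = set()
    kept = []
    for symbol, entrez_id in rows:
        if entrez_id is None:
            continue  # the NA rows in the biomaRt output
        pair = (symbol, entrez_id)
        if pair not in seen:
            seen.add(pair)
            kept.append(pair)
    return kept


rows = [
    ("SRY", 6736),
    ("RPS4Y1", 6192),
    ("RPS4Y1", None),
    ("ZFY", 7544),
    ("ZFY", None),
    ("TGIF2LY", 90655),
]
print(drop_missing_and_duplicates(rows))
# [('SRY', 6736), ('RPS4Y1', 6192), ('ZFY', 7544), ('TGIF2LY', 90655)]
```

Whether dropping the NA rows is appropriate depends on the request: genes that have no Entrez ID at all would disappear from the list.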

Thanks for this coding solution using BioMart. I will try to write one using a pure SQL query, as Pierre did for UCSC.

Reply written 8.8 years ago by Fred Fleche