Doc/tutorial UCSC MySQL
8.3 years ago
user31888 ▴ 130

I am trying to get the chromosomal position of a query gene in the hg19 genome by querying the UCSC public MySQL server.

I cannot find any documentation or tutorial on the syntax of the commands to use.

Any idea?

mysql UCSC-genome-browser • 4.6k views
8.3 years ago

Example:

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N -e "SELECT k.chrom, kg.txStart, kg.txEnd, x.geneSymbol FROM knownCanonical k, knownGene kg, kgXref x WHERE k.transcript = x.kgID AND k.transcript = kg.name AND x.geneSymbol LIKE 'CTCF';" > CTCF.bed

The chromosome, start position and end position of CTCF in hg19 are in the first three columns of the resulting unsorted BED file.

A gene can have multiple transcripts, so you can get more than one record for a given HGNC gene name.
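Because the result is unsorted and may hold more than one record per gene, it can help to sort the file and count records per symbol before using it. The following is only a sketch with standard Unix tools; the file names `toy.bed` and `toy.sorted.bed` and the rows in them are made up for illustration and are not real query output.

```shell
# Toy input with made-up coordinates (NOT real UCSC output); columns are tab-separated
printf 'chr2\t500\t900\tGENEB\nchr1\t300\t700\tGENEA\nchr1\t100\t200\tGENEA\n' > toy.bed

# Sort by chromosome name (column 1), then numerically by start position (column 2)
sort -k1,1 -k2,2n toy.bed > toy.sorted.bed

# Count how many records each gene symbol (column 4) has
cut -f4 toy.sorted.bed | sort | uniq -c
```

With a real result you would replace `toy.bed` with the file written by the query, for example `CTCF.bed`.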

This result relies on three tables in the hg19 database of the UCSC Genome Browser: knownCanonical, knownGene and kgXref.

The schema of knownCanonical is located here: http://genome.ucsc.edu/goldenpath/gbdDescriptionsOld.html#KnownCanonical

The schema of knownGene is located here: http://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=genes&hgta_track=knownGene&hgta_table=knownGene&hgta_doSchema=describe+table+schema

Likewise, the schema of kgXref is located here: http://genome.ucsc.edu/goldenpath/gbdDescriptionsOld.html#KgXref

The rest is just a SQL query based on the schemas of the three tables, along with database and host parameters that are specific to UCSC.
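To make the pieces of the command above easier to see, here is a sketch that separates the SQL from the client options. It is a variation, not the exact command: the joins are written with explicit `JOIN ... ON` clauses, and `=` is used in place of `LIKE` (the two behave the same for a pattern without wildcards). The helper names `build_query` and `run_query` and the `RUN_UCSC_QUERY` switch are hypothetical, and no escaping of the gene symbol is attempted.

```shell
#!/bin/sh
# Hypothetical helper: build the SQL for one gene symbol (no input escaping; sketch only)
build_query() {
    printf "SELECT k.chrom, kg.txStart, kg.txEnd, x.geneSymbol FROM knownCanonical k JOIN knownGene kg ON k.transcript = kg.name JOIN kgXref x ON k.transcript = x.kgID WHERE x.geneSymbol = '%s';" "$1"
}

# Client options, as used in the command above:
#   --user, --host : account and server of the UCSC public MySQL service
#   -A : do not auto-rehash table and column names (faster startup)
#   -D : database to query (hg19)
#   -N : do not print column names in the output
#   -e : execute the given statement, then exit
run_query() {
    mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N -e "$(build_query "$1")"
}

# Contact the server only when explicitly requested (hypothetical switch)
if [ "${RUN_UCSC_QUERY:-0}" = "1" ]; then
    run_query CTCF > CTCF.bed
fi
```

Saved to a file of your choosing, the script only defines the helpers by default; running it with `RUN_UCSC_QUERY=1` set in the environment sends the query and writes `CTCF.bed`.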

Part of the "magic" is knowing which tables and fields to use. This comes from experience with the Genome Browser and from exploring the schema links that are usually available from the table description pages on the Genome Browser site. When that information is difficult to find or seems to be unavailable, scouring discussion threads and asking the UCSC mailing lists directly also helps.


Thanks Alex !

Do you know where I could find documentation about the syntax of the command you used (especially the '-e' argument)?


Please see the edit.


Thanks Alex for the links!

But the command doesn't work for me (it runs indefinitely).


Works for me. Maybe their server is slow at the moment? You might check with the UCSC Genome Browser mailing list.
