How To Get "Alternate Gene Name" (Gene_Id) From UCSC Table Browser?
10.6 years ago
Dan D 7.3k

On the UCSC graphical genome browser, the "alternate gene names" are shown, like in the picture below:

If I use the Table Browser to get RefSeq genes:

I see that the data I want are in a field that isn't normally retrieved:

Is there a way to get that field, circled in blue, instead of the default field, circled in red, into a BED file from the UCSC table browser?

ucsc bed browser • 5.2k views

Thanks for three great suggestions! I'm trying them out now!


Thanks again everyone for the very helpful answers. I learned quite a bit about the table browser by going through them. What I want to do, however, is get a BED file, just like what I would normally get from selecting the "BED" option, except that I want the "name2" values instead of the "name" values. The more I dig around, the more it looks like this isn't possible in a direct fashion.


What do you mean by "direct fashion"?


Sorry, let me clarify. Using the method that you suggested, I can indeed retrieve the data I want. However, the columns aren't in proper BED format. What I ultimately want to do is visualize these genes in Galaxy's Trackster visualization feature so that they have the same gene labels as in the UCSC browser. Now, I could pull these data from the UCSC Table Browser, reformat them with a Perl script, and then re-import them into Galaxy, but I'm trying to see if there's a more direct way of doing so through the Table Browser itself.


Ahh, I get it now, and I agree that it's not always straightforward. You do indeed have to re-process the information to get the desired format for further visualization. Looks like you are on the right track. You can even use Excel to quickly rearrange the columns into BED format.

10.6 years ago

Using the UCSC MySQL server:

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N -e 'select chrom,txStart,txEnd,name2,strand from refGene'

chr19    20115226    20150277    ZNF682    -
chr17    40274755    40275371    HSPB9    +
chr1    34610    36081    FAM138A    -
chr6_qbl_hap6    2882090    2899191    PRRC2A    +
chr3    10327433    10334631    GHRL    -
chr2    220378891    220403494    ASIC4    +
chr17    18086866    18113267    ALKBH5    +
chr1    1658823    1677438    SLC35E2    -
chr1    700244    714068    LOC100288069    -
chr11    129872518    129875381    LINC00167    +
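
The five columns above are not yet in BED order: BED6 expects chrom, start, end, name, score, strand. Here is a minimal sketch of two ways to close that gap, assuming the same public server and table as the command above. The names `BED6_QUERY`, `fetch_bed6` and `to_bed6` are made up for illustration, and the literal 0 is only a placeholder score.

```shell
# Hypothetical helper: the same query as above, but with the columns in BED6
# order. The literal 0 fills the BED score column. Needs network access to UCSC.
BED6_QUERY='select chrom, txStart, txEnd, name2, 0, strand from refGene'
fetch_bed6() {
  mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N -e "$BED6_QUERY"
}

# Hypothetical helper: reorder rows that were already saved as
# chrom, txStart, txEnd, name2, strand into tab-separated BED6.
# Rows that do not have exactly five fields are skipped.
to_bed6() {
  awk 'BEGIN { OFS = "\t" } NF == 5 { print $1, $2, $3, $4, 0, $5 }'
}
```

For example, `fetch_bed6 > refGene_name2.bed` would query the server directly, while `to_bed6 < saved_output.txt > refGene_name2.bed` would convert output that was already saved (both file names are placeholders).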


Hi, very good...

But how can I do that for UCSC Genes instead of RefSeq Genes?

Thank you

10.6 years ago
Gjain 5.7k

Hi Deedee,

If you change your output format selection from "BED - browser extensible data" to "selected fields from primary and related tables", you can then choose the fields you want in your output file.

Step 1: Select "selected fields from primary and related tables":

Step 2: Choose the fields you want in your output file:

This is how you can do it from the web-based Table Browser. I hope this helps.

10.6 years ago

So, starting from the 2nd image: for the output format, select the option "selected fields from primary and related tables" and click "get output". Then, from the list of fields, tick the name2 checkbox for the alternate names, along with whatever other attributes you want.

The list looks like this:

bin
name    Name of gene (usually transcript_id from GTF)
chrom   Reference sequence chromosome or scaffold
strand  + or - for strand
txStart Transcription start position
txEnd   Transcription end position
cdsStart    Coding region start
cdsEnd  Coding region end
exonCount   Number of exons
exonStarts  Exon start positions
exonEnds    Exon end positions
score
name2   Alternate name (e.g. gene_id from GTF)
cdsStartStat    enum('none','unk','incmpl','cmpl')
cdsEndStat  enum('none','unk','incmpl','cmpl')
exonFrames  Exon frame {0,1,2}, or -1 if no frame for exon
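
Once the fields are exported this way, the columns can be rearranged into BED order afterwards. The following is only a sketch. It assumes the download is tab-separated and starts with a header line beginning with `#` that names the selected fields (possibly with a `db.table.` prefix), and that chrom, txStart, txEnd, strand and name2 were all ticked. The function name `selected_fields_to_bed6` and the score of 0 are placeholders.

```shell
# Hypothetical helper: convert a "selected fields" download into BED6,
# using name2 as the BED name column. Columns are located by header name,
# so the order in which the fields were exported does not matter.
selected_fields_to_bed6() {
  awk '
    BEGIN { FS = OFS = "\t" }
    # Header line: strip the leading "#" and map each field name to its column.
    /^#/ {
      sub(/^#/, "")
      n = split($0, header, FS)
      for (i = 1; i <= n; i++) {
        name = header[i]
        sub(/^.*\./, "", name)   # drop a "db.table." prefix if present
        col[name] = i
      }
      ready = ("chrom" in col) && ("txStart" in col) && ("txEnd" in col) &&
              ("name2" in col) && ("strand" in col)
      next
    }
    # Data lines: chrom, start, end, name2, placeholder score 0, strand.
    ready && NF {
      print $(col["chrom"]), $(col["txStart"]), $(col["txEnd"]),
            $(col["name2"]), 0, $(col["strand"])
    }
  '
}
```

Usage would look like `selected_fields_to_bed6 < table_browser_output.txt > refGene_name2.bed` (placeholder file names). If any of the five required fields is missing from the header, nothing is printed.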


Cheers