Technically the biotype belongs to a transcript, not a gene. In many cases all transcripts of a gene share the same biotype, but some genes have transcripts with different biotypes. That said, I'm not sure off the top of my head whether you can do a join within UCSC between the two tables. You might need to output two datasets (RefSeq and Ensembl) and cross-reference them yourself with a little script. You can output alternative IDs in both tables and use whichever you prefer to link the two.
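To make the transcript-versus-gene point concrete, here is a minimal Python sketch that lists genes whose transcripts do not all share one biotype. It assumes you already have (gene ID, transcript ID, biotype) rows from somewhere; the function name and the IDs in the example are made up for illustration.

```python
from collections import defaultdict

def genes_with_mixed_biotypes(rows):
    """Return {gene_id: sorted biotypes} for genes with more than one biotype.

    rows: iterable of (gene_id, transcript_id, biotype) tuples.
    """
    biotypes = defaultdict(set)
    for gene_id, _transcript_id, biotype in rows:
        biotypes[gene_id].add(biotype)
    return {gene: sorted(found) for gene, found in biotypes.items() if len(found) > 1}

# Placeholder rows, not real annotation.
example = [
    ("GENE_A", "TX_A1", "protein_coding"),
    ("GENE_A", "TX_A2", "retained_intron"),
    ("GENE_B", "TX_B1", "lincRNA"),
]
print(genes_with_mixed_biotypes(example))
```

Genes that show up in the result are the ones where a single gene-level biotype would be misleading.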
Update: This is a roughly step-by-step guide to how I generated a table linking Ensembl transcript IDs to RefSeq IDs:
UCSC Table Browser: https://genome.ucsc.edu/cgi-bin/hgTables
Group: Genes and Gene Predictions; Track: Ensembl Genes; Table: ensGene
Output format: selected fields from primary and related tables
1) Click "get output".
2) On the next page, under "Linked Tables", check the box beside ccdsInfo and then click "allow selection from checked tables".
3) More tables come up for linking. Check the ccds id field in the ccdsInfo field list. Also check the box beside the table knownToRefSeq and click "allow selection from checked tables" again.
4) Under knownToRefSeq, check both fields (primary id and value).
5) Click "get output" near the top of the page, under the fields for the Ensembl table.
You'll end up with Ensembl transcript IDs in the first column and a list of NM IDs in the final column. You can then process this however you like to get a mapping of RefSeq IDs to Ensembl. I didn't poke around enough to see whether there is a linked table that provides biotypes; those may be easier to get from BioMart on the Ensembl website itself. With those two files you should be able to parse them as tab-delimited data and create a mapping file, associate biotypes, etc. with a fairly simple script in Perl, Python, or whatever scripting language you prefer.
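As a rough sketch of that last step in Python: the code below assumes the UCSC output is tab-delimited with the transcript ID in the first column and a comma-separated RefSeq list in the last column, that header lines start with "#", and that missing values appear as "n/a". It also assumes a headerless two-column BioMart export (transcript ID, biotype). The function names and all IDs in the demo are placeholders, so check the assumptions against your real files.

```python
import csv
import io
from collections import defaultdict

def read_ensembl_to_refseq(handle):
    """Map transcript ID (first column) to the set of RefSeq IDs (last column)."""
    mapping = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, malformed, and header lines
        for refseq_id in row[-1].split(","):
            refseq_id = refseq_id.strip()
            if refseq_id and refseq_id != "n/a":  # assumed missing-value marker
                mapping[row[0]].add(refseq_id)
    return mapping

def read_biotypes(handle):
    """Read an assumed two-column file: transcript ID, biotype."""
    biotypes = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2 and not row[0].startswith("#"):
            biotypes[row[0]] = row[1]
    return biotypes

def refseq_to_ensembl(mapping, biotypes):
    """Invert the mapping: RefSeq ID -> sorted list of (transcript ID, biotype)."""
    inverted = defaultdict(list)
    for transcript_id, refseq_ids in mapping.items():
        for refseq_id in refseq_ids:
            inverted[refseq_id].append((transcript_id, biotypes.get(transcript_id, "unknown")))
    return {refseq_id: sorted(hits) for refseq_id, hits in inverted.items()}

# Placeholder data standing in for the two downloaded files.
ucsc_text = (
    "#transcript\tccds\trefseq\n"
    "ENST_EXAMPLE_1\tCCDS_X\tNM_EXAMPLE_1,NM_EXAMPLE_2,\n"
    "ENST_EXAMPLE_2\tn/a\tn/a\n"
)
biomart_text = "ENST_EXAMPLE_1\tprotein_coding\nENST_EXAMPLE_2\tlincRNA\n"

mapping = read_ensembl_to_refseq(io.StringIO(ucsc_text))
biotypes = read_biotypes(io.StringIO(biomart_text))
result = refseq_to_ensembl(mapping, biotypes)
for refseq_id in sorted(result):
    print(refseq_id, result[refseq_id])
```

For real data, replace the `io.StringIO(...)` objects with `open("your_file.tsv", newline="")`. Keeping sets per transcript avoids duplicate pairs if the same RefSeq ID appears more than once in a row.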
modified 3.5 years ago
3.5 years ago by
Dan Gaston ♦ 7.1k