Hello,
Sorry for the newbie question, but I am having a hard time figuring out how to retrieve the set of all curated RefSeq transcripts (with accession prefix NM_) from hg18. Could you help me build the SQL query for this?
Thanks.
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -D hg18 -e 'SELECT chrom,txStart,txEnd,strand,cdsStart,cdsEnd,exonCount,name from refGene LIMIT 10'
+-------+-----------+-----------+--------+-----------+-----------+-----------+-----------+
| chrom | txStart | txEnd | strand | cdsStart | cdsEnd | exonCount | name |
+-------+-----------+-----------+--------+-----------+-----------+-----------+-----------+
| chr20 | 764710 | 774922 | + | 773447 | 774335 | 2 | NM_207121 |
| chr9 | 139406074 | 139437535 | - | 139437535 | 139437535 | 3 | NR_104599 |
| chr21 | 42315181 | 42318098 | + | 42318098 | 42318098 | 2 | NR_119385 |
| chr21 | 42302371 | 42318098 | + | 42318098 | 42318098 | 2 | NR_119384 |
| chr17 | 45993427 | 46059831 | + | 46059831 | 46059831 | 35 | NR_046057 |
| chr7 | 44120803 | 44129694 | - | 44120908 | 44129683 | 11 | NM_006230 |
| chr5 | 134209268 | 134223324 | + | 134218489 | 134219056 | 2 | NM_152409 |
| chr5 | 75005779 | 75044036 | - | 75006015 | 75043965 | 11 | NM_152408 |
| chr12 | 54882554 | 54901982 | - | 54886497 | 54890496 | 8 | NM_194358 |
| chrX | 154140712 | 154147068 | - | 154143281 | 154146767 | 2 | NM_171998 |
+-------+-----------+-----------+--------+-----------+-----------+-----------+-----------+
Check the schema of the refGene table in the UCSC Table Browser. To keep only the NM_ accessions, add a WHERE name LIKE 'NM%' clause to the query above.
Hello,
Thanks. Actually I wanted to check the following statement from a recent paper:
The set of transcripts used in this experiment were the curated RefSeq transcripts (accession prefix NM) from hg18 (31,148 transcripts).
However, I don't get the same number when querying refGene from hg18:
mysql> select distinct count(*) as total from refGene where name like "NM%";
+-------+
| total |
+-------+
| 38938 |
+-------+
1 row in set (0,31 sec)
Is something wrong in my interpretation or in my query?
Thanks for your help.
The NM accessions in refGene are not unique: one transcript can align to several loci, and each alignment gets its own row. That might explain the difference.
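If that is the cause, counting distinct accessions instead of rows should show it. Note that SELECT DISTINCT COUNT(*) only applies DISTINCT to the single-row result, so it still counts every row; COUNT(DISTINCT name) is what collapses repeated accessions. Below is a minimal sketch of the comparison. It uses Python's sqlite3 with a few made-up rows as a stand-in for refGene (the accessions and coordinates are hypothetical), so it runs without a connection to the UCSC server. I have not checked whether the distinct count on hg18 matches the 31,148 from the paper.

```python
import sqlite3

# Toy stand-in for the hg18 refGene table. All rows are made up for
# illustration; only the column names follow the real table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refGene (name TEXT, chrom TEXT, txStart INTEGER, txEnd INTEGER)"
)
conn.executemany(
    "INSERT INTO refGene VALUES (?, ?, ?, ?)",
    [
        ("NM_000001", "chr1", 100, 500),
        ("NM_000001", "chr1_random", 100, 500),  # same accession, second locus
        ("NM_000002", "chr2", 200, 900),
        ("NR_000003", "chr3", 300, 700),  # non-coding, excluded by the filter
    ],
)

# Counts every row whose accession starts with NM (one row per alignment).
row_total = conn.execute(
    "SELECT COUNT(*) FROM refGene WHERE name LIKE 'NM%'"
).fetchone()[0]

# Counts each NM accession once, however many loci it aligns to.
distinct_total = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM refGene WHERE name LIKE 'NM%'"
).fetchone()[0]

# Lists the accessions that appear on more than one row.
duplicated = conn.execute(
    "SELECT name, COUNT(*) FROM refGene WHERE name LIKE 'NM%' "
    "GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

print(row_total, distinct_total, duplicated)
```

The three SQL statements should work unchanged in the mysql client against hg18. If the distinct count is still higher than the paper's figure, the remaining gap may come from refGene having been updated since the paper was written, but that is a guess.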