Question: How To Get The ncRNA Loci From UCSC Genome Browser Downloads
8.6 years ago by
United States
ashis.csedu80 wrote:

Hi, I am looking for ncRNA loci after reading a paper whose authors state that they collected the ncRNA loci from the UCSC Genome Browser. I have looked in the Downloads section and in the Table Browser but could not find them. Can you help me find them for a sample species, for example mouse (mm9)? Thanks.

8.6 years ago by
Berlin, Germany
Vikas Bansal (2.4k) wrote:

Hi. From the NCBI RefSeq accession key:

NR_123456    RNA    Mixed    Non-coding transcripts including structural RNAs, transcribed pseudogenes, and others.

So IDs starting with NR_ are non-coding transcripts.

Here is an answer from UCSC:

> The information you are requesting can be retrieved using the Table
> Browser. Click on "Tables" on the blue bar on the top of the main page
> and make the following selections:
> clade: Mammal
> genome: Human
> assembly: hg19
> group: Genes and Gene Prediction Tracks
> track: RefSeq Genes
> table: refGene
> region: genome
> filter: click 'create' then in the 'name DOES match' field type: NR_*
> output format: select fields from primary and related tables
> output file: if you would rather have the results saved to a file
> instead of displaying in the browser window, enter the name you would
> like the output file to have, otherwise, leave blank
> file type returned: plain text
> Click "get output". From here, select the following: name, chrom,
> strand. Click "get output". This will provide a list of non-coding RNA
> names, positions, and strands. To retrieve sequences, go back to the
> table browser and set the same settings, except set 'output format' to
> 'sequence'.

Just change the genome and assembly to the ones you need (for example, Mouse and mm9).
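Once the Table Browser output is saved to a file, a few lines of Python can read it back. This is only a minimal sketch: it assumes tab-separated text with the three selected fields in the order name, chrom, strand, and a header line starting with `#`. The function name `parse_ncrna_loci` and the sample rows are made up for illustration.

```python
def parse_ncrna_loci(table_text):
    """Return (name, chrom, strand) tuples for NR_ rows of tab-separated text."""
    loci = []
    for line in table_text.splitlines():
        # Skip blank lines and the '#'-prefixed header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        name, chrom, strand = fields[:3]
        # Re-check the prefix in case the file was exported without the filter.
        if name.startswith("NR_"):
            loci.append((name, chrom, strand))
    return loci


# Made-up rows for illustration only; these are not real annotations.
sample = (
    "#name\tchrom\tstrand\n"
    "NR_000001\tchr1\t+\n"
    "NM_000002\tchr2\t-\n"
    "NR_000003\tchrX\t-\n"
)
print(parse_ncrna_loci(sample))
```

Re-checking the NR_ prefix in the script is redundant if the filter was applied in the browser, but it makes the script safe to run on an unfiltered refGene export too.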

I didn't know about the NR_ prefix. Thanks. That makes it easy to use SQL:

mysql --user genome --host  -AD mm9 -e \
     "select chrom, txStart, txEnd, name, strand from refGene WHERE name like 'NR_%'"

A very thorough workflow for getting ncRNAs! Thanks, Vikas Bansal.
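If the goal is a BED file of the loci, the five columns from the SQL query above (chrom, txStart, txEnd, name, strand) map almost directly onto BED6. The sketch below assumes the mysql output was redirected to a file or pipe, so it is tab-separated with a header row, and it fills the BED score column with a placeholder `0`. The function name `to_bed6` and the sample rows are hypothetical.

```python
def to_bed6(mysql_output):
    """Convert 'chrom txStart txEnd name strand' rows into BED6 lines."""
    bed_lines = []
    for line in mysql_output.splitlines():
        fields = line.split("\t")
        # Skip malformed lines and the column-name header row.
        if len(fields) != 5 or fields[0] == "chrom":
            continue
        chrom, tx_start, tx_end, name, strand = fields
        # BED6 order: chrom, start, end, name, score, strand (score is a placeholder).
        bed_lines.append("\t".join([chrom, tx_start, tx_end, name, "0", strand]))
    return bed_lines


# Made-up rows for illustration only; these are not real annotations.
sample_output = (
    "chrom\ttxStart\ttxEnd\tname\tstrand\n"
    "chr1\t100\t200\tNR_000001\t+\n"
    "chr2\t500\t900\tNR_000003\t-\n"
)
for bed_line in to_bed6(sample_output):
    print(bed_line)
```

No coordinate shift is applied because UCSC database tables store starts 0-based and ends exclusive, which is the same convention BED uses.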

Hi, Vikas Bansal. The "NR" prefix marks non-coding genes in the RefSeq gene annotation. What about the UCSC Known Genes database? Which IDs mark non-coding RNAs there?

8.6 years ago by
Chapel Hill, NC
Wjeck (480) wrote:

Not sure if this will work, but you could try using the CDS start and CDS stop fields. I believe that for noncoding RNAs they are both equal to the transcription start field, so CDS start == CDS stop, whereas for coding genes CDS start != CDS stop.
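The CDS check suggested above could be sketched in Python like this. It assumes each row is a dict keyed by the UCSC column names `name`, `cdsStart` and `cdsEnd`, and that an empty CDS really does mean non-coding, which is the untested assumption from this answer. The function name `split_by_coding` and the sample IDs are made up.

```python
def split_by_coding(rows):
    """Split transcript names into (coding, noncoding) by comparing CDS bounds."""
    coding, noncoding = [], []
    for row in rows:
        # An empty CDS (start equal to end) is taken to mean non-coding.
        if int(row["cdsStart"]) == int(row["cdsEnd"]):
            noncoding.append(row["name"])
        else:
            coding.append(row["name"])
    return coding, noncoding


# Made-up rows for illustration only; these are not real annotations.
sample_rows = [
    {"name": "tx_a", "cdsStart": "1000", "cdsEnd": "1000"},
    {"name": "tx_b", "cdsStart": "2100", "cdsEnd": "2900"},
    {"name": "tx_c", "cdsStart": "500", "cdsEnd": "500"},
]
coding, noncoding = split_by_coding(sample_rows)
print("coding:", coding)
print("noncoding:", noncoding)
```

The comparison only tests whether the two CDS fields are equal, so it does not depend on which transcript boundary they happen to be set to.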

Thanks Wjeck. I got that by looking at the CDS positions. Really helps.

