Number of pseudogenes in NCBI GRCh37
9.7 years ago
Joey ▴ 430

I downloaded the NCBI GRCh37 gene list from https://bitbucket.org/baderlab/fast/downloads.

What is the best way to find out how many of the genes in this list are pseudogenes? I tried BioMart, but I wasn't able to do a full merge.

Thanks,

-Joey

genes pseudogenes • 2.7k views
9.7 years ago

I can't help you with your list, but if you just need a rough idea, you can use the UCSC Table Browser:

group: Genes and Gene Prediction Tracks
track: GENCODE Genes v12
table: Pseudogenes (wgEncodeGencodePseudoGeneV12)

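Alternatively, query the public UCSC MySQL server directly (GRCh37 corresponds to the hg19 database):

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select distinct name, name2 from wgEncodeGencodePseudoGeneV19'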
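
To go from the query above to a count for your own list, one option is to intersect your gene symbols with the name2 column. The following is only a minimal sketch: it assumes the query output was saved as tab-separated text with name and name2 columns, that your list contains gene symbols comparable to name2 (NCBI and GENCODE symbols may not always agree), and the function names and example data are hypothetical.

```python
import csv
import io


def load_pseudogene_symbols(tsv_text):
    """Collect gene symbols (name2 column) from tab-separated 'name<TAB>name2' rows."""
    symbols = set()
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # Skip blank lines and the header row that mysql prints in batch mode.
        if len(row) < 2 or row[0] == "name":
            continue
        symbols.add(row[1])
    return symbols


def pseudogenes_in_list(gene_symbols, pseudogene_symbols):
    """Return the symbols from the user's list that are annotated as pseudogenes."""
    return sorted(set(gene_symbols) & pseudogene_symbols)


# Made-up rows standing in for the real query output (not real identifiers).
example_tsv = (
    "name\tname2\n"
    "ENST0000001.1\tFAKEP1\n"
    "ENST0000002.1\tFAKEP2\n"
    "ENST0000003.1\tFAKEP1\n"
)
example_list = ["FAKEP1", "REALGENE1"]

hits = pseudogenes_in_list(example_list, load_pseudogene_symbols(example_tsv))
print(len(hits), hits)
```

Matching on symbols is the simplest approach; if your list also carries stable identifiers, joining on those would likely be more robust.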

Let me slightly update this answer with more recent information: GENCODE V31 (July 2019, see http://genome.ucsc.edu/goldenpath/newsarch.html ) contains 18,536 annotated pseudogenes (not including polymorphic pseudogenes, according to the documentation).

  • group: Genes and Gene Prediction Tracks
  • track: GENCODE V31lift37
  • table: Pseudogenes (wgEncodeGencodePseudoGeneV31lift37)
  • output format: select fields from primary and related tables

Specify an output file name and tick chrom, cdsStart, cdsEnd and name2 in the field selection to get the desired .bed file.
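
Once the file is exported, a short script can count the entries. This is a rough sketch, assuming the export is tab-separated with the four columns in the order chrom, cdsStart, cdsEnd, name2 and that any header line starts with "#"; the function name and the example rows are made up for illustration.

```python
from collections import defaultdict


def summarize_pseudogenes(lines):
    """Return (number of distinct name2 values, {chrom: distinct name2 count})."""
    per_chrom = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        # Skip empty lines and "#"-prefixed header lines.
        if not line or line.startswith("#"):
            continue
        chrom, _start, _end, name2 = line.split("\t")[:4]
        per_chrom[chrom].add(name2)
    all_names = set().union(*per_chrom.values()) if per_chrom else set()
    return len(all_names), {c: len(names) for c, names in per_chrom.items()}


# Made-up rows standing in for the exported file (not real annotations).
example = [
    "#chrom\tcdsStart\tcdsEnd\tname2",
    "chr1\t100\t100\tFAKEP1",
    "chr1\t500\t500\tFAKEP1",
    "chr2\t900\t900\tFAKEP2",
]
total, by_chrom = summarize_pseudogenes(example)
print(total, by_chrom)
```

With a real export, the same function can be given an open file handle instead of the example list. Counting distinct name2 values rather than rows avoids counting a gene twice when it has several transcripts.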
