Question: gene coordinates from TxDb.Hsapiens.UCSC.hg19.knownGene do not match NCBI gene bank
6 months ago, yliueagle (220, France) wrote:

I retrieved gene coordinates from TxDb.Hsapiens.UCSC.hg19.knownGene using:

library(TxDb.Hsapiens.UCSC.hg19.knownGene)
gene_info = genes(TxDb.Hsapiens.UCSC.hg19.knownGene)

But I found that for some genes, for example PTPN20 (Entrez ID: 26095), the coordinates are hugely different from those in NCBI Gene. Is this an error in the annotation, or am I doing something wrong?

subset(gene_info, gene_id=='26095')

GRanges object with 1 range and 1 metadata column:
        seqnames            ranges strand |     gene_id
           <Rle>         <IRanges>  <Rle> | <character>
  26095    chr10 46550123-48827924      - |       26095

The gene coordinates I got from NCBI Gene: https://www.ncbi.nlm.nih.gov/gene/?term=26095


It looks like this gene is annotated in two places in the hg19 build (see http://uswest.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000204179;r=10:46911396-47002488 and look under other assemblies). This seems to have been resolved in hg38, where only one copy is annotated.

https://www.ncbi.nlm.nih.gov/gene/26095
https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:23423
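
One way to check this from R is to look at the individual transcripts behind the gene-level range. As far as I know, genes() reports a single range per gene that spans all of its transcripts, so two annotated copies far apart on the same strand would show up as one very wide range. A minimal sketch, assuming the hg19 knownGene package is installed; the variable names (txdb, tx, loci, gene_span) are only illustrative:

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# All knownGene transcripts assigned to Entrez gene 26095 (PTPN20)
tx <- transcripts(txdb,
                  filter  = list(gene_id = "26095"),
                  columns = c("tx_name", "gene_id"))
print(tx)

# Merge overlapping transcripts into distinct clusters (candidate loci)
loci <- reduce(tx)
print(loci)

# Envelope of all transcripts, for comparison with the gene-level range
print(range(tx))

# The single gene-level range reported by genes() for the same ID
gene_span <- genes(txdb, filter = list(gene_id = "26095"))
print(gene_span)
```

If reduce() returns more than one well-separated cluster, the wide range from genes() is the envelope of those clusters rather than a single locus. The same calls could be run against TxDb.Hsapiens.UCSC.hg38.knownGene (if installed) to compare with the hg38 annotation.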

written 6 months ago by genomax (80k), modified 6 months ago

Adding to this, see hg19:

[image not preserved]

written 6 months ago by ATpoint (32k)