This is my first post, so don't hesitate to tell me if my explanations are not clear enough.
I would like to annotate an Illumina SNP file, and for that I need to compare it to an annotated human genome file on the GRCh37 build (I don't care about the patch; only the build matters).
To make the comparison efficient, I need several pieces of information from the human genome file.
I need at least:
- HGNC symbol
- start gene position (bp)
- end gene position (bp)
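To make the intended comparison concrete, here is a minimal Python sketch of what I have in mind: each SNP is annotated with every gene whose start/end interval contains its position. The function names (`build_index`, `annotate_snp`), the SNP IDs, the `GENE_A`/`GENE_B` symbols and all coordinates are placeholders I made up for illustration; they are not real GRCh37 positions.

```python
from collections import defaultdict


def build_index(genes):
    """Group (chrom, start, end, symbol) gene records by chromosome."""
    index = defaultdict(list)
    for chrom, start, end, symbol in genes:
        index[chrom].append((start, end, symbol))
    return index


def annotate_snp(index, chrom, pos):
    """Return the symbols of all genes whose [start, end] interval contains pos."""
    return [
        symbol
        for start, end, symbol in index.get(chrom, [])
        if start <= pos <= end
    ]


# Placeholder gene table: coordinates are invented, NOT real GRCh37 positions.
genes = [
    ("1", 1000, 5000, "GENE_A"),
    ("1", 4000, 9000, "LOC100287633"),
    ("2", 200, 800, "GENE_B"),
]

# Placeholder SNPs as (id, chromosome, position in bp).
snps = [
    ("rs_example1", "1", 4500),
    ("rs_example2", "1", 9500),
    ("rs_example3", "2", 200),
]

index = build_index(genes)
annotations = {
    snp_id: annotate_snp(index, chrom, pos) for snp_id, chrom, pos in snps
}
print(annotations)
```

This linear scan is only meant to show the logic; for a full genome-wide file, an interval tree or a sorted list with binary search would probably be a better fit.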
Getting this information is not really a problem: I found it in UCSC and BioMart.
But I have a problem with NCBI symbols starting with LOC (e.g. LOC100287633, LOC100128613, etc.).
I compared the NCBI and UCSC data: I can find every LOC symbol in NCBI, but not in UCSC or BioMart.
I know that many LOC symbols are "discontinued" or no longer updated; however, plenty of them are still listed as reviewed in NCBI yet cannot be found in BioMart, UCSC, or other databases.
I could download them from NCBI, but their start and end positions (bp) have been updated to GRCh38, and I absolutely need the GRCh37 positions.
So my question is: do you know of a web or FTP link where I can download all of this information in a single file, or at least the LOC information on the GRCh37 build?
Thanks for your answers!