Question: Database to map gene ID to the chromosome where it is located?
1
4.6 years ago by
Switzerland/Lausanne/EPFL
Andrei Kucharavy10 wrote:

I am looking for a database, preferably with a .csv or .txt dump, that can map gene identifiers (name, gene ID, UniProt IDs, EMBL accession numbers, ...) to the chromosomes they are assigned to in their specific organism.

This information is usually shown in the web view of UniProt protein entries, but as far as I know it is absent from the original .txt data dump.

gene • 2.0k views
modified 4.6 years ago • written 4.6 years ago by Andrei Kucharavy10
1
4.6 years ago by
Alex Reynolds29k
Seattle, WA USA
Alex Reynolds29k wrote:

One way to do this is to download the gene annotation archive from your source of choice with wget or curl, filter it for gene records with awk, convert the result to BED with convert2bed (part of BEDOPS), and then use awk again to print the fourth and first columns (the ID and chromosome values), e.g.:

$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_21/gencode.v21.annotation.gff3.gz \
    | gunzip --stdout - \
    | awk '$3=="gene"' - \
    | convert2bed -i gff - \
    | awk '{print $4"\t"$1;}' - \
    > gene_id_and_chromosome.txt
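
If convert2bed is not installed, the same two columns can be pulled straight from the GFF3 text. The following is only a minimal sketch: the function name `gff3_gene_to_chrom` is made up for illustration, and it assumes the gene identifier is stored as an `ID=` key in the ninth (attributes) column, which may not hold for every annotation source. The sample record is invented, not real annotation data.

```shell
# Hypothetical helper (not part of any tool): read GFF3 on stdin and
# print "gene_id<TAB>chromosome" for every feature of type "gene".
gff3_gene_to_chrom() {
    awk -F'\t' '
        /^#/ { next }                      # skip header and comment lines
        $3 == "gene" {
            id = ""
            n = split($9, attrs, ";")      # column 9 holds key=value pairs
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^ID=/) {   # assumption: the gene ID is under "ID="
                    id = substr(attrs[i], 4)
                }
            }
            if (id != "") print id "\t" $1 # column 1 is the chromosome name
        }
    '
}

# Invented one-line sample so the sketch can be tried without a download.
printf 'chrI\tsrc\tgene\t1\t100\t.\t+\t.\tID=geneA;Name=a\n' \
    | gff3_gene_to_chrom
```

With a real archive, the function would take the place of the last three stages of the pipeline above, e.g. `... | gunzip --stdout - | gff3_gene_to_chrom > gene_id_and_chromosome.txt`.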
modified 4.6 years ago • written 4.6 years ago by Alex Reynolds29k

Thank you for your answer! I see that the same resource makes it possible to do the same thing for mouse. Is there a way to retrieve the mapping for Saccharomyces cerevisiae?

written 4.6 years ago by Andrei Kucharavy10
1

Take a look at the GFF or GTF files in the archives here, maybe this will help: http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/?C=M;O=D

modified 4.6 years ago • written 4.6 years ago by Alex Reynolds29k
1

Andrei! Hello from Seattle! I'm looking for a way to find the chromosome location for the entries in the uniprot.dat file. Any chance you know where I can find that?

written 4.6 years ago by summerela110

Hello Summer, hope you are doing well there! See my answer; I hope it helps.

written 4.6 years ago by Andrei Kucharavy10
0
4.6 years ago by
TriS4.0k
United States, Buffalo
TriS4.0k wrote:

bioDBnet does what you want

written 4.6 years ago by TriS4.0k

Could you be a little bit more explicit?

 

written 4.6 years ago by Andrei Kucharavy10
0
4.6 years ago by
Switzerland/Lausanne/EPFL
Andrei Kucharavy10 wrote:

It seems that UniProt has "Proteomes" objects, some of which actually map to chromosomes.

In the UniProt flat file, this is available in the "DR   EMBL; BKXXXXX; ...; ...; ..." field (BKXXXXX for yeast, CMXXXX for humans).

A mapping from the BKXXX (CMXXX) references to chromosomes seems to be obtainable manually from the UniProt proteome links.

It seems that the mapping is also readily available in Bioconductor, among its gene location mapping tools.
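
To illustrate the flat-file route, here is a rough sketch that lists the EMBL cross-references per entry. The function name `uniprot_embl_xrefs` is invented for this example, and it assumes the usual flat-file layout: an `AC` line carrying the primary accession first, `DR   EMBL; ...` cross-reference lines, and `//` closing each entry. The accessions in the sample are placeholders, not real records.

```shell
# Hypothetical helper (not a UniProt tool): read a UniProt flat file on stdin
# and print "primary_accession<TAB>EMBL_accession" for each DR EMBL line.
uniprot_embl_xrefs() {
    awk '
        $1 == "AC" && acc == "" {          # first AC line: keep the primary accession
            acc = $2
            sub(/;$/, "", acc)
        }
        $1 == "DR" && $2 == "EMBL;" {      # EMBL cross-reference line
            embl = $3
            sub(/;$/, "", embl)
            print acc "\t" embl
        }
        $1 == "//" { acc = "" }            # end of entry: reset for the next one
    '
}

# Placeholder entry (made-up accessions) to show the expected input shape.
printf 'AC   P00000;\nDR   EMBL; BK000001; DAA00001.1; -; Genomic_DNA.\n//\n' \
    | uniprot_embl_xrefs
```

The output still has to be joined against an accession-to-chromosome table, which, as noted above, seems to need assembling by hand from the proteome pages.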

written 4.6 years ago by Andrei Kucharavy10