Getting UniProt IDs for a set of genes
2.1 years ago
pooryamb • 0

Dear all,

I am working on a set of genes and want to get the UniProt ID of each one so that I can get its structure from the AlphaFold website. Each of my genes appears to be registered in UniProt twice; for instance, see the linked entry for one of my genes. Even though both entries are identical, only one of them has a predicted AlphaFold structure. For each gene, I want the UniProt ID that has an AlphaFold structure. How can I do this? I have a long list of genes, so I can't do it manually.

UniprotKB ID_mapping • 1.2k views

Have you tried approaching this from the other direction? AlphaFold's website hosts organism-specific sets of predicted structures at https://alphafold.ebi.ac.uk/download. I am guessing the species you are working with is Leishmania infantum, which is listed there. If I understand correctly, the compressed file for a species contains a PDB file and an mmCIF file per UniProt ID, i.e. every UniProt ID of that species that has an AlphaFold structure will appear in it. If you only want the IDs, you can use a simple "grep" command to extract them.
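A rough Python equivalent of the "grep" idea, in case it helps. This is only a sketch: it assumes the files in the download are named like AF-<accession>-F<fragment>-model_v<version>.pdb.gz (check your own archive), and the function names are made up for illustration.

```python
import re
import tarfile

# Assumed file-name pattern for AlphaFold DB models, e.g.
# AF-A0A6L0XNC3-F1-model_v2.pdb.gz (verify against your download).
AF_NAME = re.compile(r"AF-([A-Z0-9]+)-F\d+-model_v\d+\.(?:pdb|cif)(?:\.gz)?$")


def accessions_from_names(names):
    """Return sorted, de-duplicated UniProt accessions found in file names."""
    found = set()
    for name in names:
        match = AF_NAME.search(name)
        if match:
            found.add(match.group(1))
    return sorted(found)


def accessions_from_tar(tar_path):
    """List member names of a proteome tar archive and extract accessions."""
    with tarfile.open(tar_path) as archive:
        return accessions_from_names(archive.getnames())
```

Calling accessions_from_tar() on the downloaded proteome archive should give the list of UniProt IDs that have a model, without unpacking anything.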


Thanks for your reply. You gave me a great clue. Actually, I have an alternative ID for each gene, and I want to look up the genes by the IDs I have at hand. However, the PDB and CIF files don't contain my IDs; they only show the UniProt ID. I can extract the protein sequences from the .pdb files and create a FASTA file whose sequence names are the UniProt IDs of the proteins in AlphaFold. Then, by BLASTing the FASTA file containing my gene IDs against the created FASTA file, I can find the mapping.
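The sequence-extraction step could be sketched roughly like this in plain Python. It assumes standard fixed-column PDB ATOM records and a single chain "A", reads one residue per C-alpha atom, and uses hypothetical function names; the BLAST step itself is not shown.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb_lines(lines, chain="A"):
    """Build a one-letter sequence from the C-alpha ATOM records of one chain."""
    residues = []
    seen = set()
    for line in lines:
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        # Fixed PDB columns: atom name 13-16, residue name 18-20, chain 22.
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        key = line[22:27]  # residue number plus insertion code
        if key in seen:
            continue
        seen.add(key)
        # Unknown residue names are written as "X".
        residues.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(residues)


def fasta_record(accession, sequence, width=60):
    """Format one FASTA record, using the UniProt accession as the name."""
    wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + accession + "\n" + "\n".join(wrapped) + "\n"
```

Writing one fasta_record() per model file gives the FASTA database to BLAST against. For .pdb.gz files, the lines can be read with gzip.open(path, "rt").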

2.1 years ago

FYI UniProt will have cross-references to AlphaFoldDB as of release 2022_02, scheduled for May 25th, 2022. For your example gene, LINF_330036600, UniProtKB/TrEMBL entry A0A6L0XNC3 will be linked to AlphaFoldDB.


From my previous visits to some protein entries on UniProt, I was under the impression that some UniProt entries are already linked to AlphaFold. As an example, https://www.uniprot.org/uniprot/P46736 already has an AlphaFold entry attached. Is this what you are referring to, or do you mean this will be expanded?


Currently, the links are added dynamically on the website if a model is available in AlphaFoldDB. However, as it stands now, before release 2022_02 (now more likely to be published in June), it is not possible to search for UniProtKB entries for which AlphaFold has predicted a structure. As I understand it, this is what the thread author wanted to do.

Once we have these explicit cross-references in the UniProtKB data, searchable on the website (and not only displayed dynamically in relevant entries), this will be possible, just as it is now possible for other databases, e.g. PDB: https://www.uniprot.org/uniprot/?query=database%3A%28type%3Apdb%29
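For anyone scripting this, the PDB query URL above can be rebuilt in Python as shown below. This is a sketch only: the function name is made up, and the database type label to use for AlphaFoldDB once the cross-references exist is not stated in this thread, so it would need to be checked against the website.

```python
from urllib.parse import quote

# Search URL prefix taken from the PDB example in this thread.
SEARCH_PREFIX = "https://www.uniprot.org/uniprot/?query="


def xref_query_url(database_type):
    """Build a search URL for entries cross-referenced to a database type."""
    query = f"database:(type:{database_type})"
    # safe="" percent-encodes ':' as well as '(' and ')' in the query.
    return SEARCH_PREFIX + quote(query, safe="")
```

xref_query_url("pdb") reproduces the link above. The same call with the AlphaFoldDB type label (whatever UniProt ends up naming it) should give the corresponding search.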
