Question: How do I get FASTA sequences if I have protein IDs (10,000+)?
sunnykevin97 wrote:

Hi,

I have more than 10,000 protein IDs, and I want to extract the FASTA sequences for all of them from UniProt.

What I have done so far: I have already downloaded all the FASTA sequences for the organism I'm interested in.

How can I do this? Any suggestions are welcome.

uniprot sequence gene
modified 4 days ago by Elisabeth Gasteiger • written 23 days ago by sunnykevin97

Use the BLAST+ preformatted nr database along with the blastdbcmd utility. Use the -entry_batch option to retrieve a large number of accessions in one run.

-entry_batch <File_In>
   Input file for batch processing (Format: one entry per line, seq id 
   followed by optional space-delimited specifier(s)

An example for a single accession is shown below:

$ blastdbcmd -db /path_to/blastv5/nr_v5 -entry Q9I7U4 -outfmt %f
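For the batch case, a minimal sketch of how -entry_batch could be used is given here. It only relies on the blastdbcmd options -db, -entry_batch, -outfmt and -out. The helper names clean_ids and batch_fetch, and the file names in the usage line, are hypothetical and not part of BLAST+. The cleanup step (dropping Windows line endings, blank lines and duplicate IDs) is an assumption about what a typical ID list may need.

```shell
# Sketch only: retrieve many accessions with a single blastdbcmd call.
# clean_ids and batch_fetch are hypothetical helper names.

# Remove carriage returns and blank lines, then drop duplicate entries.
# Whole lines are kept, so optional -entry_batch specifiers survive.
clean_ids() {
    tr -d '\r' < "$1" | awk 'NF' | sort -u
}

# batch_fetch DB IDS_FILE OUT_FILE
# Writes the cleaned ID list to a temporary file and hands it to blastdbcmd.
batch_fetch() {
    db_path=$1
    ids_file=$2
    out_file=$3
    clean_file=$(mktemp) || return 1
    clean_ids "$ids_file" > "$clean_file"
    blastdbcmd -db "$db_path" -entry_batch "$clean_file" -outfmt %f -out "$out_file"
    status=$?
    rm -f "$clean_file"
    return "$status"
}
```

It could be called as `batch_fetch /path_to/blastv5/nr_v5 ids.txt sequences.fasta`. It is worth comparing the number of records in the output with the number of IDs in the input, since accessions that are absent from the database will not be returned.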
modified 23 days ago • written 23 days ago by genomax

I am moving this to a comment, since nr may not contain all UniProt IDs. If UniProt IDs are all you have, this approach would not be sufficient.

written 23 days ago by genomax
JC wrote:

You can fetch them directly from UniProt. If you know the UniProt ID, the FASTA sequence can be retrieved from the URL https://www.uniprot.org/uniprot/{UNIPROT_ID}.fasta

So, if you have a file with the IDs (one per line):

while read -r ID; do wget "https://www.uniprot.org/uniprot/${ID}.fasta"; done < file_with_ids.txt
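With 10,000+ IDs, a one-file-per-ID loop leaves thousands of small files and no record of which downloads failed. A rough sketch of a variant that appends everything to one FASTA file, logs failed IDs, and pauses between requests is shown here. The function names uniprot_url and fetch_all, the ".failed" log file, and the one-second default delay are all assumptions made for illustration. The URL pattern is the one given above and is assumed to still resolve.

```shell
# Sketch only: download one FASTA record per ID into a single output file.
# uniprot_url and fetch_all are hypothetical helper names.

# Build the per-entry FASTA URL from an ID.
uniprot_url() {
    printf 'https://www.uniprot.org/uniprot/%s.fasta\n' "$1"
}

# fetch_all IDS_FILE OUT_FILE [DELAY_SECONDS]
# IDs whose download fails are appended to OUT_FILE.failed.
fetch_all() {
    ids_file=$1
    out_file=$2
    delay=${3:-1}          # pause between requests (assumed default: 1 s)
    : > "$out_file"        # start with an empty output file
    # Strip carriage returns; also handle a last line without a newline.
    tr -d '\r' < "$ids_file" | while IFS= read -r id || [ -n "$id" ]; do
        [ -z "$id" ] && continue    # skip blank lines
        if ! wget -q -O - "$(uniprot_url "$id")" >> "$out_file"; then
            printf '%s\n' "$id" >> "$out_file.failed"
        fi
        sleep "$delay"
    done
}
```

It could be run as `fetch_all file_with_ids.txt all_sequences.fasta`. At one request per second, 10,000 IDs take close to three hours, so for lists of this size a batch service is likely the better choice.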

written 23 days ago by JC
Elisabeth Gasteiger wrote:

You can upload your list of identifiers to the UniProt batch retrieval tool at https://www.uniprot.org/uploadlists and download the matching entries in FASTA format. Please don't hesitate to contact the UniProt helpdesk if you have any additional questions.

written 4 days ago by Elisabeth Gasteiger