Question: How do I get FASTA sequences for a list of protein IDs (tens of thousands)?
sunnykevin97 wrote, 7 months ago:

Hi,

I have more than 10,000 protein IDs, and I want to extract the FASTA sequences for all of them from UniProt.

What I have done so far: I have already downloaded all the FASTA sequences for the organism I'm interested in.

How can I do this? Any suggestions are welcome.


Use the BLAST+ preformatted nr database together with the blastdbcmd utility. The -entry_batch option handles a large number of accessions in one run.

-entry_batch <File_In>
   Input file for batch processing (Format: one entry per line, seq id 
   followed by optional space-delimited specifier(s)

An example for a single accession:

$ blastdbcmd -db /path_to/blastv5/nr_v5 -entry Q9I7U4 -outfmt %f
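
For a whole list, the -entry_batch option quoted above could be used roughly as follows. This is a minimal sketch, not a tested recipe for any particular database: the function name fetch_batch and the file names ids.txt and hits.fasta are made up for illustration, and the database path is the same placeholder as in the single-accession example.

```shell
#!/usr/bin/env bash
# Sketch: batch retrieval with blastdbcmd. All paths and file names are placeholders.

# fetch_batch DB ID_FILE OUT_FASTA   (hypothetical helper name)
#   DB         path to a BLAST protein database (e.g. the nr_v5 path above)
#   ID_FILE    text file with one accession per line
#   OUT_FASTA  file that receives the FASTA records
fetch_batch() {
    local db=$1 ids=$2 out=$3
    blastdbcmd -db "$db" -entry_batch "$ids" -outfmt %f -out "$out"
}

# Example call (commented out so that sourcing this file runs nothing):
# fetch_batch /path_to/blastv5/nr_v5 ids.txt hits.fasta
```

Afterwards it is worth comparing the number of header lines in the output (grep -c '^>' hits.fasta) with the number of IDs requested, since accessions that are absent from the database will not be returned.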
Reply written 7 months ago by genomax

Moving this to a comment, since nr may not contain all UniProt IDs. If UniProt IDs are all you have, this approach would not be sufficient.

Reply written 7 months ago by genomax
JC (Mexico) wrote, 7 months ago:

You can fetch them directly from UniProt. If you know the UniProt ID, the FASTA sequence can be retrieved from the URL https://www.uniprot.org/uniprot/{UNIPROT_ID}.fasta

So, if you have a file with the IDs (one per line):

for ID in $(cat file_with_ids.txt); do wget https://www.uniprot.org/uniprot/$ID.fasta; done
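
Because the question mentions that the organism's complete FASTA file is already on disk, an alternative to 10,000 separate requests is to filter that local file by the ID list. The sketch below assumes UniProt-style headers of the form >sp|ACCESSION|NAME ..., where the accession is the second |-delimited field; the function name filter_fasta and the file names in the example call are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: keep only the FASTA records whose accession appears in an ID list.
# Assumes UniProt-style headers such as ">sp|Q9I7U4|NAME description".

# filter_fasta ID_FILE FASTA_FILE   (hypothetical helper name)
filter_fasta() {
    awk -v idfile="$1" '
        BEGIN {
            # Load the wanted accessions, one per line; strip whitespace and CRs.
            while ((getline line < idfile) > 0) {
                gsub(/[ \t\r]/, "", line)
                if (line != "") want[line] = 1
            }
            close(idfile)
        }
        # On each header line, decide whether the record that follows is wanted.
        /^>/ { split($0, h, "|"); keep = (h[2] in want) }
        # Print header and sequence lines while the current record is wanted.
        keep
    ' "$2"
}

# Example call (commented out so that sourcing this file runs nothing):
# filter_fasta file_with_ids.txt organism.fasta > selected.fasta
```

IDs that are missing from the local file produce no output, so counting the headers in the result (grep -c '^>') against the length of the ID list shows whether anything still has to be fetched from UniProt.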

Elisabeth Gasteiger (Geneva) wrote, 7 months ago:

The UniProt batch retrieval tool at https://www.uniprot.org/uploadlists accepts an uploaded list of identifiers. Please don't hesitate to contact the UniProt helpdesk if you have any additional questions.
