Is there any way to extract multiple protein sequences reported in a published paper using its PMID, DOI, or supplementary files?
It's unlikely you will be able to go directly from a paper DOI to a genetic sequence. If the paper lists the databases they uploaded the data to, with accession numbers etc, then it might be possible, but we'd need more information about what the paper says exactly.
Yes, some papers mention accession numbers, but others do not; they report only the number of proteins they identified in genome-wide studies of a specific plant species. That's why I am looking for a program that can download the sequences using the title, PMID, or DOI.
Caveat: This is likely not going to work for most papers. But if you have the right PMID then you could do the following.
$ esearch -db pubmed -query 22753475 | elink -target nuccore | elink -target protein | efetch -format fasta | grep ">" | head -10
>NP_001292578.1 uncharacterized protein LOC103503105 [Cucumis melo]
>NP_001284396.1 uncharacterized LOC103502119 [Cucumis melo]
>NP_001284656.1 Transcription factor HY5-like [Cucumis melo]
>NP_001284432.1 ABSCISIC ACID-INSENSITIVE 5-like protein 2-like [Cucumis melo]
>NP_001284448.1 Sodium/hydrogen exchanger 2-like [Cucumis melo]
>NP_001284444.1 TMV resistance protein N-like [Cucumis melo]
>NP_001284453.1 ethylene receptor 1 [Cucumis melo]
>NP_001284384.1 alpha-farnesene synthase [Cucumis melo]
>NP_001284474.1 profilin [Cucumis melo]
>NP_001284461.1 translationally-controlled tumor protein homolog [Cucumis melo]
First of all, thank you so much for replying again.
So the number '22753475' is the PMID, I guess, but what is the last part, 'grep ">" | head -10', for?
Are we limiting the number of results we want? You got exactly 10 results here.
Also, it has been 10 minutes since I ran this command and it is still running.
22753475 is the PMID. I added the part from grep onwards only to demonstrate that this works: grep ">" keeps just the FASTA header lines and head -10 shows the first 10 of them. You will need to take that part out to save the sequences. Simply redirect the output to a file:
esearch .. blah > seq.fa
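To make the redirect concrete, here is a minimal sketch that wraps the same pipeline shown earlier in the thread, minus the grep and head steps, and writes the result to a file. It assumes NCBI's EDirect tools (esearch, elink, efetch) are installed and on your PATH. The function names fetch_proteins_for_pmid and count_fasta_records are hypothetical helpers made up for this illustration, not part of EDirect. As noted above, this will likely return nothing for papers whose PubMed record has no linked sequence records.

```shell
#!/bin/sh
# Hypothetical helper names; only esearch, elink, and efetch come from the thread.

# Fetch all protein records linked to a PubMed ID and write them as FASTA.
# Usage: fetch_proteins_for_pmid PMID OUTPUT_FILE
fetch_proteins_for_pmid() {
    esearch -db pubmed -query "$1" |
        elink -target nuccore |
        elink -target protein |
        efetch -format fasta > "$2"
}

# Count FASTA records by counting header lines (lines starting with ">").
# Usage: count_fasta_records FASTA_FILE
count_fasta_records() {
    grep -c '^>' "$1"
}
```

For the PMID used in this thread, you would run fetch_proteins_for_pmid 22753475 seq.fa and then count_fasta_records seq.fa to see how many sequences were saved. Without head -10 the download covers every linked protein, so it can take considerably longer than the 10-line demonstration.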