Why is there a difference in the number of RefSeq protein sequences in NCBI?
8.8 years ago
seta ★ 1.9k

Hi everybody,

I downloaded the RefSeq protein sequences (plants only) using this command:

curl -o "plant.#1.protein.faa.gz" "ftp://ftp.ncbi.nlm.nih.gov/refseq/release/plant/plant.[1-90].protein.faa.gz"

All files downloaded completely. The total number of sequences in them is 1973246, while the number of plant RefSeq protein sequences reported at http://www.ncbi.nlm.nih.gov/protein/?term=viridiplantae[org] is 2067967, a difference of 94721 sequences. Could you please let me know what you think explains this?
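For reference, here is a minimal sketch of how the local count can be checked, assuming the downloaded files are gzip-compressed FASTA in which every record starts with a ">" header line. The function names `count_fasta_records` and `list_corrupt_archives` are hypothetical helpers made up for this illustration, not part of any NCBI tool.

```shell
# Hypothetical helper: count FASTA records (header lines starting with ">")
# across one or more gzip-compressed files.
count_fasta_records() {
    gzip -cd -- "$@" | grep -c '^>'
}

# Hypothetical helper: print the name of every file that fails the gzip
# integrity test, which would point to a truncated or corrupted download.
list_corrupt_archives() {
    local f
    for f in "$@"; do
        gzip -t -- "$f" 2>/dev/null || printf '%s\n' "$f"
    done
}
```

Running `count_fasta_records plant.*.protein.faa.gz` in the download directory prints the total across all volumes, and `list_corrupt_archives plant.*.protein.faa.gz` prints nothing when every archive passes the integrity test. If both checks look fine, the gap is probably not caused by an incomplete download.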

Thanks

blast alignment sequence • 1.4k views