Question: Which taxid is the right one for downloading a GI list and building a BLAST database from it?
5.7 years ago by
seta1.4k wrote:

Hi everybody,

I'm trying to download the GI list for the plant taxid so that I can run blastx against only the plant sequences in the nr database. However, I found several plant taxids (green plants, flowering plants, land plants, ...), each with a different number of records. It also seems that some records are shared between different taxids, so I cannot simply download more than one list because of the redundancy. Could you please help me make the right decision? Also, is there a way to count the downloaded GIs to make sure the list was downloaded completely? Sorry if this question is too basic, but it is a challenge for me as a beginner. Thanks.


5.7 years ago by
h.mon32k wrote:

First, to check if your GI list file has the correct number of records:

wc -l gi_list.txt

This should output the same number of records that NCBI reports for that taxid.
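The check above can be wrapped in a small function that does the comparison for you. This is only a sketch: `check_gi_count` is a hypothetical helper name, and it assumes the file holds one GI per line and that you copy the expected count from the NCBI results page yourself.

```shell
# Hypothetical helper (not part of BLAST or any NCBI tool):
# compare the number of GIs in a file with the count shown by NCBI.
# Usage: check_gi_count gi_list.txt EXPECTED_COUNT
check_gi_count() {
    file=$1
    expected=$2
    # Count non-empty lines only, so a trailing blank line is not
    # mistaken for a record. Assumes one GI per line.
    actual=$(grep -c . "$file" || true)
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual records"
    else
        echo "MISMATCH: $actual records in file, $expected expected"
        return 1
    fi
}
```

For example, `check_gi_count gi_list.txt 123456` would report a mismatch if the download was cut short (the number here is a placeholder, not a real NCBI count).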

Second, you do not say which plants you are interested in, so I cannot say which list is the "correct" one. Are you interested in algae? Gymnosperms? Angiosperms? The "correct" list depends on what you want.


Thanks, friend. I would like to download all plant records to build a plant-specific nr database. As I mentioned in the post, there are several plant taxids (green plants, flowering plants, land plants, ...), and some records seem to be shared among them, so downloading the GI lists for all of them would give redundant entries. Could you please let me know whether there is a way to detect identical GIs and remove them?
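One possible way to remove identical GIs, sketched under the assumption that each downloaded file holds one GI per line, is to concatenate the lists and keep each number once with `sort -u`. The function name `merge_gi_lists` and the file names are hypothetical. Note that in the NCBI Taxonomy these groups are nested (flowering plants within land plants within green plants), so the list for the broadest taxid should already contain the others, and merging is mainly a safeguard.

```shell
# Hypothetical helper: merge several GI list files into one list in
# which every GI appears exactly once.
# Usage: merge_gi_lists merged.gi list1.gi list2.gi ...
merge_gi_lists() {
    out=$1
    shift
    # Strip Windows line endings, drop blank lines, then sort
    # numerically and keep only one copy of each GI.
    cat "$@" | tr -d '\r' | grep . | sort -n -u > "$out"
    # Print how many unique GIs ended up in the merged list.
    wc -l < "$out"
}
```

Something like `merge_gi_lists plants.gi green_plants.gi land_plants.gi flowering_plants.gi` would then write the deduplicated list to `plants.gi`. That file could be passed to `blastdb_aliastool` through its `-gilist` option to build a restricted alias of nr; check the BLAST+ manual for your installed version before relying on the exact options.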

written 5.7 years ago by seta1.4k