Which taxid is the right one for downloading a GI list and building a BLAST database from it?
10.3 years ago
seta ★ 1.9k

Hi everybody,

I'm trying to download the GI list for a plant taxid so that I can run blastx against only the plant sequences in the nr database. However, I found several plant taxids (green plants, flowering plants, land plants, ...) with different numbers of records. It also seems that some records are shared between taxids, so I cannot simply download more than one list without introducing redundancy. Could you please help me make the right decision? Also, is there a way to count the downloaded GIs to make sure the list was downloaded completely? Sorry if this post is very basic, but it is a challenge for me as a beginner. Thanks

blast sequencing alignment RNA-Seq • 2.2k views
10.3 years ago
h.mon 35k

First, to check if your GI list file has the correct number of records:

wc -l gi_list.txt

should output the same number as the record count reported by NCBI.

Second, you do not say which plants you are interested in, so I cannot say which is the "correct" list. Are you interested in algae? Gymnosperms? Angiosperms? The "correct" list depends on what you want.
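The line count check above can miscount slightly if the downloaded file contains blank lines or lacks a final newline. A minimal sketch of a more forgiving count, assuming the file holds one numeric GI per line; the function names count_gis and count_unique_gis are hypothetical helpers, not part of any NCBI tool:

```shell
# Hypothetical helpers; assumes one numeric GI per line in the file.

# count_gis FILE: count lines that contain at least one digit,
# so blank lines are ignored and a missing final newline is harmless.
count_gis() {
    grep -c '[0-9]' "$1"
}

# count_unique_gis FILE: count distinct GI lines.
# If this is smaller than count_gis, the file contains repeated GIs.
count_unique_gis() {
    grep '[0-9]' "$1" | sort -u | wc -l | tr -d ' '
}

# Example usage (gi_list.txt as in the command above):
#   count_gis gi_list.txt
#   count_unique_gis gi_list.txt
```

If the first number matches the record count shown by NCBI and both numbers agree, the download is probably complete and free of repeats.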


Thanks, friend. I would like to download all plant records to make a plant-specific nr database. As I mentioned in the post, there are several plant taxids (green plants, flowering plants, land plants, ...), and it seems that some records are shared among them, so downloading the GI lists for all of them would introduce redundancy. Could you please let me know if there is a way to detect identical GIs and remove them?
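In the NCBI taxonomy the groups mentioned here are nested (flowering plants within land plants within green plants), so the GI list for the broadest group should already contain the records of the narrower ones. If several lists are combined anyway, duplicates can be detected and removed with standard Unix tools. A minimal sketch, assuming one numeric GI per line; the function names and the file names in the usage comment are hypothetical:

```shell
# Hypothetical helpers; assumes one numeric GI per line in each file.

# merge_gi_lists OUT IN...: combine several GI lists into OUT,
# dropping blank lines and keeping each GI only once (numeric order).
merge_gi_lists() {
    out=$1
    shift
    grep -h '[0-9]' "$@" | sort -n -u > "$out"
}

# shared_gis A B: print the GIs that occur in both lists.
# Each list is de-duplicated first, so any line that still
# appears twice after pooling must be present in both files.
shared_gis() {
    { sort -u "$1"; sort -u "$2"; } | grep '[0-9]' | sort | uniq -d
}

# Example usage (file names are placeholders):
#   shared_gis green_plants.gi land_plants.gi | wc -l
#   merge_gi_lists all_plants.gi green_plants.gi land_plants.gi
```

If shared_gis reports as many GIs as the smaller list contains, that list is fully covered by the larger one and does not need to be merged at all.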
