What Makes A Valid GI List?
12.3 years ago

Hello all, I cannot figure out what makes a "correct" GI list!

Atm I have a list that I produced using the query below, plus an awk line to retrieve just the second-column values:


blastn -word_size 11 -reward 2 -penalty -3 -gapopen 5 -gapextend 2 -query hiv_reference.fa -db test -outfmt 6 > results.txt

awk '{print $2}' results.txt > giResults.txt

G3ME69S02DME82 G3ME69S02EPWTE G3ME69S02DE0PT G3ME69S02C1ABI G3ME69S02EQ3LK G3ME69S02ELVHG G3ME69S02CY21D G3ME69S02D9A53

So that is what I get from running those two lines, and from what I understand those are GI values, so why can I not do


blastdb_aliastool -gilist giResults.txt -dbtype nucl -db test -out test_subset

I get an error saying that it's not a valid GI list.

Any ideas?

Any input is much appreciated


Maybe this topic will help you (especially my comments ;) ).

http://biostar.stackexchange.com/questions/1196/extracting-sequence-from-a-3gb-fasta-file

12.3 years ago
Neilfws 49k

from what I understand those are GI values

Those are not GI values. A GI is an identifier assigned to a sequence record by the NCBI. It's simply a series of digits such as 12345 (in that case, a protein sequence record).

In a BLAST report, this would look like:

gi|12345|

What you have is some other kind of sequence identifier which is not a GI.
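To make the distinction concrete, here is a minimal sketch, not a definitive recipe. It assumes the subject IDs in column 2 of the `-outfmt 6` output use the legacy `gi|12345|ref|...|` form; the function names `extract_gis` and `check_gi_list` are hypothetical helpers invented for illustration. The first pulls the numeric GI out of such IDs, and the second flags any line in a candidate list that is not purely digits.

```shell
# Hypothetical helper: print the numeric GI from column 2 of BLAST tabular
# output (-outfmt 6). Assumes subject IDs look like gi|12345|ref|...|;
# IDs in any other form (e.g. G3ME69S02DME82) are silently skipped.
extract_gis() {
    awk -F'\t' '{
        n = split($2, parts, "|")
        if (n >= 2 && parts[1] == "gi" && parts[2] ~ /^[0-9]+$/)
            print parts[2]
    }' "$@"
}

# Hypothetical helper: report every non-empty line that is not purely digits.
# Returns a non-zero status if at least one such line is found.
check_gi_list() {
    awk 'NF && $0 !~ /^[0-9]+$/ {
             printf "line %d is not a GI: %s\n", NR, $0
             bad = 1
         }
         END { exit bad + 0 }' "$@"
}
```

For example, `check_gi_list giResults.txt` would flag every line of the list in the question, since none of those IDs is purely numeric.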


Excellent, thank you very much! I'll look into that and report back with how I get on! :D


Yeah, I think they are all sequence IDs, so is there a special type of query you have to do in order to get the GIs back?


Well, where did those sequence IDs originate? When I search for them in Google, this question is the only hit. So I doubt that they are linked to GIs in any publicly-available resource.

12.3 years ago
Iván ▴ 60

I haven't used a gilist in BLAST, but maybe this link will be of help, although it's not exactly for BLAST+. Check out the "Alias file structure" bit especially. Hope it helps!
