Question: Are The Query Strand And Hit Strand From Blastx The Same Strand?
David Maddison (Corvallis, Oregon) wrote, 7.6 years ago:

I'm writing code to parse the XML output from a BLASTX search on NCBI's servers, and from it fetch from NCBI the nucleotide sequence whose translated protein BLASTX reported as a hit. This is working, and I am tracking back from the protein ID to the original nucleotide record ID to do this. But it is possible that my original query sequence is from the other strand, and thus BLASTX may have had to reverse-complement it to find the match. I want to know if that is so, in order to automatically reverse-complement one of them. In discovering the hit, BLASTX knew whether it was a reverse complement or not, but I can't see any hint of it reporting this back to me in the XML, nor can I see how to get that info from other queries. Yes, I could go to the trouble myself of reverse-complementing the hit sequence, trying the three reading frames, translating to amino acids, and seeing whether it matches my original sequence better, but that's what BLASTX has just done, and I would rather get the info from BLASTX and/or the NCBI databases. Can I?

That is, how can I determine whether BLASTX had to reverse-complement my query sequence when it found a hit?

Thanks! David

Tags: xml, blast, strand
Damian Kao (USA) wrote, 7.6 years ago:

There is a query frame tag in the XML file that tells you which frame the hit is in. For example, here is a line from one of my BLAST XML outputs:

<Hsp_query-frame>3</Hsp_query-frame>

Alternatively, you can look at the query start/end and subject start/end coordinates. If your query aligns to the reverse strand of the subject, then the subject end coordinate will be smaller than the start coordinate. For example, if SeqA aligns to the reverse of SeqB, you might see positions 50-100 of SeqA aligning to positions 200-150 of SeqB.

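To make the query frame tag concrete, here is a minimal sketch of pulling `Hsp_query-frame` out of BLAST XML with Python's standard library. The XML fragment is hand-written for illustration and only mimics the nesting of the real output; the accessions `HYPOTHETICAL_1` and `HYPOTHETICAL_2` and the function name `query_frames` are made up, not something from this thread.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment that imitates the nesting of NCBI BLAST XML output.
# The accessions and values are invented for illustration only.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>HYPOTHETICAL_1</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-frame>3</Hsp_query-frame>
            </Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>HYPOTHETICAL_2</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-frame>-2</Hsp_query-frame>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def query_frames(xml_text):
    """Return a list of (hit accession, query frame) pairs, one per HSP."""
    root = ET.fromstring(xml_text)
    frames = []
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        for hsp in hit.iter("Hsp"):
            # The frame is stored as text, e.g. "3" or "-2".
            frames.append((accession, int(hsp.findtext("Hsp_query-frame"))))
    return frames


print(query_frames(SAMPLE_XML))
```

For a real result file you would read the XML text from disk (or from the server response) and pass it to the same function.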

Thanks! The query start/end one makes sense to me, but how does query-frame tell me the direction?

Reply written 7.6 years ago by David Maddison

Alas, the query_from/query_to and hit_from/hit_to scheme doesn't work.

I have an example where BLASTX returns query_from less than query_to AND hit_from less than hit_to, and yet the query sequence is definitely in reverse-complement orientation relative to both the returned protein sequence and the DNA sequence you get by following the eLink from that protein.

Any other thoughts on how to solve this?

Reply written 7.6 years ago by David Maddison

If Hsp_query-frame is positive (1, 2 or 3; plus strand), then the query sequence was matched as supplied. If Hsp_query-frame is negative (-1, -2 or -3; minus strand), then the reverse complement of the query sequence was matched.

Reply written 7.6 years ago by Hamish
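The sign rule above could be applied along these lines. This is only a sketch: the function names `reverse_complement` and `orient_query` are made up for illustration, and the complement table covers just A, C, G, T and N, so other IUPAC ambiguity codes would pass through uncomplemented.

```python
# Complement table for unambiguous bases plus N (an assumption of this
# sketch; other IUPAC ambiguity codes are left unchanged by translate()).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_query(query_seq, query_frame):
    """Return the query in the orientation that BLASTX translated.

    query_frame is the integer from Hsp_query-frame: positive means the
    query was used as supplied, negative means its reverse complement.
    """
    if query_frame == 0:
        raise ValueError("BLASTX query frames are nonzero (1..3 or -1..-3)")
    if query_frame < 0:
        return reverse_complement(query_seq)
    return query_seq


print(orient_query("ATGAAA", 2))
print(orient_query("ATGAAA", -1))
```

Note that this only puts the query on the strand that was translated. Whether the linked nucleotide record carries its coding sequence on its own plus or minus strand is a separate question that the frame value does not answer.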