BLAST scores: two annotations for the same piece of sequence

10.3 years ago

friasoler ▴ 50

Hello everybody!!!

I have a DNA sequence that matches two different proteins, depending on whether I look at the scores or at the sequence identity in BLAST. Which criterion should I trust more? I have designed primers from this sequence to measure the expression of this gene, which is why it is critical for me to know exactly what the sequence matches. Here is the sequence:

GCCGCAGCCCCGCTGCAGACGCGCCGCGTCCCCGCCGGAGAAGGAGCGAGGCCGTTCCCTGCGCATCCTGCAGCAGCATGACTCTTCAGGCTGACTTTGATGGTGCTGCAGAAGATGTAAAAAAaTTAAAAaCaAGACCAACTGATGAAGAACTGAAGGAACTATATGGATTCTACAAACAGGCTACTGTTGGAGATATTAATATTGAATGTCCAGGAATGCTAGATTTGAAAGGCAAAGCCAAATGGGAGGCATGGAACCTGAAAAAAGGTTTATCAAAGGAGGATGCCATGAATGCCTATATCTCTAAAGCAAGAGCAATGGTAGAAAAATATGGAATCTAGAATATTCAAAATAATTCCCACTAATAATTAACTACTCTTCAGTAGCTGATGAACTAACTTGAGAAAAAcGCAGTACTAACTCCTTTTTGTGTAGTCTGACACTAATATCTTTTAAGCATCAGCTGTTTGACTTTAAAGGGTATTTACATATATAATCGATTTTTAGCTTGTATATTAATCTAAATAAATTTGAACTGAATAAATTAAGCTTTATTAAGAATTGTGGATTTTtGTGGGTATTAAATTATATTTAGCATTTTGACAGAAGAAGACAAACAGAAAAGCTCTAACAGTTAAATAACATAGACATGATTTTTTGCAAGCAAGGTTATGGAATAAAGTGAAGAGTTTGTGCATAAGGAAGAGAAGAAGGAAAAGATGAAACCTTTTTtAAGACCCAAAGCCAATGTTTGaTTTTTAAAAAaaTCAGGAAAaCTTCCCCTTATAAAGGATTACAGAGGAGGACCAGAACAACTTTTAGGCATAACTGCATGCAATGTAGAGAAaGAAGTGACTTATTATAAATTGCTGTGGACTAACCTACACATTCTGCCATTAAAaTTGaGGgAAaTaCTCAtAGACTGGCaTTTTcTATGCATGTTGtGATATGTTTTATCAAGAAacTTTCATTAGATGGTTTCAGcAGATAAAAGTGATCTCCAGGAAGgTCATAAAAGGAAACATCtCCaTTTGTtAGTtCTtGCcAaCCTAAAAAaGATATTtGAAGTGTCAGAGAAaC

Thanks in advance

Roberto

alignment • 2.9k views

ADD COMMENT • link updated 3.1 years ago by Ram 45k • written 10.3 years ago by friasoler ▴ 50

I guess that depends on what the score and identity values are. If they are in a gray zone, this is a tricky question, but if not, then:

In Score we trust.

ADD REPLY • link 10.3 years ago by mxs ▴ 530

Bitscore > Evalue > Identity

High identity means nothing by itself because it can be for a very short alignment covering just a tiny proportion of the query sequence. Keep that in mind: BLAST stands for Basic Local Alignment Search Tool, so a hit can be a local match to only part of your query.
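To make the point about short alignments concrete, here is a minimal Python sketch. The function name and all numbers are hypothetical and chosen only for illustration; it simply expresses the identical aligned positions as a fraction of the whole query rather than of the alignment.

```python
def identical_fraction_of_query(percent_identity, alignment_length, query_length):
    """Identical aligned positions expressed as a fraction of the whole query."""
    identical_positions = percent_identity / 100 * alignment_length
    return identical_positions / query_length


# Hypothetical query length and hits, for illustration only.
QUERY_LENGTH = 1000

# A perfect but very short local match.
short_hit = identical_fraction_of_query(100.0, 25, QUERY_LENGTH)

# A slightly less identical match spanning most of the query.
long_hit = identical_fraction_of_query(95.0, 900, QUERY_LENGTH)

print(f"short hit: {short_hit:.3f}, long hit: {long_hit:.3f}")
```

With these made-up numbers, the 100% identity hit accounts for far less of the query than the 95% identity hit, which is why identity is the last criterion in the ranking above.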

ADD REPLY • link 10.3 years ago by 5heikki 11k

As you said: "High identity means nothing by itself because it can be for a very short alignment covering just a tiny proportion of the query sequence". That is why, in the grey zone (E-values between 10^-2 and 10^-4), the two values can give different results and thus pose a challenge for interpretation. In such cases I would always trust the score over the identity. So I don't quite get your reply to my post.

ps

Thumbs up

ADD REPLY • link updated 3.1 years ago by Ram 45k • written 10.3 years ago by mxs ▴ 530

Well, it was really meant as a reply to the OP. Also, I generally wouldn't even consider hits with such a high E-value; 1 in 100, or even 1 in 10,000, isn't very good if your database has millions of sequences.
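The dependence on database size can be sketched with the simplified Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total database length. This is only a rough sketch: BLAST itself applies effective-length corrections that are ignored here, and the function name and all numbers below are hypothetical.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Simplified E-value: E = m * n * 2**(-S'), without BLAST's length corrections."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical values, for illustration only.
BIT_SCORE = 40.0
QUERY_LENGTH = 1_000

# The same alignment scored against a small and a large database.
small_db = expected_chance_hits(BIT_SCORE, QUERY_LENGTH, 1_000_000)
large_db = expected_chance_hits(BIT_SCORE, QUERY_LENGTH, 1_000_000_000)

print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")
```

Under this approximation the E-value grows in proportion to the database size, so a bit score that looks convincing against a small database can be unremarkable against a very large one.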

ADD REPLY • link 10.3 years ago by 5heikki 11k

Thanks for your answers .-)

I have these two extreme alternatives for the alignment:

PREDICTED: Ficedula albicollis acyl-CoA-binding protein-like (LOC101820061), mRNA
Max score   Total score   Query cover   E value   Ident
887         887           46%           0.0       98%

ref|XM_005040820.1| PREDICTED: Ficedula albicollis S-acyl fatty acid synthase thioesterase, medium chain-like (LOC101815966), mRNA
Max score   Total score   Query cover   E value   Ident
239         239           12%           8e-59     99%

If I follow your criteria, should I choose the acyl-CoA-binding protein-like hit?

Thanks
roberto

ADD REPLY • link updated 3.1 years ago by Ram 45k • written 10.3 years ago by friasoler ▴ 50

My guess is that these are multi-domain proteins: with the second hit you're getting a nice hit to one domain, whereas with the first you're getting a hit that covers multiple domains.

ADD REPLY • link 10.3 years ago by 5heikki 11k

I second that. So if you are looking for a gene and not a domain, the first hit should be your choice.
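For reference, the "Bitscore > Evalue > Identity" ordering suggested earlier in the thread can be applied to the two hits reported by the OP with a short Python sketch. It assumes that the "Max score" column is the bit score; the `Hit` class and `rank_key` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    description: str
    bit_score: float         # "Max score" column, assumed to be the bit score
    e_value: float
    percent_identity: float
    query_cover: float       # percent of the query covered by the hit


# Values copied from the two hits posted in the thread.
hits = [
    Hit("S-acyl fatty acid synthase thioesterase, medium chain-like", 239, 8e-59, 99, 12),
    Hit("acyl-CoA-binding protein-like", 887, 0.0, 98, 46),
]


def rank_key(hit):
    # Higher bit score first, then lower E-value, then higher identity.
    return (-hit.bit_score, hit.e_value, -hit.percent_identity)


ranked = sorted(hits, key=rank_key)
best = ranked[0]
print(best.description)
```

Sorted this way, the acyl-CoA-binding protein-like hit comes first despite its slightly lower identity, which agrees with the advice given in the replies.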