Qcov Vs QcovHSP
3.8 years ago
Seigfried ▴ 80

I thought I understood the difference between qcovs and qcovhsp.

I have a "reference" sequence 100bp long. I converted it into a custom blast database Reference_DB. So this database contains only 1 fasta sequence.

I ran BLAST using this command

blastn -db Reference_DB -query Query.fa -out Query_out.txt -outfmt "6 qseqid sseqid pident evalue qcovs qcovhsp qlen" -max_target_seqs 1 -word_size 5

I wanted to BLAST my reads against this reference. My query sequences are 100 or 151 bp long, so they are equal to or longer than the reference. I therefore assume that for 151 bp sequences, even with a full-length 100% alignment, my query cover cannot exceed 100/151, or about 66%. But I still see sequences with 73%+ Qcov.

Query   Sequence    pIdent  eValue  Qcov    QCovHSP Qlen
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB    100 0.51    91  7   100
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB    84.615  1.8 91  12  100
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB    90  1.8 91  9   100
K00151:489:H5KKTBBXY:1:2118:10500:31576 Reference_DB    91.667  0.04    71  12  100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB    100 0.14    79  8   100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB    100 0.51    79  7   100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB    81.25   1.8 79  16  100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB    83.333  6.6 79  11  100
K00151:489:H5KKTBBXY:1:2118:10795:20304 Reference_DB    100 0.14    77  8   100

1) Why does my query coverage exceed 66% for the 151 bp sequences? It doesn't go above 80%, but I still see alignments with 73-74% Qcov (higher than 66%). How is this possible?

2) I used -max_target_seqs 1 and I am getting multiple hits for the same FASTA record. Is it correct to assume that max_target_seqs will just report all the HSPs it gets, restricted to that one sequence (my reference only has 1 sequence anyway)? So what do the multiple entries for the same FASTA record mean? I see that the pIdent (for the HSP) is 100% in the first row, but the qcovhsp is just 7%.

Thanks for your help

BLAST • 2.8k views

The command you used is not showing qcovs and qcovhsp for me. What should I do now?

blastn -db Reference_DB -query Query.fa -out Query_out.txt -outfmt "6 qseqid sseqid pident evalue qcovs qcovhsp qlen" -max_target_seqs 1 -word_size 5

3.8 years ago
Chirag Parsania ★ 2.0k

1) Why does my query coverage exceed 66% for the 151 bp sequences? It doesn't go above 80%, but I still see alignments with 73-74% Qcov (higher than 66%). How is this possible?

Well, all the queries in the data you posted have length 100. Also, Qcov there is above 80 (91 in the first three rows), although you mentioned that you do not see it go above 80. Correct me if I am wrong.

2) I used -max_target_seqs 1 and I am getting multiple hits for the same FASTA record. Is it correct to assume that max_target_seqs will just report all the HSPs it gets, restricted to that one sequence (my reference only has 1 sequence anyway)? So what do the multiple entries for the same FASTA record mean? I see that the pIdent (for the HSP) is 100% in the first row, but the qcovhsp is just 7%.

When you restrict the output with the max_target_seqs parameter, the limit applies to subject IDs, which means you will get hits to only one unique subject. You can still get multiple hits (HSPs) to that same subject ID, and each of them will have different alignment coordinates.
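To make the point about one subject with several HSPs concrete, here is a minimal Python sketch that groups the rows posted in the question by query and subject. The function name `group_hsps` is just an illustrative helper, and the column order is assumed to match the -outfmt string from the question.

```python
from collections import defaultdict

# Rows copied from the question's output. Columns follow the -outfmt string:
# qseqid sseqid pident evalue qcovs qcovhsp qlen
ROWS = """\
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB 100 0.51 91 7 100
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB 84.615 1.8 91 12 100
K00151:489:H5KKTBBXY:1:2118:10429:37818 Reference_DB 90 1.8 91 9 100
K00151:489:H5KKTBBXY:1:2118:10500:31576 Reference_DB 91.667 0.04 71 12 100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB 100 0.14 79 8 100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB 100 0.51 79 7 100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB 81.25 1.8 79 16 100
K00151:489:H5KKTBBXY:1:2118:10531:16225 Reference_DB 83.333 6.6 79 11 100
K00151:489:H5KKTBBXY:1:2118:10795:20304 Reference_DB 100 0.14 77 8 100
"""


def group_hsps(text):
    """Group tabular BLAST rows by (query, subject); each row is one HSP."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        qseqid, sseqid, pident, evalue, qcovs, qcovhsp, qlen = line.split()
        groups[(qseqid, sseqid)].append(
            {
                "pident": float(pident),
                "evalue": float(evalue),
                "qcovs": int(qcovs),      # per-subject value, repeated on every row
                "qcovhsp": int(qcovhsp),  # per-HSP value, differs row to row
                "qlen": int(qlen),
            }
        )
    return dict(groups)


groups = group_hsps(ROWS)
for (qseqid, sseqid), hsps in groups.items():
    per_hsp = [h["qcovhsp"] for h in hsps]
    print(qseqid, sseqid, "HSPs:", len(hsps), "qcovs:", hsps[0]["qcovs"], "qcovhsp:", per_hsp)
```

Within each group qcovs is the same on every row while qcovhsp changes, which is consistent with each row being a separate HSP against the single subject. If only one row per query/subject pair is wanted, BLAST+ also has a -max_hsps option that caps the number of HSPs reported per subject.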

======== UPDATE ========

With regard to your first question: what if the alignment has gaps in the reference and no gaps in the query? In that case more query bases than reference bases are aligned, so Qcov can exceed 66%.
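Another possibility worth checking, sketched here under a stated assumption: qcovs is described by BLAST as "Query Coverage Per Subject", and the sketch assumes it is computed from the merged query ranges of all HSPs to that subject. The HSP coordinates below are hypothetical (the posted output has no qstart/qend columns), and the helper names are illustrative only.

```python
def covered_bases(ranges):
    """Count query positions covered by at least one HSP.

    `ranges` holds 1-based, inclusive (qstart, qend) pairs.
    """
    total = 0
    cur_end = 0  # last query position already counted
    for start, end in sorted(ranges):
        start = max(start, cur_end + 1)  # skip positions counted before
        if end >= start:
            total += end - start + 1
            cur_end = end
    return total


def percent(bases, qlen):
    return 100.0 * bases / qlen


QLEN = 151
REF_LEN = 100

# Hypothetical HSPs on a 151 bp read. Different parts of the read may align
# to the same region of the 100 bp reference, so their query ranges can add
# up to more than 100 bases.
hsp_ranges = [(1, 30), (25, 60), (70, 110), (120, 151)]

per_hsp = [percent(end - start + 1, QLEN) for start, end in hsp_ranges]
merged = percent(covered_bases(hsp_ranges), QLEN)
single_hsp_bound = percent(REF_LEN, QLEN)

print("per-HSP coverage (%):", [round(p, 1) for p in per_hsp])
print("merged coverage (%):", round(merged, 1))
print("100/151 bound for one ungapped HSP (%):", round(single_hsp_bound, 1))
```

Under that assumption, the 100/151 limit applies to a single ungapped HSP (qcovhsp), not to the per-subject value, so many short HSPs found with -word_size 5 could together push qcovs above 66%. Adding qstart and qend to the -outfmt string would let you check this against the real coordinates.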


Trying to post here

Query   Sequence    pIdent  eValue  Qcov    QCovHSP Qlen
K00361R:231:H5KJJBBXY:5:1105:20405:25527    Reference_DB    86.667  0.22    70  9   151
K00361R:231:H5KJJBBXY:5:1105:20405:25527    Reference_DB    90  0.78    70  7   151
K00361R:231:H5KJJBBXY:5:1105:20405:25527    Reference_DB    78.947  2.8 70  11  151
K00361R:231:H5KJJBBXY:5:1105:20405:25527    Reference_DB    90  2.8 70  7   151
K00361R:231:H5KJJBBXY:5:1106:19999:19091    Reference_DB    81.481  0.005   72  18  151
K00361R:231:H5KJJBBXY:5:1106:19999:19091    Reference_DB    91.667  0.06    72  8   151
K00361R:231:H5KJJBBXY:5:1106:19999:19091    Reference_DB    80  0.78    72  13  151

I have multiple original read lengths. This is for 151 bp. Here the Qcov is exceeding 66% which is my first question.

So each row is a different alignment, with its own respective HSP.
