Question: Is there a way to find the percent similarity just like percent identity in BLAST?
4.2 years ago by
Neha shri30
United States
Neha shri30 wrote:

I am using standalone BLAST version 2.2.26, with a query sequence and a locally created database. The sequences in the database share 65 percent identity with the query.

Rather than identity, I want the database hits to be sorted on the basis of similarity, i.e. A= (V,L,I,M) and not A=A. Hope I am making myself clear. I would really appreciate any help. Thank you in advance.

blast sequence • 6.9k views
modified 4.2 years ago by cpad011212k • written 4.2 years ago by Neha shri30

"A= (V,L,I,M) and not A=A. Hope i am making myself clear."

You're not.

written 4.2 years ago by 5heikki8.6k

Sorry for not being clear. In the case of identity, the program counts only exact matches: for example, alanine (A) aligned with alanine is a match, and anything else is not. But in another case, A (alanine) could be substituted by another hydrophobic residue (e.g. V, L, I, M), and that could also be considered a match, since the residues share similar characteristics. Is there a way to report those matches (the latter case) as a percentage?

modified 4.2 years ago • written 4.2 years ago by Neha shri30

Yes, there is: see ppos (marked with an arrow) in the -outfmt help below. I suppose the substitution matrix used affects these numbers.

-outfmt <String>
   alignment view options:
     0 = pairwise,
     1 = query-anchored showing identities,
     2 = query-anchored no identities,
     3 = flat query-anchored, show identities,
     4 = flat query-anchored, no identities,
     5 = XML Blast output,
     6 = tabular,
     7 = tabular with comment lines,
     8 = Text ASN.1,
     9 = Binary ASN.1,
    10 = Comma-separated values,
    11 = BLAST archive format (ASN.1)

   Options 6, 7, and 10 can be additionally configured to produce
   a custom format specified by space delimited format specifiers.
   The supported format specifiers are:
           qseqid means Query Seq-id
              qgi means Query GI
             qacc means Query accesion
          qaccver means Query accesion.version
             qlen means Query sequence length
           sseqid means Subject Seq-id
        sallseqid means All subject Seq-id(s), separated by a ';'
              sgi means Subject GI
           sallgi means All subject GIs
             sacc means Subject accession
          saccver means Subject accession.version
          sallacc means All subject accessions
             slen means Subject sequence length
           qstart means Start of alignment in query
             qend means End of alignment in query
           sstart means Start of alignment in subject
             send means End of alignment in subject
             qseq means Aligned part of query sequence
             sseq means Aligned part of subject sequence
           evalue means Expect value
         bitscore means Bit score
            score means Raw score
           length means Alignment length
           pident means Percentage of identical matches
           nident means Number of identical matches
         mismatch means Number of mismatches
         positive means Number of positive-scoring matches
          gapopen means Number of gap openings
             gaps means Total number of gaps
-->             ppos means Percentage of positive-scoring matches
           frames means Query and subject frames separated by a '/'
           qframe means Query frame
           sframe means Subject frame
             btop means Blast traceback operations (BTOP)
          staxids means Subject Taxonomy ID(s), separated by a ';'
        sscinames means Subject Scientific Name(s), separated by a ';'
        scomnames means Subject Common Name(s), separated by a ';'
       sblastnames means Subject Blast Name(s), separated by a ';'
                (in alphabetical order)
       sskingdoms means Subject Super Kingdom(s), separated by a ';'
                (in alphabetical order)
           stitle means Subject Title
       salltitles means All Subject Title(s), separated by a '<>'
          sstrand means Subject Strand
            qcovs means Query Coverage Per Subject
          qcovhsp means Query Coverage Per HSP
   When not provided, the default value is:
   'qseqid sseqid pident length mismatch gapopen qstart qend sstart send
   evalue bitscore', which is equivalent to the keyword 'std'
   Default = `0'
modified 18 days ago by RamRS25k • written 4.2 years ago by 5heikki8.6k
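
A minimal sketch of how the ppos column could be used for the sorting asked about in the question. It assumes a BLAST+ blastp run with a custom tabular -outfmt string built from the specifiers listed above; the file names in the comment, the function name `sort_hits_by_ppos`, and the sample rows are made up for illustration and are not output from this thread.

```python
import csv
import io

# Column order must match the -outfmt string, for example
# (query.fasta, mydb and hits.tsv are hypothetical names):
#   blastp -query query.fasta -db mydb \
#          -outfmt "6 qseqid sseqid pident ppos nident positive length" \
#          -out hits.tsv
COLUMNS = ["qseqid", "sseqid", "pident", "ppos", "nident", "positive", "length"]


def sort_hits_by_ppos(handle):
    """Read tabular BLAST rows and return them sorted by ppos, highest first."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        row["pident"] = float(row["pident"])
        row["ppos"] = float(row["ppos"])
        for key in ("nident", "positive", "length"):
            row[key] = int(row[key])
        hits.append(row)
    return sorted(hits, key=lambda hit: hit["ppos"], reverse=True)


# Made-up rows standing in for a real hits.tsv
sample = io.StringIO(
    "q1\tsubjA\t65.00\t78.00\t65\t78\t100\n"
    "q1\tsubjB\t65.00\t82.00\t65\t82\t100\n"
    "q1\tsubjC\t66.00\t70.00\t66\t70\t100\n"
)
ranked = sort_hits_by_ppos(sample)
for hit in ranked:
    print(hit["sseqid"], hit["pident"], hit["ppos"])
```

In the sample, hits with nearly the same percent identity end up in a different order once they are ranked by percent positives, which is the distinction the question is after.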

Thank you so much.

written 4.2 years ago by Neha shri30

Sir, could you please clarify the significance of positive-scoring matches? I searched a bit but couldn't find anything. I would be grateful.

modified 18 days ago by RamRS25k • written 4.2 years ago by Neha shri30

Check these slides (7th page).

modified 4.2 years ago • written 4.2 years ago by 5heikki8.6k
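
For readers without the slides: a positive-scoring match is an aligned residue pair whose score in the substitution matrix (BLOSUM62 by default for blastp) is greater than zero, and identical pairs count as positives too. The sketch below illustrates the idea with a small excerpt of BLOSUM62 covering only A, I, L, M and V; the function names and the toy alignment are assumptions for illustration, and a real analysis would load the full matrix.

```python
# Excerpt of BLOSUM62 for a few hydrophobic residues only (upper triangle);
# a real analysis would load the full 20x20 matrix.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("A", "I"): -1, ("A", "L"): -1, ("A", "M"): -1, ("A", "V"): 0,
    ("I", "I"): 4, ("I", "L"): 2, ("I", "M"): 1, ("I", "V"): 3,
    ("L", "L"): 4, ("L", "M"): 2, ("L", "V"): 1,
    ("M", "M"): 5, ("M", "V"): 1,
    ("V", "V"): 4,
}


def pair_score(a, b):
    """Look up a pair score; the matrix is symmetric, so try both orders."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    return BLOSUM62_EXCERPT[(b, a)]


def identity_and_positives(qseq, sseq):
    """Count identical and positive-scoring columns of two aligned strings."""
    if len(qseq) != len(sseq):
        raise ValueError("aligned sequences must have the same length")
    nident = positive = 0
    for a, b in zip(qseq, sseq):
        if a == "-" or b == "-":
            continue  # gap columns are neither identical nor positive
        if a == b:
            nident += 1
        if pair_score(a, b) > 0:
            positive += 1
    return nident, positive, len(qseq)


# Toy alignment, not taken from the thread
nident, positive, length = identity_and_positives("AILMV", "VLLMA")
print("identical:", nident, "positive:", positive, "length:", length)
```

Note that A aligned with V scores 0 in BLOSUM62, so that pair is not counted as positive even though both residues are hydrophobic. The positives reported by BLAST follow the matrix, not a fixed residue grouping.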

Link is not working.

written 4.2 years ago by Neha shri30

should work now, http vs https :)

written 4.2 years ago by 5heikki8.6k

Thank you :)

written 4.2 years ago by Neha shri30
4.2 years ago by
cpad011212k
India
cpad011212k wrote:
I guess you are talking about blastp. Identical residues are a subset of similar residues. I am not sure whether standalone BLAST allows you to do that.
written 4.2 years ago by cpad011212k

Yes, it is blastp I was talking about. Is there any other way to do it? I tried many online servers, but they did not help much.

modified 18 days ago by RamRS25k • written 4.2 years ago by Neha shri30

My understanding is that OP was asking about extracting the similar residues (excluding identical residues) from the alignment, not just their percentage and/or number.

written 4.2 years ago by cpad011212k

Yes, you are absolutely right. But if the extraction could be done in the form of some score or percentage, it would be more helpful. Besides, finding the synonymous mutations in the sequences is my objective: for example, how many residues in each sequence in the database have undergone synonymous mutation, and how similar those sequences still are to the query sequence.

modified 18 days ago by RamRS25k • written 4.2 years ago by Neha shri30
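
One possible way to get both the substituted positions and a percentage, sketched under assumptions: take the aligned strings from the qseq and sseq specifiers listed in the -outfmt help, and compare them column by column against user-defined residue classes. The classes below are hypothetical (only the first follows the A, V, L, I, M grouping from the question), as are the function name and the toy alignment. For proteins, such same-class changes are more commonly called conservative substitutions; "synonymous" usually refers to nucleotide changes that leave the amino acid unchanged.

```python
# Hypothetical residue classes; only the first follows the grouping in the question.
SIMILARITY_GROUPS = [
    set("AVLIM"),  # hydrophobic, as listed in the question
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar uncharged
]


def conservative_substitutions(qseq, sseq, groups=SIMILARITY_GROUPS):
    """List (column, query_residue, subject_residue) for aligned columns
    that differ but fall in the same residue class."""
    if len(qseq) != len(sseq):
        raise ValueError("aligned sequences must have the same length")
    found = []
    for column, (a, b) in enumerate(zip(qseq, sseq), start=1):
        if a == b or "-" in (a, b):
            continue  # skip identical columns and gap columns
        if any(a in group and b in group for group in groups):
            found.append((column, a, b))
    return found


# Toy aligned pair, e.g. as returned by -outfmt "6 qseqid sseqid qseq sseq"
qseq = "MKALDE-S"
sseq = "MRVLEEGT"
subs = conservative_substitutions(qseq, sseq)
percent = 100.0 * len(subs) / len(qseq)  # relative to alignment length
print(subs)
print(f"{percent:.1f}% similar-but-not-identical columns")
```

Column numbers here are alignment columns, not positions in the original sequences; mapping them back would need the qstart and sstart values and a count of the gaps seen so far.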