Question: How to remove repeated genes or keep only the best HSP matches
bio90029 wrote, 19 months ago:

Hi, I have run a BLAST search with the following code:

from Bio.Blast.Applications import NcbiblastnCommandline

blastn_cline = NcbiblastnCommandline(query='CDS_extracted_file.fa', db=output + '/temporary_db', evalue=0.001, gapopen=0, gapextend=2, outfmt=5, out=output + '/blast_file.xml')

I have realized that some of the genes are reported more than once, for example:

gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 93.6329588015
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 91.7602996255
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|86 NODE_87_length_35482_cov_21.696409 number_gaps: 0  per_identity: 93.

How can I modify the script so that I only get the best hit for each gene? Thanks

blast gene • 413 views

The "max_hsps" option may work, if your wrapper exposes it. See https://www.ncbi.nlm.nih.gov/books/NBK279675/
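Note that "max_hsps" limits the number of HSPs per query-subject pair, so a gene that hits two different contigs (as gene_690 does in the question, with BL_ORD_ID 97 and 86) would still appear twice. One alternative is to filter the parsed results afterwards. The following is a minimal sketch, not the original poster's script: the function name `best_hit_per_query`, the dictionary keys, and the `gene_001` row are hypothetical, and ranking by percent identity is an assumption (bit score or e-value may be a better criterion for your data).

```python
def best_hit_per_query(hits, key="per_identity"):
    """Return a dict mapping each query gene to its single highest-scoring hit.

    `hits` is any iterable of dicts with at least a "query" entry and the
    field named by `key`. Higher values of `key` are treated as better.
    """
    best = {}
    for hit in hits:
        current = best.get(hit["query"])
        # Keep the first hit seen for a gene, then replace it only if a
        # later hit scores strictly higher.
        if current is None or hit[key] > current[key]:
            best[hit["query"]] = hit
    return best


# The two gene_690 rows reuse values printed in the question;
# the gene_001 row is made up purely for illustration.
rows = [
    {"query": "gene_690", "subject": "gnl|BL_ORD_ID|97", "per_identity": 93.6329588015},
    {"query": "gene_690", "subject": "gnl|BL_ORD_ID|97", "per_identity": 91.7602996255},
    {"query": "gene_001", "subject": "gnl|BL_ORD_ID|3", "per_identity": 88.0},
]

best = best_hit_per_query(rows)
for gene, hit in best.items():
    print(gene, hit["subject"], "per_identity:", hit["per_identity"])
```

If your version of the Biopython wrapper accepts it, passing max_hsps=1 to NcbiblastnCommandline in addition would reduce the number of rows before this filtering step.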

written 19 months ago by fishgolden360