Question: How to remove repeated genes or get the best HSP matches
2.2 years ago, bio90029 wrote:

Hi, I ran a BLAST search using the code below:

from Bio.Blast.Applications import NcbiblastnCommandline
blastn_cline = NcbiblastnCommandline(query='CDS_extracted_file.fa', db=output + '/temporary_db', evalue=0.001, gapopen=0, gapextend=2, outfmt=5, out=output + '/blast_file.xml')

I have realized that some of the genes are reported more than once, for example:

gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 93.6329588015
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 91.7602996255
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|86 NODE_87_length_35482_cov_21.696409 number_gaps: 0  per_identity: 93.

How can I modify the script so that I only get the best match for each gene? Thanks

blast gene • 468 views

The "max_hsps" option may work, if your wrapper supports it. See https://www.ncbi.nlm.nih.gov/books/NBK279675/

written 2.2 years ago by fishgolden420
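
To make the suggestion above concrete, here is a rough sketch of two ways to end up with one match per gene. It is not the original poster's script: the function names (blastn_args, iter_hsp_rows, best_hsp_per_query) and the row layout are made up for illustration, and "best" is assumed to mean the highest bit score, which may not be the criterion you want. The first function only builds a blastn argument list with -max_hsps and -max_target_seqs added to the settings from the question. Note that in the example output gene_690 hits two different subjects (NODE_98 and NODE_87), so -max_hsps alone would still leave two rows, and -max_target_seqs 1 is not guaranteed to keep the top-scoring subject. Filtering the parsed results yourself is probably the safer route, which is what the other two functions do (iter_hsp_rows assumes Biopython is installed).

```python
from typing import Dict, Iterable, Iterator, List


def blastn_args(query: str, db: str, out: str) -> List[str]:
    """Build a blastn argument list that asks BLAST+ itself to report less.

    -max_hsps 1 keeps one HSP per query/subject pair and
    -max_target_seqs 1 keeps one subject per query. The remaining
    values mirror the settings used in the question.
    """
    return [
        "blastn", "-query", query, "-db", db, "-out", out,
        "-evalue", "0.001", "-gapopen", "0", "-gapextend", "2",
        "-outfmt", "5",
        "-max_hsps", "1", "-max_target_seqs", "1",
    ]


def iter_hsp_rows(xml_path: str) -> Iterator[Dict]:
    """Yield one dict per HSP from a BLAST XML file (requires Biopython)."""
    from Bio.Blast import NCBIXML  # imported lazily; only needed here

    with open(xml_path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                for hsp in alignment.hsps:
                    yield {
                        "query": record.query,
                        "hit": alignment.hit_def,
                        "bits": hsp.bits,
                        "per_identity": 100.0 * hsp.identities / hsp.align_length,
                    }


def best_hsp_per_query(rows: Iterable[Dict], key: str = "bits") -> Dict[str, Dict]:
    """Keep, for each query, the single row with the largest value of `key`.

    Assumption: a larger value is better (true for "bits" and
    "per_identity"; it would be wrong for an e-value column).
    """
    best: Dict[str, Dict] = {}
    for row in rows:
        current = best.get(row["query"])
        if current is None or row[key] > current[key]:
            best[row["query"]] = row
    return best
```

With the file from the question, something like best_hsp_per_query(iter_hsp_rows(output + '/blast_file.xml')) should give a dictionary with one entry per gene. Passing key="per_identity" ranks by percent identity instead of bit score.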