Question: Perl script required
11 days ago, Rasoul Godini wrote:

Hi everyone,

I need a script to select specific sequences from the many results in a local BLAST output file. Briefly, I have a BLAST output file containing many alignments in different frame shifts, with E-values and bit scores. Each sequence has results for several frame shifts. I want to keep only the frame shift with the highest score in each alignment, and then the best alignments based on bit score or E-value. There is no restriction on the file type.
Can anyone suggest a way to learn how to write such scripts, as I am new to Perl scripting, or point me to an existing script that I could edit for my goal?
Thank you in advance.

sequence alignment • 124 views

And why does it have to be Perl?

written 11 days ago by WouterDeCoster

Why not load the whole result into a pandas DataFrame in Python, then filter it however you want? It's not easy to create a dataframe in Perl, by the way.

written 11 days ago by Bastien Hervé
11 days ago, danilo.tatoni wrote:

It doesn't need to be in Perl. If you're working with tab-separated values (i.e. outfmt 6), you can easily filter with awk. Assuming the E-value is in the 4th column (in the default outfmt 6 layout it is column 11, with the bit score in column 12), this command prints only the lines where the E-value is lower than 0.02:

awk -F'\t' '$4 < 0.02 {print ;}'

The same can be done with any other parameter. Always remember to declare the field separator, which is given after the "-F" option. Fields are counted from $1, because $0 is the whole line.
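
As a rough sketch of the filter above applied to a file, the snippet below builds a small made-up sample in the default outfmt 6 column order and filters on column 11. The sample rows, the variable name sample and the function name filter_evalue are all hypothetical; change the column number if your output uses a custom layout.

```shell
# Hypothetical sample in the default outfmt 6 column order:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
sample=$(mktemp)
printf 'q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n'  > "$sample"
printf 'q1\ts2\t60.0\t40\t16\t0\t1\t40\t5\t44\t0.5\t30.1\n'    >> "$sample"
printf 'q2\ts3\t85.0\t80\t12\t0\t1\t80\t1\t80\t3e-20\t95.5\n'  >> "$sample"

# Print rows whose E-value (column 11 here) is below the cutoff given as $1.
# Adding 0 forces a numeric comparison, so values such as 1e-50
# are compared as numbers rather than as text.
filter_evalue() {
    awk -F'\t' -v max="$1" '($11 + 0) < (max + 0)' "$2"
}

filter_evalue 0.02 "$sample"
```

With this sample, the q1/s2 row (E-value 0.5) is dropped and the other two rows are printed.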

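To keep just the first result for each query, assuming BLAST reports the hits for each query from the most representative record to the least representative one, you can use another awk trick. $1 is used on the assumption that your query ID is in the 1st field; the expression is true only the first time a given $1 value is seen, so only the first line for each query ID is printed.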

awk '!seen[$1]++'
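
If the hits are not already ordered, one possible approach is to sort by bit score first and then apply the same trick. The sketch below assumes the default outfmt 6 layout (bit score in column 12). The sample rows and the names hits, best_per_query and best_per_pair are hypothetical. The second function keys on query and subject together, which may be closer to "best frame per alignment" if each frame shows up as a separate row for the same query/subject pair.

```shell
# Hypothetical sample, default outfmt 6 column order (bit score in column 12).
hits=$(mktemp)
printf 'q1\ts1\t90.0\t50\t5\t0\t1\t50\t1\t50\t1e-10\t60.2\n'    > "$hits"
printf 'q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n' >> "$hits"
printf 'q1\ts2\t70.0\t40\t12\t0\t1\t40\t5\t44\t1e-05\t45.0\n'  >> "$hits"
printf 'q2\ts3\t85.0\t80\t12\t0\t1\t80\t1\t80\t3e-20\t95.5\n'  >> "$hits"

# A literal tab character to use as the sort field separator.
tab=$(printf '\t')

# Best hit per query: sort by query ID, then by bit score descending,
# and let awk keep the first line it sees for each query ID.
best_per_query() {
    sort -t "$tab" -k1,1 -k12,12nr "$1" | awk -F'\t' '!seen[$1]++'
}

# Best hit per query/subject pair: same idea, keyed on columns 1 and 2.
best_per_pair() {
    sort -t "$tab" -k1,1 -k2,2 -k12,12nr "$1" | awk -F'\t' '!seen[$1 FS $2]++'
}

best_per_query "$hits"
best_per_pair "$hits"
```

Sorting on bit score with a plain numeric key avoids relying on how a given sort implementation handles exponent notation such as 1e-50 in the E-value column.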