Question: Perl script required
5 months ago, Rasoul Godini wrote:

Hi everyone,

I need a script to select specific sequences from the many results in a local BLAST output file. Briefly, I have a BLAST output file containing many alignments in different frame shifts, each with an E-value and a bit score. Each sequence has results for several frame shifts. I want to keep only the frame shift with the highest score in each alignment, and then select the best alignments based on bit score or E-value. There is no restriction on the file format.
Can anyone suggest a way to learn how to write such scripts, as I am new to Perl scripting, or point me to an existing script that I could edit for this purpose?
Thank you in advance.


And why does it have to be Perl?

written 5 months ago by WouterDeCoster

Why not load the whole result into a pandas DataFrame in Python and then filter it as you want? It is not easy to create a dataframe in Perl, by the way.

written 5 months ago by Bastien Hervé
5 months ago, Shred wrote:

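It doesn't need to be in Perl. If you're working with tab-separated values (i.e. -outfmt 6), you can easily filter with awk. Assuming the E-value is in the 4th column, this command prints only the lines where the E-value is lower than 0.02 (in the default -outfmt 6 layout the E-value is the 11th column and the bit score the 12th, so use $11 there):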

awk -F'\t' '$4 < 0.02' blast_output.tsv    # blast_output.tsv is a placeholder file name

The same can be done with any other parameter. Always remember to declare the field separator, which is given after the "-F" option. Fields are counted starting from $1, because $0 is the whole line.
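
Since a Python route was also suggested in the comments, here is a minimal sketch of the same kind of filter using only the standard library, in case a script is preferred over a one-liner. It assumes the default -outfmt 6 column order (E-value 11th, bit score 12th); the function name filter_hits, its thresholds, and the sample rows are invented for illustration and are not from any real BLAST run.

```python
import csv
import io

# Column names of BLAST's default tabular layout (-outfmt 6), in order.
# Assumption: the file was produced without a custom column list.
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, max_evalue=0.02, min_bitscore=0.0):
    """Yield hits (as dicts) with evalue < max_evalue and bitscore >= min_bitscore."""
    reader = csv.DictReader(
        handle, fieldnames=OUTFMT6_COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for hit in reader:
        if float(hit["evalue"]) < max_evalue and float(hit["bitscore"]) >= min_bitscore:
            yield hit


# Invented example rows, only to show the call.
sample = io.StringIO(
    "q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "q1\ts2\t70.0\t80\t24\t0\t1\t80\t5\t84\t0.5\t30.1\n"
    "q2\ts3\t90.0\t90\t9\t0\t1\t90\t1\t90\t3e-20\t95.5\n"
)
kept = list(filter_hits(sample))
print([(h["qseqid"], h["sseqid"]) for h in kept])
```

To run it on a real file, replace the sample with open("your_file.tsv", newline=""). Naming the columns makes it easy to filter on any other field in the same way.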

To keep just the first result for each query, assuming BLAST reports the hits for each query from the best one to the worst one, you can use another awk trick. $1 is used on the assumption that the query ID is in the 1st field.

awk -F'\t' '!seen[$1]++' blast_output.tsv    # blast_output.tsv is a placeholder file name
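
If you would rather not rely on the order of the hits, a possible alternative is to compare the bit scores explicitly and keep the highest-scoring line per query. The Python sketch below does that with the standard library only. It assumes the query ID is in column 1 and the bit score in column 12 (the default -outfmt 6 positions); the function name best_hit_per_query and the sample rows are made up for the example.

```python
import csv
import io


def best_hit_per_query(handle, query_col=1, score_col=12):
    """Return {query_id: row}, keeping the row with the highest bit score per query.

    Column numbers are 1-based, like awk's $1, $2, ...
    On a tie, the row seen first is kept.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue  # skip blank lines
        query = row[query_col - 1]
        score = float(row[score_col - 1])
        if query not in best or score > float(best[query][score_col - 1]):
            best[query] = row
    return best


# Invented example rows; the best hit for q1 is deliberately not listed first.
sample = io.StringIO(
    "q1\ts2\t70.0\t80\t24\t0\t1\t80\t5\t84\t0.5\t30.1\n"
    "q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "q2\ts3\t90.0\t90\t9\t0\t1\t90\t1\t90\t3e-20\t95.5\n"
)
best = best_hit_per_query(sample)
for query, row in best.items():
    print(query, row[1], row[11])
```

If the frame of each alignment is written to its own column, the same function should work for picking the best-scoring frame per sequence, because only the query ID and the score are compared.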