emboss skipredundant issue
4.5 years ago

Hi all,

As a follow-up to this post: skip redundant vs needle

I ran the needle command on the following sequences:

>1
MMMMMMMFKL

>2
MMMMMMMVYA

and got these results:

# Identity:       6/11 (54.5%)
# Similarity:     8/11 (72.7%)
# Gaps:           2/11 (18.2%)
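For reference, these percentages are the counts divided by the alignment length (11 columns, including gap columns), not by the sequence length (10). A minimal sketch of that arithmetic, where the helper name `percent` is only for illustration:

```python
def percent(count: int, alignment_length: int) -> float:
    """Return count as a percentage of the alignment length, to one decimal."""
    return round(100.0 * count / alignment_length, 1)


ALIGNMENT_LENGTH = 11  # columns in the needle alignment, gaps included

identity = percent(6, ALIGNMENT_LENGTH)    # identical columns
similarity = percent(8, ALIGNMENT_LENGTH)  # identical or similar columns
gaps = percent(2, ALIGNMENT_LENGTH)        # columns containing a gap

print(identity, similarity, gaps)
```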

In addition, when I ran the skipredundant command with a threshold of 75.0, the output file contained:

>1
MMMMMMMFKL

As you can see, the similarity between the sequences is 72.7%, which is below the threshold, so both should have appeared in the output file.

Why does only the first sequence appear in the output file?

Thanks

alignment sequence • 797 views

You are comparing two programs that do different things. needle is a global aligner, so it aligns the sequences along their entire length and reports the result.

With skipredundant you are trying to come up with a non-redundant dataset. See mode 1 in the manual (LINK for manual):

All permutations of pair-wise sequence alignments are calculated for each set of input sequences in turn using the EMBOSS implementation of the Needleman and Wunsch global alignment algorithm. Redundant sequences are removed in one of two modes as follows: (i) If a pair of proteins achieve greater than a threshold percentage sequence similarity (specified by the user) the shortest sequence is discarded. (ii) If a pair of proteins have a percentage sequence similarity that lies outside an acceptable range (specified by the user) the shortest sequence is discarded. (Values: 1 (Single threshold percentage sequence similarity); 2 (Outside a range of acceptable threshold percentage similarities))

One of these conditions is being satisfied.
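The two modes quoted from the manual can be sketched roughly as follows. This is only an illustration of the manual's wording, not the EMBOSS source: the function names, the `pair_similarity` callback, and the tie-break for equal-length sequences (drop the later one) are all assumptions made for the example.

```python
from itertools import combinations
from typing import Callable, Dict, Optional


def is_redundant(similarity: float, mode: int,
                 threshold: Optional[float] = None,
                 min_threshold: Optional[float] = None,
                 max_threshold: Optional[float] = None) -> bool:
    """Apply the rule as worded in the manual (hypothetical helper)."""
    if mode == 1:
        if threshold is None:
            raise ValueError("mode 1 needs a threshold")
        # (i) greater than a single threshold percentage similarity
        return similarity > threshold
    if mode == 2:
        if min_threshold is None or max_threshold is None:
            raise ValueError("mode 2 needs min_threshold and max_threshold")
        # (ii) outside the acceptable range of percentage similarity
        return similarity < min_threshold or similarity > max_threshold
    raise ValueError("mode must be 1 or 2")


def filter_redundant(seqs: Dict[str, str],
                     pair_similarity: Callable[[str, str], float],
                     **rule) -> Dict[str, str]:
    """Compare all pairs and discard the shorter sequence of a redundant pair."""
    kept = dict(seqs)
    for a, b in combinations(list(seqs), 2):
        if a not in kept or b not in kept:
            continue  # one of the pair was already discarded
        if is_redundant(pair_similarity(seqs[a], seqs[b]), **rule):
            # Assumption: on equal length, the later sequence is dropped.
            shorter = a if len(seqs[a]) < len(seqs[b]) else b
            del kept[shorter]
    return kept


seqs = {"1": "MMMMMMMFKL", "2": "MMMMMMMVYA"}
# Stand-in for a real aligner: the similarity needle reported in the question.
reported = lambda a, b: 72.7
print(list(filter_redundant(seqs, reported, mode=1, threshold=75.0)))  # both ids are kept
```

Under this reading, a similarity of 72.7% with a threshold of 75.0 would keep both sequences, so the observed output suggests that skipredundant arrived at a different similarity value than the needle run did, possibly because of different alignment parameters.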

Beyond this you will need to examine the actual code to see what exactly may be going on with these two tools.


This description is from the needle manual (LINK):

It uses the Needleman-Wunsch alignment algorithm to find the optimum alignment (including gaps) of two sequences along their entire length

As I understand it, skipredundant uses the same technique as needle, doesn't it?

Also, how can I look at the code?


It uses the same implementation as needle, but you may need to look in the source code to find out which parameters it uses. You can download the code for EMBOSS here.

Have you tried to compare these results with larger/real data sets? That may clarify things further.
