Method For Randomly Selecting Subsequences In An Alignment?
12.7 years ago
Spyros • 0

Hello BioStar Community,

I am working with a database that stores motif sequences for proteins. The motifs (subsequences of the primary sequences of GPCRs) were excised by an iterative database-scanning algorithm that identifies the most conserved subsequences in a multiple sequence alignment, subject to certain criteria (such as length and whether motifs are allowed to overlap). Because I am going to be doing extensive work with these motifs, I need a way of demonstrating that they are truly non-random. I would like a method that randomly selects motifs under constraints and criteria similar to the original ones. That way, given a multiple sequence alignment, I could compare the profile of the original motifs with that of the random ones to test whether or not they superimpose (they should). Can anyone suggest a way of approaching this?


What do you mean by randomly selecting motifs? Do you mean randomly selecting subsequences from your genome?


@I Albert: Yes. To show that the motifs have been selected in a non-random way, I would like some way of randomly selecting subsequences from my sequence alignments (proteomic sequences) and repeating this process many times over.
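One way to picture "random selection under the same constraints, repeated many times" is the sketch below. It is only an illustration under assumptions not stated in the thread: the alignment is held as a list of equal-length strings, the constraints are the motif lengths plus a no-overlap rule, and the names `sample_windows` and `random_motif_sets` are hypothetical. It uses simple rejection sampling: draw a start column for each motif length, and redraw the whole set if any two windows overlap.

```python
import random

def sample_windows(aln_len, sizes, rng, max_tries=1000):
    """Draw one random (start, end) column window per motif length, with no overlaps."""
    for _ in range(max_tries):
        windows = []
        for size in sizes:
            start = rng.randint(0, aln_len - size)  # randint is inclusive at both ends
            windows.append((start, start + size))
        ordered = sorted(windows)
        # Accept the draw only if every window ends before the next one starts.
        if all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:])):
            return windows
    raise ValueError("could not place non-overlapping windows")

def random_motif_sets(alignment, sizes, n_rounds, seed=0):
    """Yield n_rounds random motif sets; each motif is the list of row slices for one window."""
    rng = random.Random(seed)
    aln_len = len(alignment[0])  # assumes all rows have the same aligned length
    for _ in range(n_rounds):
        windows = sample_windows(aln_len, sizes, rng)
        yield [[row[s:e] for row in alignment] for s, e in windows]
```

Each yielded set could then be scored with whatever profile measure is used for the real motifs, giving a distribution of random scores to compare against. If overlapping motifs are allowed in the original procedure, the overlap check can simply be dropped.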

12.7 years ago

The simplest approach may be to write a short script that loops through your sequences and cuts out substrings at random positions. Something like this, which needs to be adapted to your needs:

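from random import randint

# motif size
size = 5

# assumes two lines per record: a header line, then the sequence on one line
with open('f1.fasta') as stream:
    for header in stream:
        seq = next(stream, '').strip()  # drop the trailing newline
        if len(seq) < size:
            continue  # sequence too short to hold a motif
        # randint is inclusive at both ends
        lo = randint(0, len(seq) - size)
        print(seq[lo:lo + size])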

@I Albert: Many thanks for that. A loop like that might work if I remove the headings and annotations from the ASCII file of the multiple sequence alignment!
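Rather than removing the headings by hand, the file could be parsed so that header lines are kept apart from the sequence text. The sketch below assumes the alignment is in FASTA format (headers starting with `>`, sequences possibly wrapped over several lines); the actual format of the file is not stated in the thread, and `read_fasta` is a hypothetical helper name.

```python
def read_fasta(lines):
    """Return a list of (header, sequence) pairs from FASTA-formatted lines."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence text, possibly wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

It could be called as `records = read_fasta(open('f1.fasta'))`, after which random substrings can be cut from each sequence as in the loop above. Gap characters in aligned sequences are kept as-is, so they would need to be handled separately if ungapped motifs are wanted.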
