Random sequence cleavage with an average length constraint
7.7 years ago
Nifaste ▴ 20

Dear all,

I would like to randomly cleave sequences of length Ln until the average length of the resulting fragments is 50 nt (+/- 2 nt).

I started in Perl, but I'm having trouble with the average length constraint.

My idea was to pick a random position in [0 ... sequence length] and compute the lengths of the resulting segments, but I don't think that's the right way to do it.

Any suggestions?

perl next-gen sequence • 1.8k views
7.7 years ago

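One Perl solution, if the fragments do not overlap (it prints the cut positions):

use strict; use warnings;
my $Ln = 1000;    # sequence length
# Walk the sequence in steps of 50 from a random start in 1..50
for (my $i = int(rand(50)) + 1; $i <= $Ln; $i += 50) {
        # Jitter each cut position by -1, 0 or +1
        printf "%d\n", $i + 1 - int(rand(3));
}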
7.7 years ago
Ram 37k

I'd follow this approach:

  1. Allowable length = 48 .. 52
  2. Iterate through each sequence. For each sequence,
    • if len(seq)<52, skip to next sequence. Else,
    • pick a random number (call it point) between 0 and len(seq)-1
    • if len(seq)-1 - point >=52, pick substring 3' of point, with length randomly picked between 48 and 52
    • add to a new list "pool_1" the sequence 5' of point and 3' of point+length picked above (these are the flanking fragments)
  3. Repeat above operation on pool_1, this time picking substrings 5' of chosen point and adding the fragments to "pool_2".
  4. Repeat until both pool_1 and pool_2 are filled with fragments less than 52 in total length
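The steps above can be sketched roughly as follows. This is a simplified reading, not a faithful transcription: it uses a single work list instead of alternating pool_1 and pool_2, and it chooses the start so that the fragment always fits rather than testing the point afterwards. The helper name carve_fragments and the random test sequence are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: carve fragments of $min..$max nt out of $seq at random
# positions; the flanks go back on a work list until they are too short.
sub carve_fragments {
    my ($seq, $min, $max) = @_;
    my (@fragments, @leftovers);
    my @pool = ($seq);
    while (@pool) {
        my $piece = shift @pool;
        my $len   = length $piece;
        if ($len < $max) {                # too short to carve: keep as leftover
            push @leftovers, $piece if $len > 0;
            next;
        }
        my $frag_len = $min + int(rand($max - $min + 1));   # $min..$max
        my $start    = int(rand($len - $frag_len + 1));     # fragment must fit
        push @fragments, substr($piece, $start, $frag_len);
        # flanking pieces 5' and 3' of the fragment go back on the work list
        push @pool, substr($piece, 0, $start), substr($piece, $start + $frag_len);
    }
    return (\@fragments, \@leftovers);
}

# Toy input: a random 1000 nt sequence
my $seq = join '', map { (qw(A C G T))[int rand 4] } 1 .. 1000;
my ($frags, $rest) = carve_fragments($seq, 48, 52);
printf "%d fragments of 48..52 nt, %d shorter leftovers\n",
    scalar @$frags, scalar @$rest;
```

Note that the leftovers shorter than 52 nt are not counted as fragments here, so the average of 48..52 holds only for the carved pieces.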

I can't put a constraint on the individual fragment lengths. They could be as short as 1 nt or 2 nt; only the average has to be 50 nt.
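If only the average is constrained, it may help to note that the mean fragment length depends solely on the number of cuts: k cuts in a sequence of length Ln give k+1 fragments whose lengths sum to Ln, so the mean is Ln/(k+1) wherever the cuts fall. A minimal sketch under that observation, with a made-up helper name cleave_to_mean and a random test sequence:

```perl
use strict;
use warnings;
use List::Util qw(shuffle sum);

# Hypothetical helper: cut $seq at random distinct positions so that the
# mean fragment length is as close to $target as the sequence length allows.
sub cleave_to_mean {
    my ($seq, $target) = @_;
    my $len     = length $seq;
    my $n_frags = int($len / $target + 0.5) || 1;   # nearest whole number, at least 1
    my $n_cuts  = $n_frags - 1;
    # distinct internal cut positions from 1 .. $len-1, sorted left to right
    my @cuts = sort { $a <=> $b } (shuffle 1 .. $len - 1)[0 .. $n_cuts - 1];
    my ($prev, @fragments) = (0);
    for my $cut (@cuts, $len) {
        push @fragments, substr($seq, $prev, $cut - $prev);
        $prev = $cut;
    }
    return @fragments;
}

# Toy input: a random 1000 nt sequence
my $seq   = join '', map { (qw(A C G T))[int rand 4] } 1 .. 1000;
my @frags = cleave_to_mean($seq, 50);
printf "%d fragments, mean length %.1f\n",
    scalar @frags, sum(map { length } @frags) / @frags;
```

For a 1000 nt sequence this yields 20 fragments with a mean of exactly 50 nt, while individual fragments can be as short as 1 nt. For sequences much shorter than a few hundred nt the mean may not land within 48..52, since the number of fragments has to be a whole number. Shuffling every position is fine for moderate lengths; for very long sequences a sampling scheme that avoids building the full list would be preferable.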

7.7 years ago
dylan.storey ▴ 60

If you're concerned about performance, use unpack instead of substr to extract the fragments.
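As a rough illustration of the suggestion, the sketch below splits a toy sequence into fragments of given lengths with a single unpack call and, for comparison, with a substr loop. The sequence and the lengths are invented; whether unpack is actually faster for a given workload is best checked with the core Benchmark module.

```perl
use strict;
use warnings;

my $seq     = 'ACGTACGTACGTACGTACGT';   # 20 nt toy sequence
my @lengths = (3, 7, 4, 6);             # fragment lengths; must sum to length($seq)

# One unpack call with the template "a3 a7 a4 a6"
# ('a' returns the raw characters without stripping trailing spaces)
my $template  = join ' ', map { "a$_" } @lengths;
my @by_unpack = unpack $template, $seq;

# Equivalent substr loop
my ($offset, @by_substr) = (0);
for my $n (@lengths) {
    push @by_substr, substr($seq, $offset, $n);
    $offset += $n;
}

print join(' ', @by_unpack), "\n";   # ACG TACGTAC GTAC GTACGT
```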
