Question: random sequence cleavage - average length constraint
2
3.6 years ago by
France
stephaniepierson8320 wrote:

Dear all,

I would like to randomly cleave sequences of length Ln until the average length of the resulting fragments is 50 nt (+/- 2 nt).

I started it in Perl, but I have some problems with the average length constraint.

I wanted to select a random position in [0 ... sequence length] and calculate the lengths of the segments it creates, but I don't think that is the right way to do it.

Any suggestions?

sequence next-gen perl • 1.1k views
modified 3.6 years ago by dylan.storey60 • written 3.6 years ago by stephaniepierson8320
1
3.6 years ago by
Hungary
Csaba Kerepesi320 wrote:

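One Perl solution, if the fragments must not overlap (it prints the cut positions):

```perl
my $Ln = 1000;    # sequence length
# First cut at a random offset in 1..50, then one cut every 50 nt.
for (my $i = int(rand(50)) + 1; $i <= $Ln; $i += 50) {
    # Jitter each cut by -1, 0 or +1, so interior fragments are 48..52 nt.
    printf "%d\n", $i + 1 - int(rand(3));
}
```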
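
To go from cut positions to actual fragments, something like the following could work. This is only a sketch: `fragments_from_cuts` is a hypothetical helper, the 1000 nt random sequence is a stand-in for real input, and cuts are assumed to mean "cut after this 1-based position". It reuses the loop above, collects the cuts, splits the sequence with `substr`, and reports the mean fragment length.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: split $seq after each 1-based cut position.
sub fragments_from_cuts {
    my ($seq, @cuts) = @_;
    my $len  = length $seq;
    my $prev = 0;
    my @frags;
    # Ignore cuts outside the sequence or at its very end (no empty fragments).
    for my $cut (sort { $a <=> $b } grep { $_ > 0 && $_ < $len } @cuts) {
        next if $cut == $prev;
        push @frags, substr($seq, $prev, $cut - $prev);
        $prev = $cut;
    }
    push @frags, substr($seq, $prev);    # remainder after the last cut
    return @frags;
}

# Random toy sequence of 1000 nt (assumption, stands in for real input).
my $seq = join '', map { (qw(A C G T))[int(rand(4))] } 1 .. 1000;

# Same cut-position loop as above, collecting instead of printing.
my @cuts;
for (my $i = int(rand(50)) + 1; $i <= length($seq); $i += 50) {
    push @cuts, $i + 1 - int(rand(3));
}

my @frags = fragments_from_cuts($seq, @cuts);
my $mean  = sum(map { length($_) } @frags) / scalar(@frags);
printf "%d fragments, mean length %.1f nt\n", scalar(@frags), $mean;
```

Note that the first and last fragments can be much shorter than 48 nt, so the overall mean may land a little below the 48..52 window. Dropping or merging the two end fragments is one possible way around that.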

0
3.6 years ago by
Houston, TX
RamRS19k wrote:

1. Allowable length = 48 .. 52
2. Iterate through each sequence. For each sequence,
• pick a random number (call it point) between 0 and len(seq)-1
• if len(seq)-1 - point >=52, pick substring 3' of point, with length randomly picked between 48 and 52
• add to a new list "pool_1" the sequence 5' of point and 3' of point+length picked above (these are the flanking fragments)
3. Repeat above operation on pool_1, this time picking substrings 5' of chosen point and adding the fragments to "pool_2".
4. Repeat until both pool_1 and pool_2 are filled with fragments less than 52 in total length
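
The steps above can be sketched roughly as follows. This is a simplified take, not an exact transcription: instead of alternating between 3' and 5' substrings, it picks the start point so that the excised fragment always fits, and it uses a single work pool rather than pool_1 and pool_2. `excise_fragments` is a hypothetical name, and the random 1000 nt sequence is only a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly excise one 48..52 nt fragment at a random
# position and put the two flanking pieces back into the pool.
sub excise_fragments {
    my ($seq) = @_;
    my (@kept, @leftover);
    my @pool = ($seq);
    while (@pool) {
        my $piece = shift @pool;
        my $len   = length $piece;
        next if $len == 0;
        if ($len < 48)  { push @leftover, $piece; next; }   # too short to use
        if ($len <= 52) { push @kept,     $piece; next; }   # already in range
        my $frag_len = 48 + int(rand(5));                   # 48..52
        my $point    = int(rand($len - $frag_len + 1));     # fragment must fit
        push @kept, substr($piece, $point, $frag_len);
        # Flanking pieces go back into the pool for another round.
        push @pool, substr($piece, 0, $point), substr($piece, $point + $frag_len);
    }
    return (\@kept, \@leftover);
}

# Random toy sequence of 1000 nt (assumption, stands in for real input).
my $seq = join '', map { (qw(A C G T))[int(rand(4))] } 1 .. 1000;
my ($kept, $leftover) = excise_fragments($seq);
printf "%d fragments of 48..52 nt, %d shorter leftovers\n",
    scalar(@$kept), scalar(@$leftover);
```

The kept fragments are all within 48..52 nt, but the flanking leftovers can be arbitrarily short, so they have to be discarded or handled separately if they are not wanted.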

With this approach I can't put a constraint on fragment length: the leftover fragments could be as short as 1 nt or 2 nt.

0
3.6 years ago by
United States
dylan.storey60 wrote:

If you're concerned about performance, use unpack instead of substr.
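
As a rough illustration of what that might look like (the toy sequence and the list of lengths are made up for the example), `unpack` can split a string into fixed-size or variable-size pieces in one call. The `a` format is used here rather than `A`, since `A` strips trailing spaces and nulls.

```perl
use strict;
use warnings;

my $seq = 'ACGT' x 50;    # 200 nt toy sequence (assumption, for illustration)

# Fixed-size chunks: "(a50)*" repeats the 50-byte field until the string ends.
my @fixed = unpack '(a50)*', $seq;

# Variable-size chunks: build one "aN" field per wanted length.
my @lengths  = (48, 52, 50, 50);
my $template = join ' ', map { "a$_" } @lengths;
my @variable = unpack $template, $seq;

printf "%d fixed chunks, %d variable chunks\n",
    scalar(@fixed), scalar(@variable);
```

Whether this is actually faster than a loop of `substr` calls depends on the data and the Perl version, so it is worth benchmarking on real input before committing to it.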