5.5 years ago by
- I'm no statistician but the chance is probably very high;
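- 4^4 = 256, so a random 4-letter string has about a 1 in 256 chance of matching at any given position. Thinking in terms of k-mers: a sequence of length N contains N-3 overlapping 4-mers, and each one has a 1 in 256 chance of being your random string. So you expect roughly (N-3)/256 hits, and the chance of getting at least one hit is about 1 - (255/256)^(N-3), which is already close to 1 once N reaches a thousand bases or so.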
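- With a 1 in 256 chance per position, a naive estimate of the probability to map uniquely is 1 - ((N-3)/256), which goes negative for any realistic N. That just reflects the fact that the string will almost definitely map to multiple locations; for a genome-sized N the real probability of exactly one hit is effectively zero.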
- Reads are normally at least 100 bp (although there are sequencers that go down to 28 bp), so there are 4^100 ≈ 1.6E+60 possible sequences, and the chance of a read matching a given position purely by chance is about 1 in 1.6E+60. The reverse read will not increase the probability of the first read mapping uniquely. However, you now have two paired reads that should map close together (depending on your insert size) and should each map only once. So it does not increase the probability of the first read mapping uniquely, but it increases the confidence with which you can say it mapped uniquely.
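To put rough numbers on the estimates above, here is a minimal Python sketch. It assumes a uniformly random reference (each base with probability 1/4) and treats every position as independent, which is not strictly true for overlapping k-mers, so take it as an approximation. The function name `kmer_hit_stats` and its parameters are made up for this illustration.

```python
import math

def kmer_hit_stats(ref_len, k):
    """Approximate hit statistics for one specific k-mer in a random reference.

    Assumes uniform base frequencies and independent positions
    (a simplification: overlapping k-mers are not truly independent).
    Returns (expected_hits, p_at_least_one, p_exactly_one).
    """
    positions = ref_len - k + 1          # number of k-mer start positions
    if positions <= 0:
        return 0.0, 0.0, 0.0
    p = 0.25 ** k                        # chance of a match at one position
    expected_hits = positions * p
    # log1p/expm1 keep precision when p is tiny (e.g. k = 100)
    log_miss = math.log1p(-p)            # log of (1 - p)
    p_at_least_one = -math.expm1(positions * log_miss)
    p_exactly_one = positions * p * math.exp((positions - 1) * log_miss)
    return expected_hits, p_at_least_one, p_exactly_one

if __name__ == "__main__":
    # 4-letter string against a 1000 bp sequence
    print(kmer_hit_stats(1000, 4))
    # 100 bp read against a 3 Gbp reference (hypothetical size)
    print(kmer_hit_stats(3_000_000_000, 100))
```

For a 4-mer against 1000 bp this gives about 3.9 expected hits, a chance of about 0.98 of at least one hit, and only about 0.08 of exactly one hit. For a 100 bp read the chance of a purely random match is vanishingly small even against a reference of billions of bases.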
Like I said, this is just a super basic guesstimate. A lot of other factors play a real role: if your 4-letter random string contains more GCs, the probability of mapping will change, and repeats in the reference could affect it too.
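As a rough illustration of the GC effect, the per-position match probability can be computed from base frequencies instead of a flat 1/4. This is only a sketch: it assumes bases are independent, that G and C are equally frequent (and likewise A and T), and the function name `match_prob` and the GC fraction of 0.4 are hypothetical values picked for the example.

```python
def match_prob(query, gc_fraction):
    """Chance that `query` matches at one position of a reference
    whose bases are drawn independently with the given GC fraction."""
    freq = {
        "G": gc_fraction / 2,
        "C": gc_fraction / 2,
        "A": (1 - gc_fraction) / 2,
        "T": (1 - gc_fraction) / 2,
    }
    p = 1.0
    for base in query.upper():
        p *= freq[base]                  # multiply per-base probabilities
    return p

if __name__ == "__main__":
    gc = 0.4  # hypothetical GC fraction of the reference
    for query in ("GCGC", "ATAT", "ACGT"):
        print(query, match_prob(query, gc))
```

With a GC fraction below 0.5, a GC-rich string is less likely to match at any given position than an AT-rich one; at exactly 0.5 every 4-letter string falls back to 1 in 256.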