selection sweep. Sliding window
1
2
Entering edit mode
9.9 years ago
somakina ▴ 40

Hi

I will like to identify selection sweep using maf, can anyone help me with r script that can help me to pick up 7 contiguous snp with maf of equal or less than 0.01. (basic a sliding window to identify string of snp with less than or equal to 0.01 maf. My data look like this

Snp id              bta                 position                  maf

Any help will highly be appreciated thanks

sliding-window R selection-signature • 2.7k views
ADD COMMENT
0
Entering edit mode

I am not sure what the finding of 7 or whatever number of contigous low maf SNPs would indicate genetically. Can you explain why this observation (why 7?) would be relevant?

ADD REPLY
1
Entering edit mode

Thanks for your response. The number 7 is determined by the number monomorphic loci in the genome For example, if 15% of SNPs are monomorphic within a breed the probability that N contiguous SNPs are monomorphic is 0.15N, assuming independence, and in testing 52,942 SNP on 29 autosomes we would expect to find 0.15N x (52,942 - 29 x (N - 1)) regions in which N contiguous SNPs had fixed alleles. For N = 5 this corresponds to 4.0 false positives per breed but only 0.6 false positives per breed when N = 6. (Ramey et al 2013), thus number 7 in my case was because about 20 percent of snp are monomorphic in my breed.

So the Number of 7 contiguous loci spanning at least 200 kb and with a minor allele frequency ≤ 0.01 will be required to declare a selective sweep region in each breed.

thanks

ADD REPLY
0
Entering edit mode

Thank you for the detailed explanation. In this case the run length code below should work, it just needs a last filtering step to check whether each run covers a large enough region (comparing: pos(run-end) - pos(run-start) >= 200 kb).

ADD REPLY
0
Entering edit mode

Hi

Thanks for your response, however I have loaded the Iranges packages but when I run the runLenght function I get an error message that say unable to find an inherited method for function 'runLenght' for signature rle. Would you please assist me in running the runLenght package

Thanks in advance

ADD REPLY
0
Entering edit mode

Hi, it is 'runLength' not runLenght. Could you copy paste the source code I posted? This code is tested with R 3.0.1

ADD REPLY
0
Entering edit mode
snp                    bta    pos        maf
ARS-BFGL-NGS-19289     1      305793     0.00
ARS-BFGL-BAC-31497     1      905632     0.00
ARS-BFGL-BAC-36895     1      1078256    0.00
ARS-BFGL-NGS-87636     1      1603944    0.00
ARS-BFGL-BAC-754       1      1712261    0.00
ARS-BFGL-BAC-15552     1      1805584    0.00
ARS-BFGL-BAC-16253     1      1953465    0.00
ARS-BFGL-BAC-11152     1      2013659    0.00
ARS-BFGL-BAC-14982     1      2089647    0.00
BTB-00001299           1      2942947    0.00
ARS-BFGL-NGS-75879     1      3322254    0.00
ARS-BFGL-NGS-107481    2      3343059    0.00
BTA-35598-no-rs        2      5712202    0.00
BTB-01560572           2      5821018    0.00
BTB-01560539           2      5849779    0.00
ADD REPLY
0
Entering edit mode

Sure, could you please provide an example of your data with chromosome information, is there simply a chrom column added to the table? Please use 'edit' function to insert the example.

ADD REPLY
1
Entering edit mode
9.9 years ago
Michael 54k
## toy data with two runs of 7 mafs < 0.01:
df = data.framesnp.id=paste0("SNP",1:20), 
bta="buff", pos=1:20*100,maf=c(rep(0.009,7), 
rep(0.1,3),rep(0.0001,7), rep(0.02,3)))

df = df[order(df$pos),] # order data frame by position
maf0.01 = Rle(df$maf <= 0.01) # use run length encoding (Rle) to detect runs
rind = runLength(maf0.01) >= 7 & runValue(maf0.01) # select appropriate runs (len >= 7 and TRUE)
imat = cbind(start=start(maf0.01)[rind],end=end(maf0.01)[rind]) # index matrix for apply
apply(imat, 1, function(x) df[x[1]:x[2],]) # extract the selected runs from input data

output:


[[1]]
  snp.id  bta pos   maf
1   SNP1 buff 100 0.009
2   SNP2 buff 200 0.009
3   SNP3 buff 300 0.009
4   SNP4 buff 400 0.009
5   SNP5 buff 500 0.009
6   SNP6 buff 600 0.009
7   SNP7 buff 700 0.009

[[2]]
   snp.id  bta  pos   maf
11  SNP11 buff 1100 1e-04
12  SNP12 buff 1200 1e-04
13  SNP13 buff 1300 1e-04
14  SNP14 buff 1400 1e-04
15  SNP15 buff 1500 1e-04
16  SNP16 buff 1600 1e-04
17  SNP17 buff 1700 1e-04
ADD COMMENT

Login before adding your answer.

Traffic: 2670 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6