4.1 years ago
serpalma.v ▴ 70
I evaluated genome-wide LD decay with 'PopLDdecay', using SNPs separated by at least 10 kb.
In the figure, each dot represents the mean r^2 of all pairs of SNPs separated by a given distance. The maximum distance corresponds to the length of the longest chromosome.
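To make the "mean r^2 per distance" idea concrete, here is a minimal sketch of how such points could be computed. It is not what PopLDdecay does internally. It assumes unphased genotypes coded as alternate-allele dosages (0/1/2) and uses the squared genotype correlation as the r^2 estimate. The function names, the bin size and the toy data are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two dosage vectors (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # monomorphic site: r^2 is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def mean_r2_by_distance(positions, genotypes, bin_size):
    """Average r^2 over all SNP pairs, grouped into distance bins."""
    bins = defaultdict(list)
    for i, j in combinations(range(len(positions)), 2):
        r2 = r_squared(genotypes[i], genotypes[j])
        if r2 is None:
            continue
        dist = abs(positions[j] - positions[i])
        bins[dist // bin_size].append(r2)
    # Key each bin by its lower bound in base pairs.
    return {b * bin_size: mean(v) for b, v in sorted(bins.items())}


# Toy data (hypothetical): three SNPs on one chromosome, four samples.
positions = [1000, 12000, 25000]
genotypes = [
    [0, 0, 2, 2],
    [0, 0, 2, 2],  # identical to the first SNP, so r^2 = 1
    [0, 2, 0, 2],  # uncorrelated with the other two, so r^2 = 0
]
decay = mean_r2_by_distance(positions, genotypes, bin_size=10000)
print(decay)
```

With only four samples, two SNPs reach r^2 = 1 as soon as their genotype vectors happen to match, however far apart they are.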
The trend (generalized additive model) shows a decay in LD, as expected. However, there are many pairs of sites in full LD (r^2 = 1) even when they are very far apart on the chromosome.
Since this is a mouse model, with an expected degree of inbreeding, could this explain the pattern described above?
Thanks in advance!
Mmmm... It is hard to understand what is going on. Theoretically, inbreeding will result in an increase of LD. However, let me say that your levels of LD are REALLY high! I don't know if you would see such a level of LD even in a very large full-sib progeny. For example, you have a large number of points in which LD is above 0.5 at a distance of 100 Mb, which is quite uncommon. I have never worked with highly inbred individuals, but I suggest you carefully check your data for inconsistencies or other problems. I am sorry I cannot help more!
Thank you for the input
I went through the steps I applied to the data and they seem to be in order. It was suggested that I make sure all non-polymorphic sites were removed (all samples 0/0 or 1/1), but the picture remains almost the same.
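For reference, the non-polymorphic check described above can be sketched as follows. This is only an illustration, assuming VCF-style genotype strings ("0/0", "0/1", "1/1", "./." for missing). The site names and the helper `is_polymorphic` are hypothetical and not part of any tool mentioned in this thread.

```python
def is_polymorphic(calls):
    """True if the called genotypes at a site carry more than one allele."""
    alleles = set()
    for call in calls:
        # Treat phased ("|") and unphased ("/") separators the same way.
        for allele in call.replace("|", "/").split("/"):
            if allele != ".":  # skip missing alleles
                alleles.add(allele)
    return len(alleles) > 1


# Toy data (hypothetical): one genotype list per site.
sites = {
    "snp1": ["0/0", "0/0", "0/0"],  # fixed for the reference allele
    "snp2": ["1/1", "1/1", "./."],  # fixed for the alternate allele
    "snp3": ["0/0", "0/1", "1/1"],  # segregating
}
kept = [name for name, calls in sites.items() if is_polymorphic(calls)]
print(kept)
```

Sites fixed for either allele have zero variance, so r^2 is undefined for them and they cannot contribute to the LD estimate.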
Previously I ran PCA, ADMIXTURE, and hierarchical clustering (UPGMA), and all samples grouped as expected (each mouse line has been selected for over 180 generations, while trying to minimize inbreeding). That gives me some confidence that the analysis was OK thus far.
Would it be reasonable to use only SNPs that fall within a certain MAF range?
Yes, that would be reasonable. Usually people remove SNPs with low minor allele frequency from the analysis. Although in my experience SNPs with low MAF decrease the estimates of LD, you might try removing all SNPs with MAF lower than 0.1 (or another threshold). Also, you could use a tool to measure IBD and/or relatedness between subjects; you could try PLINK or SNPRelate (the latter is an R package). At least that way you can try to better understand the reason for this behavior.
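A minimal sketch of the MAF filter suggested above, in case it helps to see the arithmetic. It assumes biallelic SNPs coded as alternate-allele dosages (0/1/2) with `None` for missing calls. The threshold of 0.1 comes from the suggestion above, while the function name and the toy data are hypothetical.

```python
def minor_allele_frequency(dosages):
    """MAF of a biallelic site from 0/1/2 dosages; None marks a missing call."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0  # no data: treat as uninformative
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid sample
    return min(alt_freq, 1 - alt_freq)


MAF_THRESHOLD = 0.1  # sites with MAF below this are dropped

# Toy data (hypothetical): one dosage list per site.
sites = {
    "snp1": [0, 0, 0, 1, 1],                 # alt freq 0.2
    "snp2": [0, 1, 2, 1, None],              # alt freq 0.5, one missing call
    "snp3": [2, 2, 2, 2, 2, 2, 2, 2, 2, 1],  # alt freq 0.95, so MAF is about 0.05
}
kept = [
    name
    for name, dosages in sites.items()
    if minor_allele_frequency(dosages) >= MAF_THRESHOLD
]
print(kept)
```

It would be worth re-running the LD decay at a couple of thresholds (for example 0.05 and 0.1) to see whether the long-range r^2 = 1 points are sensitive to the cutoff.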