Question: Does sample size matter when calculating linkage disequilibrium using RAD-Seq loci?
14 months ago by
j.mcintyre.10 wrote:

Hi, I am fairly new to the field of bioinformatics and genetics! I've been analysing some ddRAD-Seq data using Stacks and want to compute the linkage disequilibrium between my SNPs. I've used VCFtools to do this. However, I've noticed that different loci have different numbers of individuals included in the analysis.

This is because I asked Stacks to keep only loci present in a minimum of 80% of individuals. That means up to 20% of individuals may not be represented at any given locus. So, for example, individual A might have sequenced reads for locus (i) but not locus (ii), and so is not included in the linkage disequilibrium calculation for these two loci.
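
To make the situation concrete, here is a minimal sketch of what dropping individuals per pair of loci looks like. It assumes LD is measured as the squared correlation between genotype dosages coded 0/1/2, and that any individual missing at either locus is left out for that pair. The function name `pairwise_r2` and the genotype lists are hypothetical illustrations, not the internals of any particular program.

```python
def pairwise_r2(geno_a, geno_b):
    """Squared Pearson correlation between genotype dosages (0, 1, 2) at two loci.

    None marks a missing genotype. Only individuals genotyped at BOTH loci
    are used, so the effective sample size can differ for every pair of loci.
    Returns (r2, n_used); r2 is None if it cannot be computed.
    """
    # Keep only individuals with a call at both loci (pairwise-complete cases).
    pairs = [(a, b) for a, b in zip(geno_a, geno_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None, n

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)

    # A monomorphic locus (within the retained individuals) has no variance.
    if var_a == 0 or var_b == 0:
        return None, n
    return cov * cov / (var_a * var_b), n


# Hypothetical genotypes for 10 individuals; two are missing at locus (ii).
locus_i = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
locus_ii = [0, 1, 2, None, 0, 2, None, 0, 2, 1]

r2, n_used = pairwise_r2(locus_i, locus_ii)
print(f"individuals used: {n_used} of {len(locus_i)}, r^2 = {r2}")
```

Returning `n_used` alongside the estimate makes it easy to see how many individuals each pairwise value actually rests on.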

What I can't find out is whether this matters.

If it does, do I need to select only loci represented in 100% of individuals, or is there a program which can accommodate the different numbers of individuals used for each locus?

Any help would be really appreciated! Thanks! Jenni

