Hi to all,
I would like to calculate pairwise LD between two given genomic loci (specified by coordinates, not rsIDs).
Is it possible?
If you don't have any data, you could download genotypes from the 1000 Genomes Project with tabix (http://www.internationalgenome.org/category/tabix/) and then use Haploview to calculate the LD.
Thank you for your reply.
According to the vcftools documentation (https://vcftools.github.io/documentation.html#ld), pairwise LD can be calculated with the arguments below:
./vcftools --vcf input_data.vcf --hap-r2 --ld-window-bp 50000 --out ld_window_50000
If I get it right, input_data.vcf contains the genomic coordinates of interest.
But for which population does it calculate LD?
And do I need to download any data so that vcftools will utilize it during LD calculation?
I could not understand this part.
vcftools --gzvcf ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --chr 5 --from-bp 1000000 --to-bp 1100000 --out chr5_analysis --keep Samples.txt --hap-r2
You can use VCF files from the 1000 Genomes Project (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). However, these files contain all subjects from the 1000 Genomes Project, so you need to create a list of the samples you want to use (Samples.txt in the command above, passed with --keep).
Choose your samples from the file integrated_call_samples_v3.20130502.ALL.panel available with the data.
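As a minimal sketch of how Samples.txt could be built, the helper below pulls the sample IDs for one population out of the panel file. The function name make_sample_list is hypothetical, and the sketch assumes the panel is tab-separated with a header row and the columns sample, pop, super_pop, gender; check your copy of the file before relying on it.

```shell
#!/bin/sh
# Write the IDs of all samples from one population to a file, one per line.
# Assumption: the panel is tab-separated, has a header row, and its first
# two columns are the sample ID and the population code.
make_sample_list() {
    panel=$1   # e.g. integrated_call_samples_v3.20130502.ALL.panel
    pop=$2     # population code expected in column 2, e.g. CEU
    out=$3     # output file to pass to vcftools via --keep
    # Skip the header (NR > 1) and print column 1 where column 2 matches.
    awk -F '\t' -v pop="$pop" 'NR > 1 && $2 == pop { print $1 }' "$panel" > "$out"
}
```

For example, make_sample_list integrated_call_samples_v3.20130502.ALL.panel CEU Samples.txt would write the CEU sample IDs to Samples.txt, which is the format --keep expects (one individual ID per line).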
Do you mean that
vcftools will use "ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz" file in order to calculate LD and
vcftools will calculate LD for all genomic coordinate pairs in --chr 5 --from-bp 1000000 --to-bp 1100000?
If so, I would like to provide the genomic positions in a file instead of "--chr 5 --from-bp 1000000 --to-bp 1100000". Is that possible?
And is there a way to calculate LD from WES data from parents and their child?
Or does it have to consist of a lot of samples?
1- Input file : If you have your own data (which you should always specify when you ask a question), replace "ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz" with your vcf file.
2- LD : Yes, vcftools will calculate the LD for all coordinate pairs in the region. If you want the LD between two specific variants, create a VCF file containing only those variants and run the same command on it; the --chr, --from-bp and --to-bp options are then unnecessary.
3- Parents and Child : No idea
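Regarding point 2 and the earlier question about supplying positions in a file: vcftools also has a --positions option, which takes a tab-separated file of chromosome and position and keeps only those sites. The sketch below uses made-up coordinates and the output prefix two_loci_ld, both purely illustrative. The vcftools call is guarded so the snippet does nothing where vcftools or the input files are missing.

```shell
#!/bin/sh
# Hypothetical coordinates, for illustration only: chromosome and 1-based
# position, tab-separated, one site per line.
printf '5\t1000500\n5\t1050000\n' > positions.txt

# Restrict the LD calculation to the listed sites and the chosen samples.
# Guarded: skipped entirely if vcftools or either input file is absent.
vcf=ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
if command -v vcftools >/dev/null 2>&1 && [ -f "$vcf" ] && [ -f Samples.txt ]; then
    vcftools --gzvcf "$vcf" --positions positions.txt --keep Samples.txt \
        --hap-r2 --out two_loci_ld
fi
```

With --hap-r2 the results should end up in two_loci_ld.hap.ld. Only positions that are actually present as variant records in the VCF can be reported, so a coordinate with no record in the file produces no LD value.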
Yes, my main question right now is: "Can we give data from parents and their child, in VCF format, as the input file to vcftools and calculate LD for various genomic loci from it?"