Question about ROH analysis by Plink 1.9
1
4
Entering edit mode
22 months ago
60343011s ▴ 40

Hi all,

I have recently tried to estimate runs of homozygosity (ROH) from my vcf file by using plink 1.9.

I ran following code to generate binary files that plink required:

plink --vcf myfile.vcf --make-bed --out out_name --no-sex --no-parents --no-fid --no-pheno --allow-extra-chr

This vcf file only contains one individual and about 3 million SNPs.

I used --allow-extra-chr here because I mapped my sequences to a drift genome.

Then, I used following code (with default parameters), trying to estimate ROH of my sample:

plink -bfile out_name --homozyg --allow-extra-chr

The result gave me 0 ROH, and only header produced in .hom file.

I also tried different parameters with different SNP windows and criterions, such like:

plink -bfile out_name --homozyg --homozyg-window-snp 50 --homozyg-snp 50 --homozyg-window-missing 3 --homozyg-kb 100 --homozyg-density 1000 --allow-extra-chr

However, all the results were the same, that showed :

PLINK v1.90b6.12 64-bit (28 Oct 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:

  --allow-extra-chr
  --bfile out_name
  --homozyg
  --homozyg-density 1000
  --homozyg-kb 100
  --homozyg-snp 50
  --homozyg-window-missing 3
  --homozyg-window-snp 50

515905 MB RAM detected; reserving 257952 MB for main workspace.
3708761 variants loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
3708761 variants and 1 person pass filters and QC.
Note: No phenotypes present.
--homozyg: Scan complete, found 0 ROH.

Results saved to plink.hom + plink.hom.indiv + plink.hom.summary .

Does anyone have idea why this happened to my files?

Will be grateful for any suggestions.

snp next-gen plink • 2.2k views
ADD COMMENT
0
Entering edit mode

Heys, I'm struggling to having results from plink when I try to calculate the roh in my data. I did the snp calling both with gatk and with the code you left here (with --gvcf) but when I run plink, I keep getting the same message: --homozyg: Scan complete, found 0 ROH.

Could you upload how your final vcf looks like? I'm not sure how the homozygous positions should look like.

Any help would be appreciated!!

ADD REPLY
6
Entering edit mode
22 months ago

This is unsurprising if your VCF contains only positions where your sample differs from the reference genome; practically all the ROHs will span regions which are excluded by your VCF. You may need to reconstruct your VCF in a way that includes homozygous-REF calls in an unbiased manner.

ADD COMMENT
0
Entering edit mode

Yes, my VCF file contain only genotype 0/1 sites. I called those sites by

bcftools mpileup -a AD,DP -q 20 -f ref.fa input.bam | bcftools call -V indels -m -v -O v -o myfile.vcf

So I need to recall all the sites (0/0,0/1,1/1) by function like --gvcf under bcftools, right?

ADD REPLY
0
Entering edit mode

Just to update the results, I've successfully used a VCF file that contain all sites for ROH analysis. Thanks!

ADD REPLY
0
Entering edit mode

Hey, I have recently faced a similar question with you, When I tried to test the ROH analysis by using PLINK 1.9, the log file shows "Note: No phenotypes present", because my VCF files have a lot of "./.", I called the sites by samtools mpileup and bcftools, so could you please tell me how could I solve it? Adding "--gvcf" function? Thanks!

ADD REPLY

Login before adding your answer.

Traffic: 1464 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6