2
5
Entering edit mode
2.6 years ago
60343011s ▴ 50

Hi all,

I have recently tried to estimate runs of homozygosity (ROH) from my vcf file by using plink 1.9.

I ran following code to generate binary files that plink required:

plink --vcf myfile.vcf --make-bed --out out_name --no-sex --no-parents --no-fid --no-pheno --allow-extra-chr


This vcf file only contains one individual and about 3 million SNPs.

I used --allow-extra-chr here because I mapped my sequences to a drift genome.

Then, I used following code (with default parameters), trying to estimate ROH of my sample:

plink -bfile out_name --homozyg --allow-extra-chr


The result gave me 0 ROH, and only header produced in .hom file.

I also tried different parameters with different SNP windows and criterions, such like:

plink -bfile out_name --homozyg --homozyg-window-snp 50 --homozyg-snp 50 --homozyg-window-missing 3 --homozyg-kb 100 --homozyg-density 1000 --allow-extra-chr


However, all the results were the same, that showed :

PLINK v1.90b6.12 64-bit (28 Oct 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Options in effect:

--allow-extra-chr
--bfile out_name
--homozyg
--homozyg-density 1000
--homozyg-kb 100
--homozyg-snp 50
--homozyg-window-missing 3
--homozyg-window-snp 50

515905 MB RAM detected; reserving 257952 MB for main workspace.
3708761 variants loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to plink.nosex .
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
3708761 variants and 1 person pass filters and QC.
Note: No phenotypes present.
--homozyg: Scan complete, found 0 ROH.



Does anyone have idea why this happened to my files?

Will be grateful for any suggestions.

snp next-gen plink • 4.0k views
0
Entering edit mode

Heys, I'm struggling to having results from plink when I try to calculate the roh in my data. I did the snp calling both with gatk and with the code you left here (with --gvcf) but when I run plink, I keep getting the same message: --homozyg: Scan complete, found 0 ROH.

Could you upload how your final vcf looks like? I'm not sure how the homozygous positions should look like.

Any help would be appreciated!!

0
Entering edit mode

Hi Gubrins...Have you solved your problem. I am facing the same issue with plink v1.9. Could you please explain, how to resolve this issue.

0
Entering edit mode

Heys Adarsh, I'm sorry but I did not manage to solve it. I changed to bcftools and I'm calculating the ROHs there (https://samtools.github.io/bcftools/howtos/roh-calling.html). Good luck!

0
Entering edit mode

Thank you for good suggestion.

7
Entering edit mode
2.6 years ago

This is unsurprising if your VCF contains only positions where your sample differs from the reference genome; practically all the ROHs will span regions which are excluded by your VCF. You may need to reconstruct your VCF in a way that includes homozygous-REF calls in an unbiased manner.

1
Entering edit mode

Hi Chang, I used Haplotypecaller, CombinedGVCF, GenotypeGVCF, selectvariant and hard filtration to generate vcf files. My filtrated vcf files contains only 0/1 genotypes. How can I add 0/0 and 1/1 sites in it. Thanks

0
Entering edit mode

Hi, I think the answer could be here: Calculating ROHs and Heterozygosity from GATK

If I have well understood, the following flags can output all positions in your data set rather than just variants from the reference genome, and therefore enable the acquisition of ROH.

Haplotypecaller  -ERC BP_RESOLUTION
GenotypeGVCFs --included-non-variant-sites true

0
Entering edit mode

Yes, my VCF file contain only genotype 0/1 sites. I called those sites by

bcftools mpileup -a AD,DP -q 20 -f ref.fa input.bam | bcftools call -V indels -m -v -O v -o myfile.vcf


So I need to recall all the sites (0/0,0/1,1/1) by function like --gvcf under bcftools, right?

0
Entering edit mode

Just to update the results, I've successfully used a VCF file that contain all sites for ROH analysis. Thanks!

0
Entering edit mode

Hey, I have recently faced a similar question with you, When I tried to test the ROH analysis by using PLINK 1.9, the log file shows "Note: No phenotypes present", because my VCF files have a lot of "./.", I called the sites by samtools mpileup and bcftools, so could you please tell me how could I solve it? Adding "--gvcf" function? Thanks!

0
Entering edit mode

Hi .. Could you please explain me about this. How could you do that with gVCF file. I used Haplotypecaller, CombinedGVCF, GenotypeGVCF, selectvariant and hard filtration. My filtrated vcf files contains only 0/1 genotypes. How can I add 0/0 and 1/1 sites in it.

1
Entering edit mode
25 days ago
yzliu01 ▴ 10

Just in case it's still a common issue for others, I got the same issue and it worked if you down optimize the flag "--homozyg-kb 500", for example. It seems not to be necessary to reconstruct vcf files with something like 'Haplotypecaller -ERC BP_RESOLUTION '. Hope it's helpful.

0
Entering edit mode

Thanks Honey_man, I have used the same setting to lower the length threshold for which Plink consider a Run of Homozygosity.

Thanks to mention the 'Haplotypecaller -ERC BP_RESOLUTION ' or 'GVCF' detail, it has been a puzzle for me recently. I was wondering if Plink needs the non-variant reference positions or blocks.