Vcftools Runs Of Homozygosity Option
12.0 years ago
Rubal7 ▴ 830

Hi all,

Does anybody understand how the --LROH option works in VCFtools? It is designed to detect long runs of homozygosity, which is exactly what I want to do, but without knowing more about its underlying assumptions I am hesitant to use it. There is also no clear explanation of how to interpret the output files. Any help or pointers would be greatly appreciated. Thank you in advance.

Cheers,

Rubal

vcf • 7.3k views

Hi,

I'm running this command:

vcftools --LROH --gzvcf file.vcf.gz --out file --chr 1

and getting this log on the screen:

VCFtools - v0.1.13
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
    --gzvcf 0200092201_S3_L002.recalibrate_BOTH.vcf.gz
    --chr 1
    --LROH
    --out /gpfs/projects/afadda/0200092201-chr1

Using zlib version: 1.2.3
Versions of zlib >= 1.2.4 will be *much* faster when reading zipped VCF files.
After filtering, kept 1 out of 1 Individuals
Outputting Long Runs of Homozygosity (Experimental)... 
    0200092201_S3_L002
After filtering, kept 50811 out of a possible 579645 Sites
Run Time = 8.00 seconds

But the output file contains only the header line. I also get only the header if I use --stdout:

CHROM   AUTO_START  AUTO_END    MIN_START   MAX_END N_VARIANTS_BETWEEN_MAX_BOUNDARIES   N_MISMATCHES    INDV

I don't know what the problem is.

Thanks!


Hi,

I am having the same problem. After running the --LROH command, I get only the header line in the output file. Can someone please explain this?

Thank you

Best
Rangi


Hello! I too am running into the same issue! Can anyone explain this?!!

Thank you!


Depending on your chromosome names, it might help to put quotes around them, as in --chr "chromname". I hope that helps!
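Whether a chromosome-name mismatch explains the header-only output is not established in this thread, but it is cheap to rule out. The following is a minimal sketch, assuming a plain or gzipped VCF, that counts records per CHROM value so the exact spelling (for example "1" versus "chr1") can be compared with the value passed to --chr. The function name chrom_counts is made up for illustration and is not part of VCFtools.

```python
import gzip
import sys
from collections import Counter


def chrom_counts(vcf_path):
    """Count data records per CHROM value in a plain or gzipped VCF.

    Hypothetical helper for illustration; not part of VCFtools.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            chrom = line.split("\t", 1)[0]
            counts[chrom] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python chrom_counts.py file.vcf.gz
    for chrom, n in chrom_counts(sys.argv[1]).items():
        print(chrom, n, sep="\t")
```

If the name given to --chr does not appear in this list exactly as typed, no sites on that chromosome would be expected to pass the filter.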

12.0 years ago
Adam ★ 1.0k

The LROH option implements the algorithm described in Auton et al., Genome Research, 2009, although it uses the forward-backward algorithm in place of Viterbi. Unfortunately, the option in VCFtools is very experimental at the moment and isn't really ready for "prime time" (hence it is listed under "Options still in development"). Specifically, the option is very slow on any dataset of reasonable size, so I don't recommend it for regular use.

Regards, Adam
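For readers unfamiliar with the forward-backward algorithm, here is a generic sketch on a toy two-state hidden Markov model (state 0 = not autozygous, state 1 = autozygous; observation 0 = homozygous site, 1 = heterozygous site). This is not the VCFtools implementation and not the model from Auton et al.: the states, the transition and emission probabilities, and the function names are all assumptions chosen only to show how per-site posterior probabilities are obtained, as opposed to the single best path that Viterbi returns.

```python
def forward_backward(obs, start, trans, emit):
    """Per-site posterior state probabilities for a discrete HMM.

    obs   : list of observation indices
    start : start[s]    = P(first state is s)
    trans : trans[s][t] = P(next state is t | current state is s)
    emit  : emit[s][o]  = P(observation o | state s)
    Each pass is rescaled at every step to avoid numerical underflow.
    """
    if not obs:
        return []
    n_states, n = len(start), len(obs)

    # Forward pass: P(state at i | observations up to i), normalised.
    cur = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = sum(cur)
    fwd = [[c / total for c in cur]]
    for i in range(1, n):
        prev = fwd[-1]
        cur = [emit[t][obs[i]] * sum(prev[s] * trans[s][t] for s in range(n_states))
               for t in range(n_states)]
        total = sum(cur)
        fwd.append([c / total for c in cur])

    # Backward pass: likelihood of later observations given the state at i.
    bwd = [[1.0] * n_states for _ in range(n)]
    for i in range(n - 2, -1, -1):
        cur = [sum(trans[s][t] * emit[t][obs[i + 1]] * bwd[i + 1][t]
                   for t in range(n_states))
               for s in range(n_states)]
        total = sum(cur)
        bwd[i] = [c / total for c in cur]

    # Combine both passes and renormalise at each site.
    posteriors = []
    for f, b in zip(fwd, bwd):
        raw = [fs * bs for fs, bs in zip(f, b)]
        total = sum(raw)
        posteriors.append([r / total for r in raw])
    return posteriors


def runs_above(posteriors, state, threshold):
    """Inclusive (start, end) index pairs where P(state) >= threshold."""
    runs, begin = [], None
    for i, row in enumerate(posteriors):
        if row[state] >= threshold:
            if begin is None:
                begin = i
        elif begin is not None:
            runs.append((begin, i - 1))
            begin = None
    if begin is not None:
        runs.append((begin, len(posteriors) - 1))
    return runs


# Toy data: 40 homozygous sites followed by 5 heterozygous sites.
# All probabilities below are made-up illustrative values.
obs = [0] * 40 + [1] * 5
start = [0.5, 0.5]
trans = [[0.99, 0.01], [0.01, 0.99]]
emit = [[0.7, 0.3],       # not autozygous: heterozygous sites are common
        [0.999, 0.001]]   # autozygous: heterozygous only through error
post = forward_backward(obs, start, trans, emit)
```

In this toy setup, runs_above(post, 1, 0.99) lists the index ranges where the posterior probability of the autozygous state reaches 0.99, which is the kind of thresholding a posterior-based caller can apply.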


thanks for this!


Thanks Adam. I have tested the LROH option on some non-human data and got some interesting results that I want to investigate further. Is there any reason to believe that running the LROH option would be problematic for mouse or rat genomes, for instance if it incorporates assumptions about recombination rate based on a human reference? I just want to check, because it's not clear exactly how VCFtools implements the Auton et al. algorithm. Best, Rubal


What does your command line look like? I can't seem to get any output data using "vcftools --vcf sorted.vcf --LROH --chr chr1".


If I have interpreted the Auton et al. 2009 paper correctly, the algorithm will find ROH at least 1 cM in length that contain at least 50 SNPs. Is the number of SNPs cumulative across individuals, such that 50 individuals could each have a single (different) SNP in this region? This is the only explanation that comes to mind for my output file, which looks like this (truncated, of course):

CHROM   AUTO_START      AUTO_END        N_VARIANTS      INDV
chr6    3192774 7941907 85      M103
chr6    32714130        32714172        1       M103
chr6    32965687        33317372        9       M103
chr6    35146229        35975515        54      M103
chr6    18166887        18166904        8       M105r
chr6    18166908        18526735        19      M105
chr6    20216335        20216336        1       M113  ## this line with the small interval
chr6    20216340        20216341        1       M113
chr6    20216343        20216345        1       M113
chr6    20216350        20216356        1       M113

It's not clear to me why some of the regions (e.g., the line with the comment) are so small.

Thanks,
Loren


The option in VCFtools just outputs all regions where the probability of autozygosity is about 0.99. It is left to the user to filter these events as required.

I've actually made a few changes to the LROH option in the past couple of days, so the function should be a little more informative (and run slightly more efficiently). These changes are now available in the SVN version.
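As an illustration of that user-side filtering, here is a minimal sketch that reads LROH output and keeps only regions above a minimum physical length and variant count. The thresholds (500 kb, 50 variants) are arbitrary placeholders, the inclusive length calculation is an assumption about how AUTO_START and AUTO_END should be read, and the function names are made up. The sample rows are taken from the output posted earlier in this thread; the variant-count column is looked up under either header name seen in the thread.

```python
def read_lroh(lines):
    """Parse whitespace-delimited LROH output into a list of dicts."""
    header, rows = None, []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-empty line is the column header
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def filter_lroh(rows, min_length_bp=0, min_variants=0):
    """Keep regions meeting both thresholds; adds a LENGTH_BP field."""
    kept = []
    for row in rows:
        # Assumption: coordinates are inclusive, hence the +1.
        length = int(row["AUTO_END"]) - int(row["AUTO_START"]) + 1
        # The column is N_VARIANTS in the older output shown in this thread
        # and N_VARIANTS_BETWEEN_MAX_BOUNDARIES in the newer header.
        key = ("N_VARIANTS" if "N_VARIANTS" in row
               else "N_VARIANTS_BETWEEN_MAX_BOUNDARIES")
        if length >= min_length_bp and int(row[key]) >= min_variants:
            kept.append({**row, "LENGTH_BP": length})
    return kept


# Rows copied from the output posted above.
sample = """\
CHROM   AUTO_START      AUTO_END        N_VARIANTS      INDV
chr6    3192774 7941907 85      M103
chr6    32714130        32714172        1       M103
chr6    32965687        33317372        9       M103
chr6    35146229        35975515        54      M103
chr6    20216335        20216336        1       M113
"""

# Placeholder thresholds; choose values appropriate to your data.
kept = filter_lroh(read_lroh(sample.splitlines()),
                   min_length_bp=500_000, min_variants=50)
for row in kept:
    print(row["CHROM"], row["AUTO_START"], row["AUTO_END"],
          row["LENGTH_BP"], row["INDV"], sep="\t")
```

Filtering on genetic length (for example 1 cM) rather than base pairs would additionally require a genetic map, which this sketch does not attempt.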

