Problem with vg autoindex with phased VCF
3 months ago

Hi, I'm trying to use vg autoindex with the human Chromosome 17 using:

  • Homo_sapiens.GRCh37.dna.chromosome.17.fa as REF (downloaded from Ensembl at 1)
  • ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz as Phased Variation File (downloaded from Ensembl at 2)

I'm using vg release v1.54.0 "Parafada" and running the command:

vg autoindex \
  --workflow giraffe \
  -r Homo_sapiens.GRCh37.dna.chromosome.17.fa \
  -v ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz \
  -p x \
  -V 2

The error I get is:

cannot setRegion on a non-tabix indexed file

I think this error is raised inside the function HaplotypeIndexer::parse_vcf.

I also tried passing the VCF's .tbi file directly, using -v file.vcf.gz.tbi, but I got this error:

[E::hts_hopen] Failed to open file ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz.tbi
[E::hts_open_format] Failed to open file "ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz.tbi" : Exec format error

Crash report for vg v1.54.0 "Parafada"
Stack trace (most recent call last) in thread 1649822:
5    Object "", at 0xffffffffffffffff, in 
4    Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7fbdafd8e352, in __clone Source "../sysdeps/unix/sysv/linux/x86_64/clone.S", line 95, in __clone [0x7fbdafd8e352]
3    Object "/usr/lib/x86_64-linux-gnu/libpthread-2.31.so", at 0x7fbdb04ea608, in start_thread Source "/build/glibc-wuryBv/glibc-2.31/nptl/pthread_create.c", line 477, in start_thread [0x7fbdb04ea608]
2    Object "/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0", at 0x7fbdafe9886d, in 
1    Object "/home/users/mirko.coggi/vg/bin/vg", at 0x55652eb5fd26, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool, bool)#4}::operator()(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool, bool) const [clone ._omp_fn.0] Source "src/index_registry.cpp", line 654, in _ZZN2vg9VGIndexes21get_vg_index_registryEvENKUlRKSt6vectorIPKNS_9IndexFileESaIS4_EEPKNS_12IndexingPlanERNS_10AliasGraphERKSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessISK_ESaISK_EEbbE2_clES8_SB_SD_SQ_bb._omp_fn.0 [0x55652eb5fd26]
0    Object "/home/users/mirko.coggi/vg/bin/vg", at 0x55652f2ec0a8, in bcf_hdr_read

I have two questions:

  1. Is there a way/workflow to autoindex these files?
  2. Is it possible to create the GBWT without the XG graph, using only the FASTA and the VCF?

Don't pass the .tbi index in place of the VCF. By convention, the .tbi file sits in the same directory as the bgzipped VCF and has the same name as the VCF with .tbi appended (i.e. path/to/vcf_file_name.vcf.gz and path/to/vcf_file_name.vcf.gz.tbi). If the index is there, HTSlib should find it automatically, so you keep passing the .vcf.gz file to -v.
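A minimal sketch of that check, assuming the VCF is bgzip-compressed and that tabix (from HTSlib) is on your PATH. The helper name ensure_tbi is made up for illustration; it is not part of vg or HTSlib. It looks for the index next to the VCF and only tries to build one with tabix -p vcf if it is missing:

```shell
#!/bin/sh
# Hypothetical helper (not part of vg or HTSlib): make sure a bgzipped VCF
# has its tabix index sitting next to it as <vcf>.tbi.
ensure_tbi() {
    vcf="$1"
    if [ -f "${vcf}.tbi" ]; then
        # Index already follows the naming convention; nothing to do.
        echo "index present: ${vcf}.tbi"
    elif command -v tabix >/dev/null 2>&1 && [ -f "$vcf" ]; then
        # Build the index; tabix writes it to <vcf>.tbi by default.
        tabix -p vcf "$vcf" && echo "index created: ${vcf}.tbi"
    else
        # No index, and either no VCF file or no tabix to build one with.
        echo "index missing: ${vcf}.tbi" >&2
        return 1
    fi
}
```

With the helper defined, you could run ensure_tbi on the .vcf.gz file first and then rerun the original vg autoindex command unchanged, still passing the .vcf.gz (not the .tbi) to -v.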


Yes, now it works, thank you.
