3.6 years ago
julianneradford
Hi there! I am trying to assess genetic diversity in a set of VCF files via the PopGenome R package. My VCF files are not annotated.
I have bgzipped and indexed (tabix) my VCF files with the iTabixit app, which compresses then indexes VCF files. I have the .gz and .tbi files in the same folder. But when I run the command that imports the VCF to R:
genome.class <- readVCF("SRR5341585/.vcf.gz",numcols=100, tid="?", from=1, to= 1000000, approx=TRUE, out="", parallel=FALSE, gffpath=FALSE)
I get the following error:
open: No such file or directory
Caught exception inside whop_tabix::open('SRR5341585/.vcf.gz'):
'whop_tabix::open : Failed to open tabix index file'
return FALSE from whoptabix_open
vcff::open : could not open tabix-index!
VCF_open : Could not open file 'SRR5341585/.vcf.gz' as tabix-indexed!
I can confirm that I am in the correct directory, so I am not sure why this error is happening.
Any ideas on what the issue might be? Thanks!
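For anyone hitting the same "Failed to open tabix index file" message, a quick check is to see how R itself resolves the path before calling readVCF. The sketch below uses only base R; `check_vcf_paths` is a hypothetical helper name (not part of PopGenome), and it assumes the index sits next to the VCF with `.tbi` appended, which is where tabix writes it by default.

```r
# Hypothetical helper (not a PopGenome function): reports whether the
# bgzipped VCF and its tabix index exist at the path readVCF will be given.
check_vcf_paths <- function(vcf_path) {
  index_path <- paste0(vcf_path, ".tbi")  # assumed index location: <vcf>.tbi
  c(vcf = file.exists(vcf_path), index = file.exists(index_path))
}

vcf_path <- "SRR5341585/.vcf.gz"  # the path used in the readVCF call above

print(getwd())                    # relative paths are resolved from here
print(dirname(vcf_path))          # the part R treats as the directory
print(basename(vcf_path))         # the part R treats as the file name
print(check_vcf_paths(vcf_path))  # both entries should be TRUE
```

If either entry comes back FALSE, the problem is the path or a missing index rather than anything inside the VCF itself.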
It is unable to find a file named '.vcf.gz' in your SRR5341585/ directory. Can you try providing the complete name of the file?

Hello! Thanks for the reply. That was the name of the file, but I realize now that the "/" in the file name may have been throwing the program off. So I renamed the file SRR5341585.vcf.gz, and the command was executed. But now I am getting this error:
The tid parameter is supposed to be a chromosomal identifier, but since my VCF files are not annotated I do not have this information. The PopGenome manual said to use a random character string, which is why I used "?". It did not seem to work the way it was supposed to.
Thanks again.
I figured out the solution to this. I have to pass an identifier to the "tid" parameter, and since I have multiple VCF files from different populations, the only identifier I can use when looking at one VCF file is the gene identifier.
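In case it helps others find a usable value for tid: the sketch below lists the distinct entries in the first (CHROM) column of a VCF, on the assumption that tid has to match one of them, which is consistent with what worked here. It uses base R only; `list_vcf_sequence_names` is a hypothetical helper name, and it reads the whole file into memory, so it is only meant for small files.

```r
# Hypothetical helper (not a PopGenome function): returns the distinct
# values found in the first (CHROM) column of a gzip/bgzip-compressed VCF.
list_vcf_sequence_names <- function(vcf_path) {
  con <- gzfile(vcf_path, open = "rt")
  on.exit(close(con))
  lines <- readLines(con)
  # Drop empty lines and header lines, which start with "#"
  records <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Keep only the text before the first tab, i.e. the CHROM field
  unique(sub("\t.*$", "", records))
}

vcf_path <- "SRR5341585.vcf.gz"
if (file.exists(vcf_path)) {
  print(list_vcf_sequence_names(vcf_path))  # candidate values for tid
}
```

On the command line, `tabix -l SRR5341585.vcf.gz` should list the same names straight from the index.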
After putting a gene ID into the tid parameter I was able to load the VCF in. However, I wish there was a way to compare diversity stats across different populations (stored in multiple VCF files). Is there a way to do this?