I have successfully queried 1000 Genomes VCF files before by following tutorials, so I thought I could apply the same approach to the COSMIC database. I can't.
I'm trying to make use of some WGS data from COSMIC. Since I don't know the file structure, I ran the following in an attempt to view the headers:
tabix -H ftp://ngs.sanger.ac.uk/production/cosmic/wgs/CosmicCodingMuts_v64_26032013_noLimit_wgs.vcf.gz
[get_local_version] downloading the index file...
[kftp_connect_file] 550 No such file.
[download_from_remote] fail to open remote file.
[tabix] failed to load the index file.
...so then I downloaded the file:
tabix -H CosmicCodingMuts_v64_26032013_noLimit_wgs.vcf.gz
[tabix] the index file either does not exist or is older than the vcf file. Please reindex.
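As I understand that error, tabix looks for a .tbi file next to the VCF and wants its timestamp to be at least as new as the VCF's. This is just a sketch of the check I've been doing by hand (the function name is something I made up, not part of tabix):

```python
import os

def index_status(vcf_path):
    """Report why tabix might complain: the .tbi index must exist
    next to the VCF and be no older than the VCF itself."""
    tbi_path = vcf_path + ".tbi"
    if not os.path.exists(tbi_path):
        return "missing"
    if os.path.getmtime(tbi_path) < os.path.getmtime(vcf_path):
        return "stale"  # this is the "Please reindex" case
    return "ok"
```

In my case there is no .tbi next to the file at all, so I assume I need to build one.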
How do I reindex a VCF file? I tried this (I don't know what I'm doing at all, sorry):
tabix -f CosmicCodingMuts_v64_26032013_noLimit_wgs.vcf.gz
[tabix] was bgzip used to compress this file? CosmicCodingMuts_v64_26032013_noLimit_wgs.vcf.gz
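That error made me wonder whether the file is plain gzip rather than bgzip. From what I've read, bgzip output is an ordinary gzip stream whose header carries a "BC" extra subfield, so I tried checking for it with this sketch (the function name is mine, and I'm assuming the "BC" subfield comes first in the extra field, as bgzip writes it):

```python
def is_bgzf(path):
    """Heuristic check: BGZF files (what tabix needs) are gzip streams
    whose first block carries a 'BC' extra subfield; plain gzip doesn't."""
    with open(path, "rb") as fh:
        header = fh.read(14)
    return bool(
        len(header) == 14
        and header[:2] == b"\x1f\x8b"  # gzip magic bytes
        and header[3] & 0x04           # FEXTRA flag set
        and header[12:14] == b"BC"     # BGZF block-size subfield id
    )
```

If this returns False for the COSMIC download, I assume I'd have to decompress it and recompress with bgzip before tabix would be willing to index it.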
...and so, in a last-ditch attempt, I tried just running vcftools (I thought this would filter out any mutations not on X):
vcftools --gzvcf CosmicCodingMuts_v64_26032013_noLimit_wgs.vcf.gz --chr X --out test
VCF index is older than VCF file. Will regenerate.
Building new index file.
Reading Index file.
File contains 641910 entries and 0 individuals.
Filtering by chromosome.
(list of chromosomes)
Skipping Remainder.
Keeping 28481 entries on specified chromosomes.
Applying Required Filters.
After filtering, kept 0 out of 0 Individuals
After filtering, kept 28481 out of a possible 28481 Sites
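For what it's worth, my mental model of the --chr X step is just "keep the header lines plus any record whose first (CHROM) column is X". A minimal sketch of that, with hypothetical file names, and assuming the COSMIC file labels the chromosome "X" rather than "23":

```python
import gzip

def keep_chromosome(in_vcf_gz, out_vcf, chrom):
    """Copy VCF header lines and only those records whose CHROM
    column equals `chrom`; return how many records were kept."""
    kept = 0
    with gzip.open(in_vcf_gz, "rt") as src, open(out_vcf, "w") as dst:
        for line in src:
            if line.startswith("#"):              # headers pass through
                dst.write(line)
            elif line.split("\t", 1)[0] == chrom:  # CHROM is column 1
                dst.write(line)
                kept += 1
    return kept
```

So the 28481 kept sites look right to me; it's the "0 individuals" part I don't understand.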
So that didn't work as expected either. I then tried running tabix again, but it still gives me the error that an index file doesn't exist.
Are there any README files or guides that I've just not been able to find? I literally have no idea what I'm doing.