Question: Extracting SNP status using QCTOOL from an impute2 output
18 months ago, nadia.lipunova wrote:

Hi everyone!

I am doing a GWAS with imputed genotypes (phased with Eagle, imputed with Impute2). The imputation output contains lines of this form:

--- rs17149628:126006965:C:T 126006965 C T
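For readers unfamiliar with the layout, here is a minimal sketch of how a line like the one above splits into fields, assuming the usual GEN convention of five leading columns (SNP ID, rsID, position, allele A, allele B) followed by three genotype probabilities per sample. The `GenRecord` and `parse_gen_line` names and the probability values in the example are made up for illustration; they are not part of QCTOOL or IMPUTE2.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GenRecord:
    """One variant row of a GEN-style file (hypothetical helper type)."""
    snp_id: str
    rsid: str
    position: int
    allele_a: str
    allele_b: str
    probs: List[Tuple[float, ...]]  # one (P(AA), P(AB), P(BB)) triple per sample


def parse_gen_line(line: str) -> GenRecord:
    """Split a GEN line on whitespace only, so ':' stays inside the rsID."""
    fields = line.split()
    if len(fields) < 5 or (len(fields) - 5) % 3 != 0:
        raise ValueError("expected 5 leading columns plus 3 values per sample")
    snp_id, rsid, position, allele_a, allele_b = fields[:5]
    values = [float(x) for x in fields[5:]]
    probs = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return GenRecord(snp_id, rsid, int(position), allele_a, allele_b, probs)


# The probabilities for the two samples below are invented for the example.
record = parse_gen_line(
    "--- rs17149628:126006965:C:T 126006965 C T 1 0 0 0.1 0.8 0.1"
)
print(record.snp_id, record.rsid, record.position, len(record.probs))
```

Split this way, the second field is the whole string rs17149628:126006965:C:T, and the first field is the placeholder "---".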

After imputation, an analysis using SNPTEST went absolutely fine, without producing errors.

However, I am now trying to extract the genotype status of certain SNPs for all individuals using QCTOOL, and there seems to be an issue when I specify the rsID (the second field, i.e. rs17149628:126006965:C:T): no output is produced, although no errors are listed. My guess is that QCTOOL reads ":" as a field separator and is therefore unable to treat the field as a single SNP identifier. Would anyone know how to overcome the problem? The whole log is as follows:

 qctool -g data.impute2 -s sample.txt -incl-rsids snp.txt -og extract.txt
     (C) 2009-2011 University of Oxford
    Opening genotype files                       :
    Input SAMPLE file(s):           "sample.txt" Output
    SAMPLE file:             "(n/a)". Sample statistic output file:  
    "(n/a)". Sample exclusion output file:   "(n/a)".
    Input GEN file(s):
   (not computed)  "snp-id-data-filtered:data.impute2"  (total 1 sources, number of snps not computed).
     Number of samples: 653 Output GEN file(s):  "extract.txt" Output SNP position file(s):    (n/a) SNP statistic
    output file(s):  Sample filter:                  (none). SNP filter:

    # of samples in input files:    653.
    # of samples after filtering:   653 (0 filtered out).
    Processing SNPs                              :  (1/?,21.2s,0.0/s)

    Number of SNPs:
                         -- in input file(s):                 (not computed).  -- in output file(s):                1

    Number of samples in input file(s):   653.

    Output GEN files:                     (1      snps) 
                                          (total 1 snps).
    Thank you for using qctool.

So, all seems fine, but extract.txt is empty. I am becoming desperate here. Would anyone know what is happening, why, and how to overcome it? I'll be thankful for any comments! Nad
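One way to narrow this down independently of QCTOOL is to check whether the requested identifiers occur verbatim in the second column of the genotype file. The sketch below is only an illustration under that assumption: `read_id_list` and `count_id_matches` are hypothetical helper names, and the second genotype line and the ID rs999 are invented test data.

```python
from typing import Dict, Iterable, Set


def read_id_list(lines: Iterable[str]) -> Set[str]:
    """Collect whitespace-separated IDs, one or several per line."""
    ids: Set[str] = set()
    for line in lines:
        ids.update(line.split())
    return ids


def count_id_matches(gen_lines: Iterable[str], wanted: Set[str]) -> Dict[str, int]:
    """Count how many GEN rows carry each wanted ID in the second column."""
    counts = {snp: 0 for snp in wanted}
    for line in gen_lines:
        fields = line.split(None, 2)  # only the first two columns are needed
        if len(fields) >= 2 and fields[1] in counts:
            counts[fields[1]] += 1
    return counts


# In-memory stand-ins for data.impute2 and snp.txt; with real files one
# would pass open file handles instead of these lists.
gen_lines = [
    "--- rs17149628:126006965:C:T 126006965 C T 1 0 0",
    "rs123 rs123 126007000 A G 0 1 0",
]
wanted = read_id_list(["rs17149628:126006965:C:T\n", "rs999\n"])
counts = count_id_matches(gen_lines, wanted)
for snp in sorted(counts):
    print(snp, counts[snp])
```

A count of 0 would point at the ID list, while a count of 1 or more would suggest the identifier is present and the problem lies in how the tool filters or writes it.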

Tags: snptest, qctool, snp, impute2, gwas
18 months ago, pfs wrote:

I am not entirely sure what you want to do, but do you need the -s argument? It may also help to rename data.impute2 so that the filename includes the file type extension. Does this work? qctool -g data.impute2(.gen) -incl-rsids snp.txt -og sample.txt


That's true, it runs without the sample file as well. However, the same thing happens (also with the .gen extension): the log shows that all went well and that the output GEN file has one SNP, but the file is actually empty.

(reply written 18 months ago by nadia.lipunova)

So there could be an issue with your gen file or with the snp.txt file. Check that there are no stray spaces in snp.txt. Try running qctool without subsetting particular SNPs (qctool -g example_#.gen -snp-stats). If the log then reports a number of SNPs, your issue is with the snp.txt file. If there is still no SNP count, try manually changing 'rs17149628:126006965:C:T' to just rs17149628 and re-running. You may run into issues because the rsID is not always a unique identifier in gen files. If this does not fix your problem, you may need to insert unique SNP values into the first column of each row. Here is the file format they want (link missing). If I recall correctly, there are typically three additional files associated with the gen file (.haps/.sample/.metrics), and these may need to be in the same directory. That is all I have; I am not a qctool user.

(reply written 18 months ago by pfs)
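Two of the checks suggested in the reply above, stray whitespace in the SNP list and non-unique rsIDs in the gen file, can be done outside QCTOOL. The following is a rough sketch only; `find_suspicious_ids` and `duplicate_rsids` are hypothetical helper names, and the sample inputs are invented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def find_suspicious_ids(raw_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Flag list entries with leading/trailing whitespace or hidden characters.

    Returns (line_number, entry) pairs; a trailing carriage return from
    Windows line endings is reported as well.
    """
    problems = []
    for number, raw in enumerate(raw_lines, start=1):
        entry = raw.rstrip("\n")
        if entry != entry.strip() or not entry.isprintable():
            problems.append((number, entry))
    return problems


def duplicate_rsids(gen_lines: Iterable[str]) -> Dict[str, int]:
    """Return rsIDs (second column) that occur on more than one GEN row."""
    seen = Counter()
    for line in gen_lines:
        fields = line.split(None, 2)
        if len(fields) >= 2:
            seen[fields[1]] += 1
    return {rsid: n for rsid, n in seen.items() if n > 1}


# Invented inputs: a trailing space on line 2, a carriage return on line 3.
for number, entry in find_suspicious_ids(["rs1\n", "rs2 \n", "rs3\r\n"]):
    print(number, repr(entry))

print(duplicate_rsids([
    "--- rs1 100 A G 1 0 0",
    "--- rs1 100 A T 0 1 0",
    "--- rs2 200 C T 0 0 1",
]))
```

Printing entries with `repr` makes otherwise invisible characters such as a trailing space or `\r` visible.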

The problem persists after trying all of the following: a) giving just the rs number instead of the whole field (i.e. rs17149628 instead of rs17149628:126006965:C:T); b) requesting the SNP by its position (i.e. -incl-positions 126006965); c) formatting the SNP list both as one ID per line and as a single line with whitespace as the delimiter.

Whenever a genotyped SNP (e.g. rs17149628 C T) is requested, QCTOOL runs smoothly and recognises all fields correctly. I am pretty sure it's the ":" interfering with the interpretation, so I will probably turn to Linux data management to solve the issue. Nevertheless, thank you!

(reply written 18 months ago by nadia.lipunova)
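As one possible form of the file-level workaround mentioned in the last reply, the rows can be pulled out by exact match on the second column and turned into a per-sample value without going through QCTOOL. This is a sketch under assumptions, not a tested replacement for the tool: `extract_variants` and `allele_b_dosages` are hypothetical names, the genotype probabilities are invented, and the expected allele B dosage P(AB) + 2·P(BB) is just one common way to summarise each sample's status.

```python
from typing import Iterable, Iterator, List, Set


def extract_variants(gen_lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield GEN rows whose second column exactly matches a wanted ID."""
    for line in gen_lines:
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[1] in wanted:
            yield line


def allele_b_dosages(line: str) -> List[float]:
    """Expected count of allele B per sample: P(AB) + 2 * P(BB)."""
    values = [float(x) for x in line.split()[5:]]
    return [values[i + 1] + 2 * values[i + 2]
            for i in range(0, len(values) - 2, 3)]


# Invented data: four samples for the first variant, one for the second.
gen_lines = [
    "--- rs17149628:126006965:C:T 126006965 C T 1 0 0 0 1 0 0 0 1 0.5 0.5 0",
    "--- rs123:126007000:A:G 126007000 A G 0 1 0",
]
wanted = {"rs17149628:126006965:C:T"}

for row in extract_variants(gen_lines, wanted):
    rsid = row.split(None, 2)[1]
    print(rsid, allele_b_dosages(row))
```

With real data, `gen_lines` would be an open handle on the imputed file, and the matching rows could be written to a new file so that downstream tools receive the same format.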