Not enough SNPs after filtering an imputed dataset
4.6 years ago
Max ▴ 20

Hi,

I am using qctool to filter variants from an imputed dataset (from UK Biobank). The imputed data contain 93095623 autosomal SNPs, and my list of rsIDs contains 1877 autosomal SNPs. Yet the output contains only 822 SNPs. Would you know why that is? (Apologies if this is very obvious, as I'm quite new to genotype data.)

Many thanks, Max

Command:

qctool -g /path/to/inputgenotype_chr#.bgen -og /path/to/outputgenotype.gen -os /path/to/outputgenotype.sample \
  -incl-rsids /path/to/snpfile.txt
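
Before looking at the output, it can help to confirm that the rsID list itself is what it is expected to be. The sketch below is only an illustration, not part of qctool: the function name `summarize_rsid_list` is hypothetical, and it assumes the list file holds whitespace-separated identifiers. It reports the unique IDs, any IDs listed more than once, and any entries that do not start with "rs".

```python
from collections import Counter


def summarize_rsid_list(lines):
    """Summarize an rsID list given as an iterable of text lines.

    Returns (unique_ids, duplicated_ids, non_rs_ids), each as a sorted list.
    Assumption: identifiers are separated by whitespace or newlines.
    """
    ids = []
    for raw in lines:
        # split() with no arguments also discards stray '\r' from Windows line endings
        ids.extend(raw.split())

    counts = Counter(ids)
    unique_ids = sorted(counts)
    duplicated_ids = sorted(i for i, n in counts.items() if n > 1)
    non_rs_ids = sorted(i for i in counts if not i.startswith("rs"))
    return unique_ids, duplicated_ids, non_rs_ids
```

For example, `summarize_rsid_list(open("/path/to/snpfile.txt"))` would show whether the 1877 entries are all distinct and all look like rsIDs.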

Outcome:

Welcome to qctool
(version: 2.0.6, revision afa3689)

(C) 2009-2017 University of Oxford

Opening genotype files                                      : [******************************] (22/22,-0.3s,-75.5/s)
========================================================================

Input SAMPLE file(s):         Output SAMPLE file:             "/path/to/outputgenotype.sample".
Sample exclusion output file:   "(n/a)".

Input GEN file(s):
                                                   (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr1.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr2.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr3.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr4.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr5.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr6.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr7.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr8.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr9.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr10.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr11.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr12.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr13.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr14.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr15.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr16.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr17.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr18.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr19.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr20.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr21.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr22.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (total 22 sources, number of snps not computed).
                     Number of samples: 487409

Output GEN file(s):             "/path/to/outputgenotype.gen"
Output SNP position file(s):    (n/a)
Sample filter:                  .
SNP filter:                     RSID in { set of 1877 }.

# of samples in input files:    487409.
# of samples after filtering:   487409 (0 filtered out).

========================================================================

Processing SNPs                                             :  (1881/?,2768.5s,0.7/s)
Total: 1881SNPs.
========================================================================

Number of SNPs:
                    -- in input file(s):                 (not computed).
                    -- in output file(s):                822

Number of samples in input file(s):   487409.

Output GEN files:                     (822    snps)  "/path/to/outputgenotype.gen"
                                     (total 822 snps).
Output SAMPLE files:                  "/path/to/outputgenotype.sample"  (487409 samples)
========================================================================


Thank you for using qctool.
4.6 years ago
Max ▴ 20

I tried saving the output as .bgen instead, and that seems to have done the trick: I now have 1881 output SNPs (I don't know why it is 1881 rather than 1877).
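
One way to look into a count mismatch like this is to compare the requested rsIDs with the rsIDs that actually appear in a text GEN output. The sketch below is a rough illustration under stated assumptions: the helper names `rsids_in_gen` and `compare_rsids` are hypothetical, and the rsID is assumed to sit in the second whitespace-separated column (index 1). If the file has a leading chromosome column, `rsid_column` would need to change. A .bgen output would first have to be converted or listed by another tool. If some rsIDs turn up on more than one record, that could account for an output count larger than the list, though that has not been verified for this dataset.

```python
from collections import Counter


def rsids_in_gen(lines, rsid_column=1):
    """Yield the rsID field from each GEN line.

    Only the leading fields are split off, so very wide lines
    (three probabilities per sample) are not fully tokenized.
    Assumption: rsid_column is the zero-based index of the rsID column.
    """
    for line in lines:
        if not line.strip():
            continue
        fields = line.split(None, rsid_column + 1)
        yield fields[rsid_column]


def compare_rsids(requested, observed):
    """Return (missing, repeated).

    missing:  requested rsIDs that never appear in the output (sorted list)
    repeated: dict of rsID -> number of records, for rsIDs seen more than once
    """
    counts = Counter(observed)
    missing = sorted(set(requested) - set(counts))
    repeated = {rsid: n for rsid, n in counts.items() if n > 1}
    return missing, repeated
```

With the requested IDs loaded into a list, `compare_rsids(requested, rsids_in_gen(open("/path/to/outputgenotype.gen")))` would list which rsIDs are absent and which occur on several records.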
