Question: Not enough SNPs after filtering an imputed dataset
Asked by Max, 3 months ago:

Hi,

I am using qctool to filter variants from an imputed dataset (UK Biobank). The imputed data contain 93095623 autosomal SNPs, and my list of rsIDs contains 1877 autosomal SNPs, yet the output contains only 822 SNPs. Would you know why that is? (Apologies if this is very obvious; I'm quite new to genotype data.)

Many thanks, Max

Command:

qctool -g /path/to/inputgenotype_chr#.bgen -og /path/to/outputgenotype.gen -os /path/to/outputgenotype.sample \
  -incl-rsids /path/to/snpfile.txt

Outcome:

Welcome to qctool
(version: 2.0.6, revision afa3689)

(C) 2009-2017 University of Oxford

Opening genotype files                                      : [******************************] (22/22,-0.3s,-75.5/s)
========================================================================

Input SAMPLE file(s):         Output SAMPLE file:             "/path/to/outputgenotype.sample".
Sample exclusion output file:   "(n/a)".

Input GEN file(s):
                                                   (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr1.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr2.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr3.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr4.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr5.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr6.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr7.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr8.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr9.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr10.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr11.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr12.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr13.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr14.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr15.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr16.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr17.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr18.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr19.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr20.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr21.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (not computed)  "snp-id-data-filtered:/path/to/inputgenotype_chr22.bgen (bgen v1.2; 487409 unnamed samples; zlib compression)"
                                        (total 22 sources, number of snps not computed).
                     Number of samples: 487409

Output GEN file(s):             "/path/to/outputgenotype.gen"
Output SNP position file(s):    (n/a)
Sample filter:                  .
SNP filter:                     RSID in { set of 1877 }.

# of samples in input files:    487409.
# of samples after filtering:   487409 (0 filtered out).

========================================================================

Processing SNPs                                             :  (1881/?,2768.5s,0.7/s)
Total: 1881SNPs.
========================================================================

Number of SNPs:
                    -- in input file(s):                 (not computed).
-- in output file(s):                822

Number of samples in input file(s):   487409.

Output GEN files:                     (822    snps)  "/path/to/outputgenotype.gen"
                                     (total 822 snps).
Output SAMPLE files:                  "/path/to/outputgenotype.sample"  (487409 samples)
========================================================================


Thank you for using qctool.
Reply from Max, 3 months ago:

I tried writing the output as .bgen instead of .gen, and that seems to have done the trick: I now have 1881 output SNPs (I don't know why it is 1881 rather than 1877).
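
One way to investigate the 1877 vs 1881 discrepancy is to compare the requested rsIDs with the rsIDs that actually appear in the output. The log above reports 1881 variants processed against a filter set of 1877, which would be consistent with a few rsIDs matching more than one variant (for example multi-allelic sites), but that is an assumption and is not confirmed anywhere in this thread. The Python sketch below is a minimal illustration, not part of qctool: the function names are hypothetical, the rsID list is assumed to hold one ID per line, and the output is assumed to be a whitespace-delimited text file with the rsID in the third column (chromosome, SNP ID, rsID, ...). Set `rsid_column=1` if the file has no chromosome column.

```python
from collections import Counter
from typing import Iterable


def read_rsid_list(lines: Iterable[str]) -> list[str]:
    """Return the non-empty, whitespace-stripped IDs from a one-ID-per-line list."""
    return [line.strip() for line in lines if line.strip()]


def read_output_rsids(lines: Iterable[str], rsid_column: int = 2) -> list[str]:
    """Extract the rsID field from each variant line of a GEN-style text file.

    rsid_column is 0-based and is an assumption about the file layout:
    2 fits "chromosome, SNP ID, rsID, ..."; use 1 without a chromosome column.
    """
    rsids = []
    for line in lines:
        # Split off only the leading fields; the remainder of the line holds
        # the per-sample genotype probabilities and can be very long.
        fields = line.split(None, rsid_column + 1)
        if len(fields) > rsid_column:
            rsids.append(fields[rsid_column])
    return rsids


def compare(requested: list[str], found: list[str]) -> dict:
    """Report requested IDs that are absent, IDs seen more than once, and extras."""
    counts = Counter(found)
    wanted = set(requested)
    return {
        "missing": sorted(wanted - set(counts)),
        "repeated": {rsid: n for rsid, n in counts.items() if n > 1},
        "unexpected": sorted(set(counts) - wanted),
    }


# Tiny in-memory example with made-up variants (not real UK Biobank data).
requested = read_rsid_list(["rs1\n", "rs2\n", "\n", "rs3\n"])
gen_lines = [
    "01 1:100_A_G rs1 100 A G 1 0 0\n",
    "01 1:200_C_T rs2 200 C T 0 1 0\n",
    "01 1:200_C_A rs2 200 C A 0 0 1\n",
]
report = compare(requested, read_output_rsids(gen_lines))
print(report)
```

For real data, pass open file handles instead of the in-memory lists, e.g. `open("/path/to/snpfile.txt")` and `open("/path/to/outputgenotype.gen")`. This only works on a text output; a .bgen output would first need its variant identifiers exported to text. Any entries under "repeated" would account for an output count above 1877, and any under "missing" are requested IDs that never matched.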
