qctool fails to merge two bgen files with no clear reason
2.2 years ago
kpatil ▴ 50

Hi,

I am trying to merge two bgen files using qctool as explained here. I am using qctool_v2.2.0.

The command starts and reads both inputs, but then terminates with an error:

❱ qctool -g bug/in2.bgen -s bug/in2.sample -merge-in bug/in1.bgen bug/in1.sample -og bla.bgen -os bla.sample

Welcome to qctool
(version: 2.2.0, revision: unknown)

(C) 2009-2020 University of Oxford

Opening genotype files                                      : [******************************] (1/1,1.0s,1.0/s)
========================================================================

Input SAMPLE file(s):           "bug/in2.sample"
Output SAMPLE file:             "bla.sample".
Sample exclusion output file:   "(n/a)".

Input GEN file(s):
                                             Spec: merge:chain:bug/in2.bgen (bgen v1.2; 487409 named samples; zlib compression),chain:bug/in1.bgen (bgen  v1.2; 487409 named samples; zlib compression)
                     Number of samples: 487409
                        Number of SNPs: 24
Output GEN file(s):             "bla.bgen"
Output SNP position file(s):    (n/a)
Sample filter:                  .
# of samples in input files:    487409.
# of samples after filtering:   487409 (0 filtered out).

========================================================================

Processing SNPs                                             :  (0/?,0.0s,0.0/s)terminate called after throwing an instance of 'genfile::bgen::BGenError'
  what():  BGenError

The two sample files are exactly the same:

diff bug/in1.sample bug/in2.sample 

Shows no differences.

I used bgen-reader to check the files and there seems to be nothing really off.

In [20]: (np.array(bgen2.samples) == np.array(bgen.samples)).all()
Out[20]: True

In [21]: bgen.rsids
Out[21]: memmap(['rs524513'], dtype='<U8')

In [22]: bgen2.rsids
Out[22]:
memmap(['rs213026', 'rs12026171', 'rs4090391', 'rs3181077', 'rs2517455',
        'rs3129932', 'rs9271117', 'rs9274477', 'rs2859090', 'rs3117221',
        'rs3923809', 'rs9444828', 'rs473267', 'rs6993992', 'rs16911668',
        'rs10995245', 'rs7122887', 'rs12322530', 'rs1154153', 'rs1154155',
        'rs1263647', 'rs12148472', 'rs3825932'], dtype='<U10')

Probably the only "odd" thing is that one of the files contains a single SNP.
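One more check that could rule out a clash between the two files is to look for repeated or shared variant IDs. The sketch below does that with the rsids printed above, copied by hand into plain lists; the names `ids_in1`, `ids_in2` and `duplicated` are only illustrative, and whether qctool would actually object to duplicate IDs is an assumption, not something confirmed here.

```python
from collections import Counter

# rsids copied from the bgen-reader output above (in1 has 1 SNP, in2 has 23).
ids_in1 = ["rs524513"]
ids_in2 = [
    "rs213026", "rs12026171", "rs4090391", "rs3181077", "rs2517455",
    "rs3129932", "rs9271117", "rs9274477", "rs2859090", "rs3117221",
    "rs3923809", "rs9444828", "rs473267", "rs6993992", "rs16911668",
    "rs10995245", "rs7122887", "rs12322530", "rs1154153", "rs1154155",
    "rs1263647", "rs12148472", "rs3825932",
]


def duplicated(ids):
    """Return the IDs that occur more than once in a list."""
    return sorted(k for k, count in Counter(ids).items() if count > 1)


# IDs present in both files.
shared = sorted(set(ids_in1) & set(ids_in2))

print("duplicates in in1:", duplicated(ids_in1))
print("duplicates in in2:", duplicated(ids_in2))
print("shared between files:", shared)
```

For the IDs listed in this post all three lists come out empty, so a clash of rsids does not look like the cause.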

I would really appreciate any hints to get this working.

Many thanks!

EDIT

The problem seems to be the "larger" file with 23 SNPs, which was created by merging several bgen files (9, to be exact) one by one.
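Since the suspect file is the product of repeated merges, it may be worth comparing what its header claims with what is expected. The following is a minimal sketch that decodes the fixed header fields directly, assuming the file follows the published BGEN layout (a 4-byte offset, then header length, variant count, sample count, magic bytes, optional free data, and a 4-byte flags word). The function name `parse_bgen_header` and the synthetic bytes in `demo` are made up for illustration; nothing here was run on the real files.

```python
import struct


def parse_bgen_header(buf: bytes) -> dict:
    """Decode the fixed fields at the start of a BGEN file (little-endian)."""
    # Bytes 0-3: offset, counted from byte 4, to the first variant block.
    (offset,) = struct.unpack_from("<I", buf, 0)
    # Header block starts at byte 4: length, number of variants, number of samples.
    header_len, n_variants, n_samples = struct.unpack_from("<III", buf, 4)
    magic = buf[16:20]
    # The flags word is the last 4 bytes of the header block,
    # i.e. it starts at absolute byte position header_len.
    (flags,) = struct.unpack_from("<I", buf, header_len)
    return {
        "offset": offset,
        "header_length": header_len,
        "n_variants": n_variants,
        "n_samples": n_samples,
        # The spec allows either b"bgen" or four zero bytes here.
        "magic_ok": magic in (b"bgen", b"\x00\x00\x00\x00"),
        "compression": {0: "none", 1: "zlib", 2: "zstd"}.get(flags & 0b11, "unknown"),
        "layout": (flags >> 2) & 0b1111,
        "has_sample_ids": bool(flags >> 31),
    }


# Synthetic 24-byte header mimicking in2.bgen as reported above:
# 23 variants, 487409 samples, zlib compression, layout 2, sample IDs present.
demo = struct.pack(
    "<IIII4sI", 20, 20, 23, 487409, b"bgen", (1 << 31) | (2 << 2) | 1
)
info = parse_bgen_header(demo)
print(info)

# On a real file, something like this should be enough as long as the
# free data area in the header is small:
#   with open("bug/in2.bgen", "rb") as fh:
#       print(parse_bgen_header(fh.read(4096)))
```

If the variant count in the header of the 23-SNP file disagrees with what bgen-reader reports, or the flags differ from those of the single-SNP file, that would point at one of the intermediate merges rather than at the final one.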

EDIT2

The merge works using cat-bgen, so the issue seems to be with qctool?

❱ cat-bgen -g bug/in1.bgen bug/in2.bgen -og bla.bgen

Welcome to cat-bgen

(C) 2009-2017 University of Oxford

Adding file "bug/in1.bgen" (1 of 2, 1 variants)...
Adding file "bug/in2.bgen" (2 of 2, 23 variants)...
Finished writing "bla.bgen" (487409 samples, 24 variants).

Thank you for using cat-bgen.
qctool bgen • 835 views