Get matrix, barcodes and features from H5 file
2.6 years ago
jonathanpa12 ▴ 10

Hello everyone

I'm trying to extract the matrix, barcodes and features from an H5 file, following the Python code from https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/h5_matrices:

import collections
import scipy.sparse as sp_sparse
import tables

CountMatrix = collections.namedtuple('CountMatrix', ['feature_ref', 'barcodes', 'matrix'])

def get_matrix_from_h5(filename):
    with tables.open_file(filename, 'r') as f:
        mat_group = f.get_node(f.root, 'matrix')
        barcodes = f.get_node(mat_group, 'barcodes').read()
        data = getattr(mat_group, 'data').read()
        indices = getattr(mat_group, 'indices').read()
        indptr = getattr(mat_group, 'indptr').read()
        shape = getattr(mat_group, 'shape').read()
        matrix = sp_sparse.csc_matrix((data, indices, indptr), shape=shape)

        feature_ref = {}
        feature_group = f.get_node(mat_group, 'features')
        feature_ids = getattr(feature_group, 'id').read()
        feature_names = getattr(feature_group, 'name').read()
        feature_types = getattr(feature_group, 'feature_type').read()
        feature_ref['id'] = feature_ids
        feature_ref['name'] = feature_names
        feature_ref['feature_type'] = feature_types
        tag_keys = getattr(feature_group, '_all_tag_keys').read()
        for key in tag_keys:
            feature_ref[key] = getattr(feature_group, key).read()

        return CountMatrix(feature_ref, barcodes, matrix)

filtered_matrix_h5 = "GSM4785601_P6_2.filtered_feature_bc_matrix.h5"
filtered_feature_bc_matrix = get_matrix_from_h5(filtered_matrix_h5)

However, I'm getting the error below. Can someone help me, please?

Traceback (most recent call last):
  File "/home/jonathan/extracth5.py", line 32, in <module>
    filtered_feature_bc_matrix = get_matrix_from_h5(filtered_matrix_h5)
  File "/home/jonathan/extracth5.py", line 27, in get_matrix_from_h5
    feature_ref[key] = getattr(feature_group, key).read()
TypeError: getattr(): attribute name must be string

Thank you in advance

H5 matrix • 1.1k views

I am not entirely sure about this, but can you try str(key) on line 27 of your code, i.e. feature_ref[key] = getattr(feature_group, key).read()?

NOTE: I am suggesting this purely based on the error reported in your traceback.


I tried it, but it didn't work.
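A possible explanation, offered as a sketch rather than a confirmed diagnosis: if the entries read from '_all_tag_keys' come back as byte strings (an assumption here, although it is consistent with the TypeError in the traceback), then str(key) would produce the text "b'genome'" rather than "genome", which is not a valid attribute name either. Decoding the bytes gives a plain string instead. The example below uses a hypothetical stand-in class (FakeFeatureGroup) and made-up values in place of the real PyTables group, so it runs without an H5 file.

```python
class FakeFeatureGroup:
    """Hypothetical stand-in for the PyTables 'features' group (not the real API)."""
    genome = ["GRCh38", "GRCh38"]  # made-up values for illustration


def decode_key(key):
    """Return a plain str attribute name for a bytes or str key."""
    return key.decode("utf-8") if isinstance(key, bytes) else str(key)


def collect_tags(feature_group, tag_keys):
    """Look up each tag key on feature_group using a decoded str name."""
    feature_ref = {}
    for key in tag_keys:
        name = decode_key(key)
        feature_ref[name] = getattr(feature_group, name)
    return feature_ref


# Assumed shape of the '_all_tag_keys' contents: a sequence of byte strings.
tag_keys = [b"genome"]

# str() on bytes keeps the b'' wrapper, so it is not a usable attribute name.
wrapped = str(tag_keys[0])

feature_ref = collect_tags(FakeFeatureGroup, tag_keys)
print(wrapped)
print(feature_ref)
```

Under the same assumption, the corresponding change on line 27 of the posted script would be to pass key.decode('utf-8') to getattr (keeping the .read() call) instead of key or str(key).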
