How to import filtered_tf_bc_matrix and add it to my anndata object?
4 months ago
bioinfo ▴ 140

Hello,

I am analyzing some ATAC samples and would like to add motif information to my objects. So far I have imported my fragment files and clustered the cells. I would like to add the filtered_tf_bc_matrix so that I can do differential analysis based on the motifs, but I am having trouble figuring out how to do that. I tried to import the motif matrix as shown below:

# tf-bc matrix
import csv
import gzip
import os

import pandas as pd
import scipy.io

matrix_dir = "filtered_tf_bc_matrix"
# sparse matrix with motifs as rows and barcodes as columns
mat = scipy.io.mmread(os.path.join(matrix_dir, "matrix.mtx.gz"))

# read motif IDs (first column) and motif names (second column) in one pass
motifs_path = os.path.join(matrix_dir, "motifs.tsv")
with open(motifs_path) as f:
    motif_rows = list(csv.reader(f, delimiter="\t"))
motif_ids = [row[0] for row in motif_rows]
motif_names = [row[1] for row in motif_rows]

# read cell barcodes
barcodes_path = os.path.join(matrix_dir, "barcodes.tsv.gz")
with gzip.open(barcodes_path, mode="rt") as f:
    barcodes = [row[0] for row in csv.reader(f, delimiter="\t")]

# transform table to pandas dataframe and label rows and columns
matrix = pd.DataFrame.sparse.from_spmatrix(mat)
matrix.columns = barcodes
matrix.insert(loc=0, column="motif_names", value=motif_names)
matrix.insert(loc=0, column="motif_ids", value=motif_ids)

# display matrix
print(matrix)
# save the table as a CSV (note the CSV will be a very large file)
matrix.to_csv("mex_matrix.csv", index=False)

I then tried to create the AnnData object with adatamotif = ad.AnnData(matrix), but the resulting object does not seem correct.

Is there another way I can add the motif information to my anndata or anndataset object?

The anndataset object I have looks like this:

AnnDataSet object with n_obs x n_vars = 20028 x 526765 backed at 'All_samples.h5ads'
contains 7 AnnData objects with keys: '1_fragments.tsv.gz', '2_fragments.tsv.gz', '3_fragments.tsv.gz', '4_fragments.tsv.gz'
    obs: 'sample', 'leiden'
    var: 'count', 'selected'
    uns: 'reference_sequences', 'AnnDataSet', 'spectral_eigenvalue'
    obsm: 'X_umap', 'X_spectral'
    obsp: 'distances'

The individual AnnData objects I had look like this:

[AnnData object with n_obs x n_vars = 8057 x 0 backed at '1_fragments.tsv.gz.h5ad'
     obs: 'n_fragment', 'frac_dup', 'frac_mito'
     uns: 'reference_sequences'
     obsm: 'fragment_paired',
 AnnData object with n_obs x n_vars = 3804 x 0 backed at '2_fragments.tsv.gz.h5ad'
     obs: 'n_fragment', 'frac_dup', 'frac_mito'
     uns: 'reference_sequences'
     obsm: 'fragment_paired',
 AnnData object with n_obs x n_vars = 811 x 0 backed at '3_fragments.tsv.gz.h5ad'
     obs: 'n_fragment', 'frac_dup', 'frac_mito'
     uns: 'reference_sequences'
     obsm: 'fragment_paired',
 AnnData object with n_obs x n_vars = 2368 x 0 backed at '4_fragments.tsv.gz.h5ad'
     obs: 'n_fragment', 'frac_dup', 'frac_mito'
     uns: 'reference_sequences'
     obsm: 'fragment_paired']

Thank you

anndata python atac anndataset • 276 views