Hello,
I have single-cell and bulk RNA-seq data, and I have performed some basic analysis on both. For the bulk RNA-seq data I ran DESeq2 and obtained a list of DE genes. I would like to make a UMAP where the cells are colored by the average expression of the bulk signature genes, but I am having trouble doing it. I am working with scanpy.
I have done the following so far:
bulk_de_genes_list = bulk_de_genes['Gene'].tolist()
# Filter the genes
adata2 = adata[:, adata.var_names.isin(bulk_de_genes_list)]
This seems to have worked, but I am having issues with the next part. I am not sure whether the average expression of the bulk signature genes should be computed with average_expression = adata2.X.mean(axis=0) or cell_averages = adata2.X.mean(axis=1), so I have tried two things:
First:
average_expression = adata2.X.mean(axis=0)
# Divide the average expression into bins
bins = np.histogram(average_expression, bins='fd')[1]
# Assign a color to each bin
cmap = plt.get_cmap('viridis')
colors = cmap(np.digitize(average_expression, bins) / len(bins))
# Run UMAP
sc.pp.neighbors(adata2, n_neighbors=10)
sc.tl.umap(adata2)
fig, ax = plt.subplots()
sc.pl.umap(adata2, color=colors, cmap='viridis')
plt.show()
This gives me the errors below:
TypeError: unhashable type: 'numpy.ndarray'
ValueError: Image size of 1932x155200 pixels is too large. It must be less than 2^16 in each direction
Second attempt:
# Calculate the average expression across the signature genes for each cell
# (np.asarray(...).ravel() gives a 1-D array whether X is dense or sparse)
cell_averages = np.asarray(adata2.X.mean(axis=1)).ravel()
# Add the per-cell average of the bulk signature genes as a new column
# in adata2.obs
adata2.obs['bulk_de_gene_average'] = cell_averages
# Plot the UMAP and color the cells based on the average expression of the bulk
# signature genes
sc.pl.umap(adata2, color='bulk_de_gene_average', cmap='viridis')
This produces the UMAP, but I am not sure if it is correct. Thank you for the help.
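One detail worth checking in the second attempt is the shape of the per-cell mean. The sketch below assumes that adata2.X is stored as a SciPy sparse matrix, which is common for scanpy objects but is not stated in the post; the toy matrix and the names X_sparse, raw_mean and flat_mean are made up for illustration. Under that assumption, .mean(axis=1) returns a two-dimensional column rather than a flat vector, and flattening it gives one plain value per cell.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy data: 2 cells (rows) x 4 signature genes (columns),
# stored as a SciPy CSR matrix (an assumption about how X is stored).
X_sparse = sparse.csr_matrix(np.array([
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 4.0, 4.0],
]))

# On a sparse matrix, mean(axis=1) returns a 2-D column (one row per cell).
raw_mean = X_sparse.mean(axis=1)
print(raw_mean.shape)   # (2, 1)

# Flatten to a 1-D array with one value per cell; this also works
# unchanged when the input is a dense NumPy array.
flat_mean = np.asarray(raw_mean).ravel()
print(flat_mean.shape)  # (2,)
```

The flattened array has the same length as the number of cells, which is the shape a per-cell column needs.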
Edit: It seems that the second way is correct.
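A small NumPy sketch of why axis=1 is the per-cell choice, assuming the usual AnnData layout of cells in rows and genes in columns. The matrix and the names X, per_gene and per_cell are invented for illustration and are not taken from the real data.

```python
import numpy as np

# Toy expression matrix: 3 cells (rows) x 4 signature genes (columns),
# mirroring the cells-by-genes layout of adata.X.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 2.0, 2.0],
    [5.0, 5.0, 5.0, 5.0],
])

# axis=0 collapses the cells: one average per gene (length = n_genes).
per_gene = X.mean(axis=0)

# axis=1 collapses the genes: one average per cell (length = n_cells).
per_cell = X.mean(axis=1)

print(per_gene.shape)  # (4,)
print(per_cell.shape)  # (3,)
```

Only per_cell has one value per cell (here 2.5, 1.0 and 5.0), so it is the one that can be stored in .obs and used to color points on a UMAP. The axis=0 result has one value per gene, so it cannot be mapped onto cells.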
Thank you. I updated the question because I am trying to find a way to do this in scanpy and I find manipulating the object a bit confusing.