Entering edit mode
4 months ago
1769mkc
★
1.3k
## PBMC data from signac package
counts <- Read10X_h5("pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5")
# Inspect
print(counts)
Genome matrix has multiple modalities, returning a list of matrices for this genome
$`Gene Expression`
36601 x 11909 sparse Matrix of class "dgCMatrix"
[[ suppressing 32 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]
[[ suppressing 32 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]
MIR1302-2HG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
$Peaks
108377 x 11909 sparse Matrix of class "dgCMatrix"
[[ suppressing 31 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]
[[ suppressing 31 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]
chr1:10109-10357 . . . . . . . . . . . . . . . . . . . . . . . . . .
chr1:180730-181630 . . . . . . . . . . . . . . . . . . . . . . . . . .
## Same data in rds format
seurat_obj <- CreateSeuratObject(counts = counts, project = "PBMC_Granulocyte")
saveRDS(counts, file = "genepbmc_granulocyte_counts.rds")
counts_rds <- readRDS("genepbmc_granulocyte_counts.rds")
### create the combined object
pbmc <- CreateSeuratObject(
counts = counts_rds$`Gene Expression`,
assay = "RNA"
)
if ("Peaks" %in% names(counts_rds)) {
pbmc[["ATAC"]] <- CreateChromatinAssay(
counts = counts_rds$Peaks,
fragments = "pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz",
sep = c(":", "-")
)
}
Assays(pbmc)
'Computing hash'
'RNA''ATAC'
Assays(pbmc)
DefaultAssay(pbmc)
'RNA''ATAC'
'RNA'
## The above process was repeated for the cellxgene data
## Cellxgene_data : snRNA-seq data from eight focal cortical dysplasia donors
fragpath <- "c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz"
rna_seurat <- readRDS("e16e7bb1-6611-46f8-812c-10f3b54ed167.rds")
# check object
An object of class Seurat
36406 features across 61525 samples within 1 assay
Active assay: RNA (36406 features, 0 variable features)
2 layers present: counts, data
4 dimensional reductions calculated: X_integrated.rna.harmony, X_pca, X_umap.joint, X_umap.rna.harmony
## Layers inside the cellxgene is only expression
Assays(rna_seurat)
DefaultAssay(rna_seurat)
'RNA'
'RNA'
## The difference in the assay slots are these : in case of PBMC that is already part of the .h5 file which contains both the layers
So when we acess the data layers we do see RNA and ATAC , since for ATAC layer in case of PBMC we already have the
counts = counts_rds$Peaks
Now the actual cellxgene data doesn't have the structure as we can see from the assay
rna_seurat@assays$RNA
For integration we need to check this part again since the ATAC layer not added
I don't find any associated .h5 file which was given in the paper to generate the data for ATAC as shown in the signac documentation, for ATAC
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_fragments.tsv.gz
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_fragments.tsv.gz.tbi
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_peaks.bed
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_singlecell.csv
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_filtered_peak_bc_matrix.h5
What we get from cellxgene is these
https://datasets.cellxgene.cziscience.com/c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz
https://datasets.cellxgene.cziscience.com/c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz.tbi
https://datasets.cellxgene.cziscience.com/e16e7bb1-6611-46f8-812c-10f3b54ed167.h5ad
So is there a way we can add the ATAC layer from the fragment data that is available with cellxgene?
The actual paper
Would you mind formatting your wall of code with markdown? thank you.
Sorry for that, it is done by our moderator. I would keep that in mind
You can call peaks again from the fragments file using Macs2 like they did in the paper. Then you can add your
peaks x cells
matrix as a layer in your Seurat object (ATAC).Naive question So that is the only information I would need for mac2 input?
yes, you can try this
thank you for the resources, will try and update