Single cell expression and accessibility integration using cellxgene data
0
0
Entering edit mode
4 months ago
1769mkc ★ 1.3k
## PBMC data from signac package

counts <- Read10X_h5("pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5")

# Inspect 
print(counts)

Genome matrix has multiple modalities, returning a list of matrices for this genome

$`Gene Expression`
36601 x 11909 sparse Matrix of class "dgCMatrix"
  [[ suppressing 32 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]

  [[ suppressing 32 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]


MIR1302-2HG   .  .  . .  . .  .  . . . . . .  .  .  .  .  .  .  .  . . .  .  .  .  .  .  . .


$Peaks
108377 x 11909 sparse Matrix of class "dgCMatrix"
  [[ suppressing 31 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]

  [[ suppressing 31 column names 'AAACAGCCAAGGAATC-1', 'AAACAGCCAATCCCTT-1', 'AAACAGCCAATGCGCT-1' ... ]]


chr1:10109-10357        . .  . . . .  .  .  .  .  .  . . .  .  .  .  .  . .  . .  .  .  .  .
chr1:180730-181630      . .  . . . .  .  .  .  .  .  . . .  .  .  .  .  . .  . .  .  .  .  .


## Same data in rds format 

seurat_obj <- CreateSeuratObject(counts = counts, project = "PBMC_Granulocyte")

saveRDS(counts, file = "genepbmc_granulocyte_counts.rds")

counts_rds <- readRDS("genepbmc_granulocyte_counts.rds")

### create the combined object

pbmc <- CreateSeuratObject(
  counts = counts_rds$`Gene Expression`, 
  assay = "RNA"
)

if ("Peaks" %in% names(counts_rds)) {
  pbmc[["ATAC"]] <- CreateChromatinAssay(
    counts = counts_rds$Peaks,
    fragments = "pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", 
    sep = c(":", "-")
  )
}

Assays(pbmc)  
'Computing hash'

'RNA''ATAC'

Assays(pbmc)
DefaultAssay(pbmc)

'RNA''ATAC'
'RNA'


## The above process was repeated for the cellxgene data 

## Cellxgene_data : snRNA-seq data from eight focal cortical dysplasia donors

fragpath <- "c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz"
rna_seurat <- readRDS("e16e7bb1-6611-46f8-812c-10f3b54ed167.rds")

# check object

An object of class Seurat 
36406 features across 61525 samples within 1 assay 
Active assay: RNA (36406 features, 0 variable features)
2 layers present: counts, data
4 dimensional reductions calculated: X_integrated.rna.harmony, X_pca, X_umap.joint, X_umap.rna.harmony

## Layers inside the cellxgene is only expression 

Assays(rna_seurat)
DefaultAssay(rna_seurat)
'RNA'
'RNA'

## The difference in the assay slots are these : in case of PBMC that is already part of the .h5 file which contains both the layers

So when we acess the data layers we do see RNA and ATAC , since for ATAC layer in case of PBMC we already have the

counts = counts_rds$Peaks

Now the actual cellxgene data doesn't have the structure as we can see from the assay

rna_seurat@assays$RNA

For integration we need to check this part again since the ATAC layer not added

I don't find any associated .h5 file which was given in the paper to generate the data for ATAC as shown in the signac documentation, for ATAC

wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_fragments.tsv.gz
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_fragments.tsv.gz.tbi
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_peaks.bed
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_singlecell.csv
wget https://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_pbmc_500_nextgem/atac_pbmc_500_nextgem_filtered_peak_bc_matrix.h5

What we get from cellxgene is these

https://datasets.cellxgene.cziscience.com/c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz
https://datasets.cellxgene.cziscience.com/c64a1617-c83f-4bc4-a3ce-f63255f12fa8-fragment.tsv.bgz.tbi
https://datasets.cellxgene.cziscience.com/e16e7bb1-6611-46f8-812c-10f3b54ed167.h5ad

So is there a way we can add the ATAC layer from the fragment data that is available with cellxgene?

The actual paper

cellxgene • 913 views
ADD COMMENT
2
Entering edit mode

Would you mind formatting your wall of code with markdown? thank you.

ADD REPLY
0
Entering edit mode

Sorry for that, it is done by our moderator. I would keep that in mind

ADD REPLY
0
Entering edit mode

You can call peaks again from the fragments file using Macs2 like they did in the paper. Then you can add your peaks x cells matrix as a layer in your Seurat object (ATAC).

ADD REPLY
0
Entering edit mode

Naive question So that is the only information I would need for mac2 input?

ADD REPLY
1
Entering edit mode

yes, you can try this

ADD REPLY
0
Entering edit mode

thank you for the resources, will try and update

ADD REPLY

Login before adding your answer.

Traffic: 3379 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6