Question: TCR analysis in Suerat
0
gravatar for roberts
27 days ago by
roberts10
roberts10 wrote:

Hello, I have successfully add my VDJ data to my GEX data by adding it as meta data. I would now like to distinguish between CD8 and CD4 TCR. I know by using WhichCell, I can pull out cells that express CD8 (T8 <- WhichCells(object, expression= CD8A >1)). I would then like to take that T8 data and make a .csv file with the clonotypes, aa sequence, nt sequence, etc. How would I go about doing that. Would I have to add it as metadata again? If so how would I go about that? Thank you!

seurat rna-seq tcr R 10x • 147 views
ADD COMMENTlink modified 27 days ago by jared.andrews077.5k • written 27 days ago by roberts10
1
gravatar for jared.andrews07
27 days ago by
jared.andrews077.5k
Memphis, TN
jared.andrews077.5k wrote:

Just save the metadata for your subset as a file then.

T8.meta <- T8[[]]
write.csv(T8.meta, file = "T8_meta.csv")
ADD COMMENTlink written 27 days ago by jared.andrews077.5k

Thank you for your help it is greatly appreciated. I still have some questions though. It seems that the command T8 <- WhichCells(object, expression= CD8A >1) only saves barcodes. Is there a way to also get it to save TCR data (mostly clonotypes, aa sequence, and nt sequence). Also what is the function T8[[]] doing? Thank you again for your help!

ADD REPLYlink written 27 days ago by roberts10

T8 <- subset(object, subset = CD8A > 1) will subset your Seurat object as you want.

T8[[]] just returns the metadata as a dataframe.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Thank you! I ran your code and the error messages went away. The only thing is that in the new CSV file under clonotype_id it says none for every barcode. Also is there a way to add in the aa sequence and nt sequence that I got from the cellranger output from the VDJ library prep. Thank you for taking your time to help me greatly appreciated!

ADD REPLYlink written 26 days ago by roberts10

It sounds like you haven't actually added the clonotype data to your metadata then.

If you have already added it to your metadata, you can run the commands in my original answer to save the full metadata dataframe. It will have everything you added to metadata. Or you can select multiple columns in your T8[[]] call, e.g. T8.meta <- [[c("clonotype_id", "clonotype_aa", "clonotype_nt"]] if you don't want the extra columns.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Hello I have seemed to include the clonotype_id, aa sequence, and nt sequence. My isse now is that when I go to create a .csv I get this error "Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘structure("Seurat", package = "Seurat")’ to a data.frame". I cant seem to figure out a way to make a .csv?

ADD REPLYlink written 26 days ago by roberts10

It seems you are somehow trying to write the Seurat object to file rather than the metadata dateframe. Need the code used to have any shot of helping you.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Yes of course. Here is the full script-

AD007.data <- Read10X(data.dir = "/Users//Desktop/AD007")
AD007S <- CreateSeuratObject(counts=AD007.data, Project="CSF",min.cells = 2,
                             min.features = 200)


### adding VDJ data
AD007S_clonotype<- read.csv("/Volumes/Active/cellrangeroutput/AD007vdj/outs/filtered_contig_annotations.csv")
AD007S_clonotype<- AD007S_clonotype[,c("barcode","raw_clonotype_id")]
AD007S_clonotype <- AD007S_clonotype[!duplicated(AD007S_clonotype$barcode),]
rownames(AD007S_clonotype)<- AD007S_clonotype[,1]
AD007S$clonotype_id<-names(AD007S$clonotype_id)
names(AD007S_clonotype)[names(AD007S_clonotype) == "raw_clonotype_id"] <- "clonotype_id"
AD007S$clonotype_id<- "none"
View(AD007S_clonotype)
AD007S <- AddMetaData(object = AD007S, metadata = T8)
AD007S$clonotype_id
n_distinct(AD007S$clonotype_id)
AD007S$clonotype_id <- as.factor(AD007S$clonotype_id)
str(AD007S@meta.data$clonotype_id)

### processesing/normalization
AD007S[["percent.mt"]] <- PercentageFeatureSet(AD007S, pattern = "^MT-")
AD007S <-subset(AD007, subset= nFeature_RNA>200 & nFeature_RNA <6000 & nCount_RNA >100 & nCount_RNA < 20000 & percent.mt <10)

AD007S <- NormalizeData(AD007)
AD007S <- FindVariableFeatures(AD007S, selection.method = "vst", nfeatures = 2000)
top10 <-head(VariableFeatures(AD007S),10)
all.genes <- rownames(AD007S)
AD007S <- ScaleData(AD007S, features = all.genes)
unwanted_genes <- grep(pattern = "^HLA*|^IGHV*|^IGHJ*|^IGHD*|^IGKV*|^IGLV*|^TRBV*|^TRBD*|^TRBJ*|^TRDV*|^TRDD*|^TRDJ*|^TRAV*|^TRAJ*|^TRGV*|^TRGJ*", x = AD007S@assays$RNA@var.features, value = T)
remove_genes <- AD007S@assays$RNA@var.features %in% unwanted_genes
AD007S@assays$RNA@var.features = AD007S@assays$RNA@var.features[!remove_genes]

AD007S <- RunPCA(AD007, features = VariableFeatures(object = AD007S))
AD007S <- JackStraw(AD007S, num.replicate = 100)
AD007S <- ScoreJackStraw(AD007S, dims = 1:20)
ElbowPlot(AD007S)
AD007S <- FindNeighbors(AD007S, dims = 1:16)
AD007S <- FindClusters(AD007S, resolution = 0.5)
AD007S <- RunUMAP(AD007S, dims = 1:16)
DimPlot(AD007S, reduction = "umap")
AD007S.markers <- FindAllMarkers(AD007S, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
AD007S.markers %>% group_by(cluster) %>% top_n(n = 2, wt = avg_logFC)
FeaturePlot(AD007S, features = c("CD3G","CD3D", "CD3E", "CD4", "CD8A", "TNF", "IL22", "PTPRC", "CD45RABC",
                                "AIF1", "TLR7", "CD19", "GZMB", "TLR9"))
top10 <- AD007S.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)
DoHeatmap(AD007S, features = top10$gene) + NoLegend()


### pulling out CD8 cells
T8 <- WhichCells(object= AD007S, expression= CD8A >1)
T8 <- subset(object, subset = CD8A > 1)
write.csv(T8.meta, file = "T8_meta.csv")

--

I also just realized that the aa sequence and the nt sequence is not at all in my data opps! Thank you again for taking time to help me!

ADD REPLYlink modified 26 days ago by jared.andrews077.5k • written 26 days ago by roberts10

Please use the code formatting button (the one with 1's and 0's) in the future. I have done it for you this time.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Why did you think your last few lines would work? You never define the T8.meta object, and you were not meant to just use object in your subset call, but your Seurat object, the name of which I had no way of knowing. You also no longer need the WhichCells call since you're using subset instead.

So your last 3 lines should be:

T8 <- subset(object = AD0075, subset = CD8A > 1)
T8.meta <- AD0075[[]]
write.csv(T8.meta, file = "T8_meta.csv")

In the future, please place all relevant code in your opening post so folks don't have to continually ask for more information.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Hello, Thank you very much for all your help, you got it to work! Also sorry for all the confusion/hassle if you couldn't tell I am new to this and you've been a great help!

ADD REPLYlink written 26 days ago by roberts10

Glad to hear it. If this answers your question, consider accepting the answer by clicking the checkmark by the answer.

ADD REPLYlink written 26 days ago by jared.andrews077.5k

Hello again, when I plug in say T8.meta <- T8[["clonotype_id"]] i get this error message "subscript out of bounds" and that seems to be the same for everything I plug into the brackets

ADD REPLYlink written 27 days ago by roberts10

This occurs because T8 is just a vector of cell barcodes rather than a subset of your Seurat object. Running the code in my previous comment before doing this will fix it.

ADD REPLYlink written 26 days ago by jared.andrews077.5k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1757 users visited in the last hour