Question about scRNA-seq analysis

5.3 years ago

tujuchuanli ▴ 130

Hi,

I am interested in the scRNA-seq data from a previous study (http://www.ncbi.nlm.nih.gov/pubmed/31067475). The authors deposited the dataset in GEO under the accession “GSE134520”. Since they didn't provide the cellranger output, I re-ran the cellranger pipeline and did the subsequent processing, largely following the code provided by the authors (http://bioinfo.au.tsinghua.edu.cn/member/pzhang/scstomach.r). I have a few questions:

1) In the paper, the authors state that they collected 55,440 cells in total and kept 32,332 cells after filtering. However, I got more than 90,000 cells in total when reading the cellranger output under the “filtered_feature_bc_matrix” folder. Why do I get roughly twice as many cells? Furthermore, when I applied the filtering step, only about 22,000 cells passed.

2) In the code (lines 41 and 42), the authors wanted to remove batch effects. They subset the dataset to exclude the sample “CAN1” using the “SubsetData” function and ran the subsequent analysis without adding it back. Why did they exclude sample “CAN1”, and why was it not added back?

Thanks in advance.

scRNA-seq • 1.3k views

5.3 years ago by tujuchuanli ▴ 130

I think many of these questions would be more readily answered by the paper's authors.

1) Even small differences between your pipeline and theirs can produce this kind of discrepancy. It could be that there were 90k cell barcodes in total but only 55k of them were "real" cells. If the authors' list of "good" cell barcodes is available, I would compare it with yours and check whether most of the ~22k barcodes that passed your filtering are among theirs.

2) They likely noticed something in their data that led to the exclusion of that subset. If it is not mentioned in their methods then the best way would be to ask the authors directly.
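
The barcode comparison suggested in point 1 could be sketched roughly as follows in Python. This is only an illustration under some assumptions: both barcode lists are available as text files with one barcode per line (as in the barcodes.tsv.gz file inside a cellranger filtered_feature_bc_matrix folder), the function names are made up for this example, and a trailing "-1" style suffix is stripped so that both lists use the same form. Barcodes are only unique within a sample, so the comparison should be run one sample at a time.

```python
import gzip
import re
from pathlib import Path


def read_barcodes(path):
    """Read one barcode per line from a plain or gzipped text file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def normalize(barcode):
    """Drop a trailing numeric suffix such as '-1' (assumed to be the GEM-well tag)."""
    return re.sub(r"-\d+$", "", barcode)


def compare_barcodes(mine, published):
    """Summarize the overlap between two barcode sets from the same sample."""
    mine = {normalize(b) for b in mine}
    published = {normalize(b) for b in published}
    shared = mine & published
    return {
        "mine": len(mine),
        "published": len(published),
        "shared": len(shared),
        "frac_of_mine": len(shared) / len(mine) if mine else 0.0,
        "frac_of_published": len(shared) / len(published) if published else 0.0,
    }


# Toy barcodes for illustration only; real input would come from read_barcodes(),
# e.g. read_barcodes("<sample>/filtered_feature_bc_matrix/barcodes.tsv.gz")
# (hypothetical path).
mine = {"AAACCTGAGAAACCAT-1", "AAACCTGAGAAACGCC-1", "AAACCTGAGAAAGTGG-1"}
published = {"AAACCTGAGAAACCAT", "AAACCTGAGAAAGTGG", "TTTGTCATCTTTCCTC"}

summary = compare_barcodes(mine, published)
print(summary)
```

If most of the published barcodes for a sample appear in your filtered set, the extra cells are probably coming from differences in cell calling or filtering thresholds rather than from different input data.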

5.3 years ago by benformatics 4.1k

Thanks, benformatics

I have actually written to the authors but have not received a response yet. However, if they had noticed something and excluded that sample from the subsequent analysis, the paper should report 12 samples rather than the 13 it mentions.

5.3 years ago by tujuchuanli ▴ 130