Comparing datasets from two different library methods
4.0 years ago
bazok ▴ 20

Hi all, I need your input on how best to approach an RNA-seq analysis I am currently working on, as I couldn't find any closely related post. I have 5 datasets to compare (4 with UMI counts and 1 with FPKM). I am taking the z-score of each dataset separately before passing them on to Seurat.

My questions are:

  • Is this the right direction, or is there a better way?
  • If it is the right approach, is any normalization or log transformation needed before merging, and which normalization approach would be best? More generally, how best can one preprocess the datasets to get valuable insight from the analysis?
  • Is it possible to convert UMI counts to FPKM and then follow the Seurat Multiple Dataset Integration guide for the comparison?

Thanks

rna-seq R next-gen • 1.2k views

"I am taking the z-score of all the dataset separately before passing on to Seurat"

Why not use the recommended workflow? Seurat is designed to work with UMI and FPKM data, not z-scores.


Thanks, Igor. Since the datasets are not all in the same units, I thought taking the z-score first would provide a basis for comparison (integration).


In the default workflow, Seurat will perform its own scaling.
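
To illustrate what that scaling step amounts to, here is a minimal sketch in plain Python (not Seurat's own code). It z-scores one gene across cells and caps large values. The function name `scale_gene` and the example values are hypothetical, and the cap of 10 and the use of the sample standard deviation are assumptions based on my understanding of Seurat's ScaleData defaults.

```python
import statistics

def scale_gene(values, scale_max=10.0):
    """Center one gene's values across cells to mean 0, scale to SD 1,
    and cap large positive values at scale_max (assumed default: 10)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        # A gene with no variation carries no signal; return zeros
        return [0.0 for _ in values]
    return [min((v - mean) / sd, scale_max) for v in values]

# Hypothetical log-normalized expression of one gene in four cells
gene = [0.0, 1.0, 2.0, 3.0]
scaled = scale_gene(gene)
```

Because this happens per gene after normalization, z-scoring each dataset yourself beforehand should not be needed.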


Thanks a lot, Igor. I looked into how Seurat does this and I think it is what I need. For the analysis (4 datasets in UMI counts and 1 in FPKM), I proceeded as follows:

  • Read in the data and created Seurat objects.
  • Normalized the 4 datasets with UMI counts. My understanding is that Seurat first normalizes for sequencing depth and then takes the natural log (ln) of the data. For this, I used scale.factor = 1000000.
  • Took the natural log of the FPKM dataset, i.e. log(FPKM + 1).
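
The arithmetic behind the two normalization steps above can be sketched roughly as follows, in plain Python rather than Seurat itself. The function names and example values are hypothetical. The sketch assumes that "LogNormalize" divides each count by the cell's total, multiplies by the scale factor, and applies ln(1 + x).

```python
import math

def log_normalize(counts, scale_factor=1_000_000):
    """Depth-normalize one cell's UMI counts, then take ln(1 + x).
    counts holds one entry per gene for a single cell."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]

def log_fpkm(fpkm):
    """FPKM is already adjusted for depth, so only ln(1 + x) is applied."""
    return [math.log1p(v) for v in fpkm]

# Hypothetical values for one cell (UMI data) and one sample (FPKM data)
cell_counts = [0, 3, 7, 90]
sample_fpkm = [0.0, 12.5, 40.0, 300.0]

umi_log = log_normalize(cell_counts)
fpkm_log = log_fpkm(sample_fpkm)
```

With a scale factor of 1,000,000 the UMI values are on a counts-per-million scale before the log, which puts them closer to the FPKM values than the default scale factor would. The two units are still not strictly equivalent.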

With the above, I started the usual data integration steps: FindVariableFeatures, FindIntegrationAnchors (I used "LogNormalize" vs "SCT" as normalization.method), IntegrateData, ScaleData, RunPCA, etc.

Does this seem like the right approach for comparing my datasets, which are in different units?

Thanks a lot.


SCTransform is for UMI data.

The rest seems fine.
