Working with Monocle 3 using series_matrix files?
2.3 years ago
Pratik ▴ 850

Hi,

I am running Ubuntu 20.04 LTS, currently on a slower MacBook Air. To work faster, I recently ordered this server: HP Proliant DL360p G8 8 Bays 2.5 Server - 2X Intel Xeon E5-2680 2.7GHz 8 Core - 16GB DDR3 REG Memory - HP P420i 512MB Raid Controller - 2.4TB (4X 600GB 10K SAS SED New HDD) - 2X 750w PSU (Renewed).

I'm just starting out with Monocle 3. Eventually I want to be able to use all the available tools efficiently, but I am starting with Monocle 3 because it offers pseudotime trajectory analysis.

I want to recreate a finding from the original Monocle 3 paper, "The single-cell transcriptional landscape of mammalian organogenesis", specifically the "Resolving cellular trajectories in myogenesis" figure.

I was going to begin with the fastq files, which, after some real struggle, I figured out how to download in bulk using the 'awk' command.

However, I was told by a mentor that working with expression matrix files would make my life easier.

So my first question is: on the NCBI GEO accession page, is the "series_matrix.txt.gz" file the same thing as the expression matrix file?

In the 'loading the data' step of Getting started with Monocle 3 on the Monocle 3 page:

# Load the data
expression_matrix <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_expression.rds"))
cell_metadata <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_colData.rds"))
gene_annotation <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_rowData.rds"))

Would I first download this series matrix file:

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/matrix/GSE119945_series_matrix.txt.gz

Then would the cell_annotate file be cell_metadata?

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/suppl/GSE119945%5Fcell%5Fannotate%2Ecsv%2Egz

And lastly (this one is more obvious, I think), the gene_annotate file would be gene_annotation:

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/suppl/GSE119945%5Fgene%5Fannotate%2Ecsv%2Egz

So I would download these files with wget, then extract them with

gzip -d filename

and then feed their file paths into the following?

expression_matrix <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_expression.rds"))
cell_metadata <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_colData.rds"))
gene_annotation <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_rowData.rds"))

and then I would continue the steps of getting started on the Monocle 3 page.
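
To make the download-and-extract step concrete, this is roughly what I have in mind. It is only a sketch: `local_name` and `fetch_all` are names I made up, the URLs are the three GEO links from this post, and I am assuming `wget`, `gzip` and `sed` are installed. I have not confirmed that these are the right files to feed to Monocle 3.

```shell
#!/bin/sh
# Sketch only: nothing is downloaded until fetch_all is called.

# Series directory on the GEO server, taken from the links in this post.
base="https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945"

# One URL per line.
urls="$base/matrix/GSE119945_series_matrix.txt.gz
$base/suppl/GSE119945%5Fcell%5Fannotate%2Ecsv%2Egz
$base/suppl/GSE119945%5Fgene%5Fannotate%2Ecsv%2Egz"

# local_name (hypothetical helper): drop the directory part of a URL and
# decode the only two escapes these links use (%5F is "_", %2E is ".").
local_name() {
  printf '%s\n' "${1##*/}" | sed -e 's/%5F/_/g' -e 's/%2E/./g'
}

# fetch_all (hypothetical helper): download each file under its decoded
# name, then decompress it in place (gzip -d removes the .gz suffix).
fetch_all() {
  for url in $urls; do
    name=$(local_name "$url")
    wget -O "$name" "$url" && gzip -d "$name"
  done
}
```

Calling `fetch_all` in an empty directory should then leave the three uncompressed files there (for example GSE119945_cell_annotate.csv), and those are the paths I would try to read into R.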

Could someone please share what you do when getting started analyzing data with Monocle 3 that is not 10x Genomics data?

Very Respectfully, Pratik

RNA-Seq rna-seq • 621 views