Issues importing STARsolo's output into Seurat
9 weeks ago
MYousry ▴ 20

Hello everyone,

I have issues importing the filtered matrix files of STARsolo output to use with Seurat.

I have tried multiple ways like:

Drosophila.data <- ReadMtx(mtx ="~/genome/matrix/matrix.mtx", cells="~/genome/matrix/barcodes.tsv", features="~/genome/matrix/features.tsv")

and

Drosophila.data <- ReadSTARsolo(data.dir ="~/genome/matrix/")

But both give me the following error:

Error: Matrix has 13968 rows but found 12507 features. 
In addition: Warning messages:
1: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  EOF within quoted string
2: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  number of items read is not a multiple of the number of columns

The features file looks ok to me as it has exactly 13968 rows.

I am not sure how to tackle this problem and even tried rerunning STARsolo but that did not help. Any help would be appreciated!

Seurat STARsolo • 472 views
9 weeks ago

Hard to tell exactly what's causing the problem without the data in hand, but I would first try setting feature.column=1 in ReadMtx(). The second column of features.tsv holds gene names, which can contain duplicates, and I'm not sure how Seurat's import function handles them.
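To see whether duplicated names are actually in play, you could count repeats in the second column before importing. A rough sketch, with made-up gene symbols standing in for column 2 of features.tsv; make.unique() is base R and shows one way such clashes can be resolved (I am not claiming this is what Seurat does internally):

```r
# Made-up gene symbols standing in for column 2 of features.tsv.
gene_symbols <- c("geneA", "geneB", "geneA", "geneC")

# Which symbols occur more than once?
dup_symbols <- unique(gene_symbols[duplicated(gene_symbols)])

# make.unique() appends .1, .2, ... to the repeated entries.
unique_symbols <- make.unique(gene_symbols)
```

If dup_symbols comes back empty on the real file, duplicate names are probably not the culprit.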

If that doesn't work, you might want to try loading the data manually to see if it's even possible:

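library("Matrix")
library("readr")
library("Seurat")  # provides CreateSeuratObject()

# Read the sparse count matrix (features x cells).
mtx <- readMM("~/genome/matrix/matrix.mtx")
# Column 1 of features.tsv holds the gene IDs; barcodes.tsv has one barcode per line.
rownames(mtx) <- read_tsv("~/genome/matrix/features.tsv", col_names=FALSE)[, 1, drop=TRUE]
colnames(mtx) <- read_tsv("~/genome/matrix/barcodes.tsv", col_names=FALSE)[, 1, drop=TRUE]

seu <- CreateSeuratObject(counts=mtx)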
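
The "EOF within quoted string" warning hints at one plausible cause, though I can't confirm it without inspecting the file. Base R's read.table() treats both " and ' as quote characters by default, so a gene name containing an apostrophe can swallow the lines that follow it, which would leave fewer features than matrix rows. readr's read_tsv() only treats " as a quote by default. A minimal sketch of reading a features file with quoting switched off, using a made-up three-line file (the IDs and gene names are hypothetical):

```r
# Hypothetical features file; the apostrophe in the second gene name is the
# suspected troublemaker (all IDs and names are made up for illustration).
features_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "FBgn0000001\tgeneA\tGene Expression",
  "FBgn0000002\tgene'B\tGene Expression",
  "FBgn0000003\tgeneC\tGene Expression"
), features_path)

# quote = "" makes read.table() treat quote characters as literal text;
# comment.char = "" does the same for '#'.
features <- read.table(features_path, header = FALSE, sep = "\t",
                       quote = "", comment.char = "",
                       stringsAsFactors = FALSE)

# Every line should now come back as its own row.
n_lines <- length(readLines(features_path))
stopifnot(nrow(features) == n_lines)
```

On the real file, grep("'", readLines("~/genome/matrix/features.tsv"), fixed = TRUE) would list the lines that contain an apostrophe, if there are any.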

I tried setting feature.column=1 before but unfortunately it didn't work. I will try loading the data manually as you suggested. Here is the data: https://drive.google.com/drive/folders/1xeWXlqRsHsx8ayn6wF-9GHV1N8bzvJMT?usp=sharing Thank you so much!


Hi again, I tried the second solution and here is what I got. It seems to work, but I am a beginner and not sure, so please let me know if it looks OK:

> library(Seurat)
Attaching SeuratObject
Attaching sp
> library(patchwork)
> library("Matrix")
> library("readr")

> mtx <- readMM("~/genome/matrix/matrix.mtx")

> rownames(mtx) <- read_tsv("~/genome/matrix/features.tsv", col_names=FALSE)[, 1, drop=TRUE]
Rows: 13968 Columns: 3                                                                                             
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (3): X1, X2, X3

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

> colnames(mtx) <- read_tsv("~/genome/matrix/barcodes.tsv", col_names=FALSE)[, 1, drop=TRUE]

Rows: 9042 Columns: 1                                                                                              
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): X1

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

> seu <- CreateSeuratObject(counts=mtx)

> seu

An object of class Seurat 
13968 features across 9042 samples within 1 assay 
Active assay: RNA (13968 features, 0 variable features)

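It does look like the second method worked: the object has 13968 features across 9042 cells, which matches the row counts readr reported for features.tsv and barcodes.tsv. The Seurat object should be fine for downstream analysis.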
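
If you'd rather have the import fail loudly than trust a printout, a couple of assertions before attaching the names will do it. This is only a sketch on a toy 3 x 2 sparse matrix; the values, IDs and barcodes are made up and stand in for your mtx, features.tsv and barcodes.tsv:

```r
library(Matrix)

# Toy stand-ins for the STARsolo output (values and names are made up).
mtx <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 2), x = c(5, 1, 2),
                    dims = c(3, 2))
feature_ids <- c("FBgn0000001", "FBgn0000002", "FBgn0000003")
barcodes <- c("AAACCTGA", "AAACGGGT")

# The matrix dimensions must agree with the annotation files
# before the names are attached.
stopifnot(nrow(mtx) == length(feature_ids),
          ncol(mtx) == length(barcodes))

rownames(mtx) <- feature_ids
colnames(mtx) <- barcodes

# Per-cell count totals are a quick way to spot empty barcodes.
umi_per_cell <- colSums(mtx)
```

A barcode whose total is zero would be worth a second look before moving on to QC.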


Thank you so much!
