Issues importing STARsolo's output into Seurat
9 weeks ago
MYousry ▴ 20

Hello everyone,

I have issues importing the filtered matrix files of STARsolo output to use with Seurat.

I have tried multiple ways like:

Drosophila.data <- ReadMtx(mtx ="~/genome/matrix/matrix.mtx", cells="~/genome/matrix/barcodes.tsv", features="~/genome/matrix/features.tsv")


and

Drosophila.data <- ReadSTARsolo(data.dir = "~/genome/matrix/")


But both give me the following error:

Error: Matrix has 13968 rows but found 12507 features.
1: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
EOF within quoted string
2: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
number of items read is not a multiple of the number of columns


The features file looks ok to me as it has exactly 13968 rows.

I am not sure how to tackle this problem and even tried rerunning STARsolo but that did not help. Any help would be appreciated!

Seurat STARsolo • 472 views
9 weeks ago

It's hard to tell exactly what's causing the problem without having the data in hand, but I would first try setting feature.column=1. The second column can contain duplicate gene names, and I'm not sure how Seurat's import function handles them.

If that doesn't work, you might want to try loading the data manually to see whether the files can be read at all.

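library("Matrix")  # readMM()
library("readr")   # read_tsv()
library("Seurat")  # CreateSeuratObject()

# Read the sparse count matrix (features x barcodes) and convert it to column-compressed form
mtx <- as(readMM("~/genome/matrix/matrix.mtx"), "CsparseMatrix")

# Use the first column of each file as row (feature ID) and column (barcode) names
rownames(mtx) <- read_tsv("~/genome/matrix/features.tsv", col_names=FALSE)[, 1, drop=TRUE]
colnames(mtx) <- read_tsv("~/genome/matrix/barcodes.tsv", col_names=FALSE)[, 1, drop=TRUE]

seu <- CreateSeuratObject(counts=mtx)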
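
One possible explanation for the original error, offered as an assumption since it is not confirmed from the data: "EOF within quoted string" is the warning base R's read.table() gives when it hits an unbalanced quote character, and its default quote set includes the apostrophe. If some gene names in features.tsv contain an apostrophe, several lines can be swallowed into one quoted field, so fewer features are found than the matrix has rows. A minimal sketch on a toy file (the IDs and gene names are made up) that flags such lines and reads the file with quoting disabled:

```r
# Toy features file; the IDs and gene names are hypothetical.
toy <- c(
  "FBgn0000001\tgeneA'\tGene Expression",
  "FBgn0000002\tgeneB\tGene Expression",
  "FBgn0000003\tgeneC'\tGene Expression",
  "FBgn0000004\tgeneD\tGene Expression"
)
path <- tempfile(fileext = ".tsv")
writeLines(toy, path)

# Flag lines that contain a single or double quote character.
raw_lines <- readLines(path)
suspect <- grep("['\"]", raw_lines)
print(suspect)

# Default quoting: apostrophes may be treated as quote delimiters,
# so this can return fewer rows than the file has lines (or fail).
with_quotes <- tryCatch(
  suppressWarnings(read.table(path, header = FALSE, sep = "\t")),
  error = function(e) NULL
)
print(if (is.null(with_quotes)) NA else nrow(with_quotes))

# Quoting disabled: every line becomes exactly one row.
no_quotes <- read.table(path, header = FALSE, sep = "\t",
                        quote = "", comment.char = "",
                        stringsAsFactors = FALSE)
print(nrow(no_quotes))
```

Running the same grep() on the real features.tsv would show whether quote characters are present. If they are, that would also explain why read_tsv(), which does not treat the apostrophe as a quote by default, reads all 13968 rows.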


I tried setting feature.column=1 before but unfortunately it didn't work. I will try loading the data manually as you suggested. Here is the data: https://drive.google.com/drive/folders/1xeWXlqRsHsx8ayn6wF-9GHV1N8bzvJMT?usp=sharing Thank you so much!


Hi again, I tried the second solution and here is what I got. It seems to work, but I am a beginner and not sure, so please let me know if it looks OK:

> library(Seurat)
Attaching SeuratObject
Attaching sp
> library(patchwork)
> library("Matrix")

> rownames(mtx) <- read_tsv("~/genome/matrix/features.tsv", col_names=FALSE)[, 1, drop=TRUE]
Rows: 13968 Columns: 3
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (3): X1, X2, X3

ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.

> colnames(mtx) <- read_tsv("~/genome/matrix/barcodes.tsv", col_names=FALSE)[, 1, drop=TRUE]

Rows: 9042 Columns: 1
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): X1

ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.

> seu <- CreateSeuratObject(counts=mtx)

> seu

An object of class Seurat
13968 features across 9042 samples within 1 assay
Active assay: RNA (13968 features, 0 variable features)


It does look like the second method worked. The Seurat object should be fine for downstream analysis.
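
For extra reassurance, the matrix dimensions can be compared against the line counts of the two TSV files before building the object. A minimal sketch, assuming only the Matrix package and using a made-up three-gene, two-cell dataset written to a temporary directory (the file names mirror the STARsolo output, the contents are hypothetical):

```r
library(Matrix)

# Write a tiny toy dataset in the same three-file layout.
toy_dir <- tempfile("toy_solo_")
dir.create(toy_dir)
toy <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(5, 1, 2),
                    dims = c(3, 2))
writeMM(toy, file.path(toy_dir, "matrix.mtx"))
writeLines(c("g1\tA\tGene Expression",
             "g2\tB\tGene Expression",
             "g3\tC\tGene Expression"),
           file.path(toy_dir, "features.tsv"))
writeLines(c("AAAC", "AAAG"), file.path(toy_dir, "barcodes.tsv"))

# Read the matrix back and count the lines of the two annotation files.
mtx <- readMM(file.path(toy_dir, "matrix.mtx"))
feature_lines <- readLines(file.path(toy_dir, "features.tsv"))
barcode_lines <- readLines(file.path(toy_dir, "barcodes.tsv"))

# Rows should match features and columns should match barcodes.
dims_ok <- nrow(mtx) == length(feature_lines) &&
  ncol(mtx) == length(barcode_lines)

# Feature IDs (first column) should be unique to serve as row names.
feature_ids <- sub("\t.*$", "", feature_lines)
ids_unique <- anyDuplicated(feature_ids) == 0

print(c(dims_ok = dims_ok, ids_unique = ids_unique))
```

On the real data the same checks correspond to 13968 features and 9042 barcodes, which is what the Seurat object above reports.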


Thank you so much!