Issues with Mixture file when using CIBERSORTx
4 months ago

Hi,

I am trying to run a deconvolution analysis of bulk RNA-seq samples using the provided LM22 signature matrix. I converted all Ensembl IDs to gene symbols and removed NA and duplicated entries.

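library(AnnotationDbi)  # mapIds()
library(org.Hs.eg.db)   # human annotation database
library(dplyr)
library(tibble)         # rownames_to_column(), column_to_rownames()

counts_salmon <- as.data.frame(txi$counts)

# Map each Ensembl gene ID (row name) to its gene symbol
counts_salmon$symbol <- mapIds(org.Hs.eg.db,
                            keys = rownames(counts_salmon),
                            column = "SYMBOL",
                            keytype = "ENSEMBL")

# Keep the first row per symbol, drop unmapped genes, use symbols as row names
counts_salmon <- counts_salmon |>
  distinct(symbol, .keep_all = TRUE) |>
  rownames_to_column(var = "ensbl") |>
  select(-ensbl) |>
  filter(!is.na(symbol)) |>
  column_to_rownames(var = "symbol")

counts_salmon <- na.omit(counts_salmon)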

write.table(counts_salmon , file = 'output/counts_salmon.tsv', append = FALSE, sep = "\t", 
            row.names = TRUE, col.names = TRUE, quote = FALSE)

The output is a .tsv file without double quotes:

Genes   rna_11  RNA_26  RNA_8   RNA_16  RNA_19  rna_47  RNA_3   RNA_24
TSPAN6  0   0   8   0   249.567 76.756  26.741  308.308
TNMD    0   0   0   0   0   0   38  0
DPM1    58.092  31.013  0   67  303.226 570.16  48.289  1078.792
SCYL3   39.036  42.86   0   0   27.801  146.749 7   414.861
C1orf112    14  1   0   0   38.234  91.923  87.89   165.261
FGR 0   47  0   1   25  69  0   158
...

I'm using this file as the mixture file in CIBERSORTx, with the following parameters:

[Options] perm: 1
[Options] verbose: TRUE
[Options] rmbatchBmode: TRUE
[Options] QN: FALSE
[Options] outdir: files/mam9823@med.cornell.edu/results/
[Options] label: Job11
=============CIBERSORTx Settings===============
Mixture file: files/mam9823@med.cornell.edu/counts_salmon.tsv 
Signature matrix file: files/common/LM22.update-gene-symbols.txt 
Number of permutations set to: 1 
Enable verbose output
Do B-mode batch correction
==================CIBERSORTx===================
All done.

However, I keep getting this error:

Error: $ operator is invalid for atomic vectors
In addition: Warning messages:
1: In CIBERSORTxFractions(sigmatrix = sigmatrix, mixture = mixture,  :
  22292 duplicated gene symbol(s) found in mixture file!
2: In mclapply(1:svn_itor, res, mc.cores = svn_itor) :
  all scheduled cores encountered errors in user code
Execution halted

Thanks a lot in advance for any help!

I got exactly the same error message. So I am curious to see how other people solved this... Did you already contact the authors about this?


Hello! I ran into the same error and managed to work out how it happens. The "duplicated gene symbol(s)" in the warning message actually refers to the first column of your mixture file as CIBERSORTx reads it, NOT to your row names. In other words, it mistook your first column of expression data for the row names (gene symbols).

This is the probable cause: when you run write.table in R with row.names = TRUE, the row names are written as the REAL first column, but the header line gets no name for that column. Because the REAL first column has no column name, the header is one field shorter than the data rows, the gene symbols are not recognized as such, and the error occurs.

Here's my solution (it WORKS):

mixture_file <- cbind(rownames(mixture_file), mixture_file)
write.table(mixture_file, file = "mixture_file.txt", sep = "\t", row.names = FALSE, col.names = TRUE, quote = FALSE)
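If you want to check this on your own data before uploading, here is a minimal sketch on a toy table. The three genes and two samples are made-up stand-ins for the real counts, the column name "Genes" is simply taken from the header shown in the question, and the temporary file paths are placeholders. It counts the tab-separated fields per line for both ways of writing the file, so you can see the header mismatch and confirm the fix.

```r
# Toy stand-in for the real count matrix (hypothetical values);
# gene symbols are stored as row names, as in the question.
mixture_file <- data.frame(
  rna_11 = c(0, 0, 58.092),
  RNA_26 = c(0, 0, 31.013),
  row.names = c("TSPAN6", "TNMD", "DPM1")
)

# Original approach: row names are written, but the header has no field for them.
bad <- tempfile(fileext = ".tsv")
write.table(mixture_file, file = bad, sep = "\t",
            row.names = TRUE, col.names = TRUE, quote = FALSE)
bad_fields <- count.fields(bad, sep = "\t")
print(bad_fields)   # the header line has one field fewer than the data lines

# Fix: move the gene symbols into a named first column and drop the row names.
fixed <- cbind(Genes = rownames(mixture_file), mixture_file)
good <- tempfile(fileext = ".tsv")
write.table(fixed, file = good, sep = "\t",
            row.names = FALSE, col.names = TRUE, quote = FALSE)
good_fields <- count.fields(good, sep = "\t")
print(good_fields)  # every line now has the same number of fields

# Read the fixed file back and confirm the gene column is named and unique.
check <- read.delim(good, check.names = FALSE)
stopifnot(length(unique(good_fields)) == 1, !anyDuplicated(check$Genes))
```

Running the same count.fields() check on your real .tsv should tell you quickly whether the header and the data rows line up. I can't say for certain how CIBERSORTx parses the file internally, so treat this only as a check of the file layout.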
