STARsolo parameters for 10X Multiome kit (scRNA-seq & scATAC-seq)
3.3 years ago
thanospas • 0

Hi everyone :)

I would like to ask about using STARsolo to count reads for the two libraries of the 10X Multiome kit. First, some background. I tried running the cellranger-arc (2.0.1) pipeline, but no matter what I try, I cannot generate a reference for the Xenopus laevis genome. To make a long story short, cellranger-arc mkref (like cellranger-atac) rejects the formatting of the annotation file, while the standard Cell Ranger v6.1.2 has no problem with it (GTF source: Xenbase). 10X Genomics has indirectly declined to generate a reference for us, so I am looking for an alternative rather than abandoning our plans to run the kit.

Could anyone please tell me if the following command makes sense for the Multiome kit or provide a correct one for STARsolo?

STAR --genomeDir genome_idx/ \
  --readFilesIn s1read2.fq,s2read2.fq s1read1.fq,s2read1.fq \
  --soloType CB_UMI_Simple \
  --soloCBwhitelist /10X_barcode_whitelists/737K-arc-v1.txt.gz \
  --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 12 \
  --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts \
  --soloUMIfiltering MultiGeneUMI_CR \
  --soloUMIdedup 1MM_CR \
  --soloCellFilter EmptyDrops_CR 3000 0.99 10 45000 90000 500 0.01 20000 0.01 10000 \
  --soloCellReadStats Standard \
  --clipAdapterType CellRanger4 \
  --outFilterScoreMin 30 \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes CR UR CY UY CB UB \
  --runThreadN 64 \
  --soloFeatures GeneFull \
  --outFileNamePrefix s8 \
  --quantMode GeneCounts

There are a couple of things with this command that I am not sure of:

First, should I list the GEX and ATAC FASTQ files in a specific order? I know that the files for a single sample are given as:

$R2 $R1

I have read the section of the STAR manual on passing multiple samples, but it does not mention multi-modal read counting on the same sample. Would something like this make sense, for example?

$GEX_R2,$ATAC_R2 $GEX_R1,$ATAC_R1
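
For reference, STAR takes several files per mate as a comma-separated list without spaces, and STARsolo expects the cDNA read first and the barcode read second. The following is a minimal sketch of building that argument, reusing the file names from the command above. The helper name `join_by_comma` is made up for illustration, the script prints the argument instead of running STAR, and it says nothing about whether GEX and ATAC reads can share one run.

```shell
#!/bin/sh
# Join all arguments with commas and no spaces, the list form that
# STAR expects in --readFilesIn for several files of the same mate.
# (join_by_comma is a hypothetical helper, not part of STAR.)
join_by_comma() {
  ( IFS=,; printf '%s\n' "$*" )
}

# File names taken from the command in the question.
cdna_list=$(join_by_comma s1read2.fq s2read2.fq)      # R2: cDNA reads
barcode_list=$(join_by_comma s1read1.fq s2read1.fq)   # R1: barcode + UMI

# cDNA list first, barcode list second, separated by one space.
readfiles_arg="$cdna_list $barcode_list"
echo "--readFilesIn $readfiles_arg"
```

Both lists must name the files in the same order, so that the first cDNA file is matched with the first barcode file.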

Second, I am unsure about the --soloType CB_UMI_Simple parameter. 10X provides the 737K-arc-v1.txt.gz whitelist, but this list actually comes in two variants, one for the GEX library and one for the ATAC library, as described here. Opening the two .txt files shows that the barcodes really are different. This makes me think I should use --soloType CB_UMI_Complex instead. If that is the way to go, how should I pass the two whitelists, and in what order? For example, what about --soloCBwhitelist:

GEX_list, ATAC_list
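
One way to look at how the two lists relate is to count the barcodes they share and to put them side by side. The sketch below assumes both lists have been decompressed to plain text, and it assumes that row i of the GEX list corresponds to row i of the ATAC list. That pairing is an assumption to verify against the 10X documentation, not something established in this thread. The function names and the toy barcodes are made up for illustration.

```shell
#!/bin/sh
# Count barcodes that appear in both lists (exact whole-line matches).
count_shared() {
  grep -Fxc -f "$1" "$2"
}

# Print "GEX<TAB>ATAC" pairs row by row.
# ASSUMPTION: row i of one list belongs with row i of the other.
pair_whitelists() {
  n_gex=$(awk 'END { print NR }' "$1")
  n_atac=$(awk 'END { print NR }' "$2")
  if [ "$n_gex" -ne "$n_atac" ]; then
    echo "line counts differ: $n_gex vs $n_atac" >&2
    return 1
  fi
  paste "$1" "$2"
}

# Toy two-barcode lists standing in for the real, decompressed whitelists.
tmp=$(mktemp -d)
printf 'AAACAGCCAAACAACA\nAAACAGCCAAACATAG\n' > "$tmp/gex.txt"
printf 'ACAGCGGGTGTAGTAC\nACAGCGGGTGTTCGCA\n' > "$tmp/atac.txt"

shared=$(count_shared "$tmp/gex.txt" "$tmp/atac.txt")
pairs=$(pair_whitelists "$tmp/gex.txt" "$tmp/atac.txt")

echo "shared barcodes: $shared"
printf '%s\n' "$pairs"
rm -rf "$tmp"
```

If the shared count on the real files is zero or close to it, a single whitelist cannot match both libraries, which is consistent with what the two files look like when opened.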

Third, do you see any problem with the --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 12 parameters? I am reusing these settings from the 3' Chromium v3 chemistry, since some people assume the two are compatible.
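
These settings imply a barcode read of at least 16 + 12 = 28 bases, with the UMI starting right after the barcode. A minimal sketch of that sanity check follows. It uses a toy FASTQ record in place of the real barcode read file, and `first_read_length` is a made-up helper. It only checks that the reads are long enough for the settings and does not confirm the chemistry.

```shell
#!/bin/sh
# Geometry taken from the command in the question.
CB_START=1
CB_LEN=16
UMI_START=17
UMI_LEN=12

# Last base that the barcode + UMI layout reads from (1-based).
required_len=$((UMI_START + UMI_LEN - 1))

# Length of the first sequence in an uncompressed FASTQ file.
# (For .gz input, pipe `gzip -cd file` into the same awk command.)
first_read_length() {
  awk 'NR == 2 { print length($0); exit }' "$1"
}

# Toy record: 16-base barcode followed by a 12-base UMI.
tmp=$(mktemp -d)
printf '@toy_read\nAAACAGCCAAACAACAACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIIIIIIIIIII\n' \
  > "$tmp/barcode_read.fq"

read_len=$(first_read_length "$tmp/barcode_read.fq")
rm -rf "$tmp"

# The UMI should begin on the base right after the barcode.
if [ "$UMI_START" -ne $((CB_START + CB_LEN)) ]; then
  echo "warning: UMI does not start right after the barcode"
fi

if [ "$read_len" -ge "$required_len" ]; then
  echo "read length $read_len, need $required_len: ok"
else
  echo "read length $read_len, need $required_len: too short"
fi
```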

Lastly, would it be better to run the two libraries separately and avoid the multi-sample/multi-modal setup altogether?

I am sorry for this long text.

Kind regards, Thanos

scRNA-seq scATAC-seq STARsolo 10X_Multiome Xenopus • 2.2k views

Hi. I do not have an answer for you, but I would like to ask whether you solved your problem. I also want to try mapping the FASTQ files generated from the Multiome kit with STARsolo, because mapping with Cell Ranger ARC does not give me a high number of mapped genes.


You may want to contact 10x technical support about cellranger-arc and your data. It is possible that you are doing something wrong, in which case they can help you. If not, your data may not be up to the mark (not what you want to hear), and using STARsolo or other tools will not fix that.


Thank you so much!!! I solved my problem
