4.0 years ago
Kevin Blighe
Instigated due to another question: RNA-Seq Cell Barcode Whitelist 10X
I am adding this for the benefit of others, as I have not found any other resource that states the following information clearly.
The values below are useful for configuring STARsolo parameters when re-aligning 10X Chromium FASTQs.
10X v1
- Whitelist, 737K-april-2014_rc.txt
- CB length, 14
- UMI start, 15
- UMI length, 10 (courtesy ATpoint)
10X v2
- Whitelist, 737K-august-2016.txt
- CB length, 16
- UMI start, 17
- UMI length, 10
10X v3
- Whitelist, 3M-Feb_2018_V3.txt
- CB length, 16
- UMI start, 17
- UMI length, 12
As per ATpoint, whitelists are available from: https://github.com/10XGenomics/cellranger/tree/master/lib/python/cellranger/barcodes
These are implemented in STAR as:
STAR \
...
--soloCBwhitelist [whitelist] \
--soloCBlen [CB length] \
--soloUMIstart [UMI start] \
--soloUMIlen [UMI length] \
...
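As a concrete illustration, the values listed above can be collected in a small shell helper that prints the four options for a given chemistry. This is only a sketch: the function name solo_flags is made up for this example and is not part of STAR, and the whitelist files are assumed to sit in the current directory. The other STAR options (the elided "..." above) are deliberately left out.

```shell
# Hypothetical helper (not part of STAR): print the STARsolo barcode options
# for a 10X chemistry, using the values listed in this post.
solo_flags() {
  local whitelist cb_len umi_start umi_len
  case "$1" in
    v1) whitelist=737K-april-2014_rc.txt; cb_len=14; umi_start=15; umi_len=10 ;;
    v2) whitelist=737K-august-2016.txt;   cb_len=16; umi_start=17; umi_len=10 ;;
    v3) whitelist=3M-Feb_2018_V3.txt;     cb_len=16; umi_start=17; umi_len=12 ;;
    *)  echo "unknown chemistry: $1" >&2; return 1 ;;
  esac
  echo "--soloCBwhitelist $whitelist --soloCBlen $cb_len --soloUMIstart $umi_start --soloUMIlen $umi_len"
}

# Show the options for 10X v3.
solo_flags v3
```

The output can be spliced into the command line unquoted, e.g. STAR ... $(solo_flags v2) ..., so that the shell splits it into separate arguments.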
Technically, STARsolo can also be run with --soloCBwhitelist None if no whitelist is provided.
Kevin
Here is a link to the old v1 chemistry datasheet. If I read it correctly, the "UMI", as it is called today, was 10 bp long and was called a "randomer" back in the day.
Old post, but I'm hoping someone can help. I was sent some 10X v2 data. The FASTQ for Read 1 is a full 150 bp, so STARsolo tells me the barcode sequence is too long. The Read 1 sequences all look like this:
You will need to trim this read down to a suitable length if STARsolo does not accept a 150 bp read. As you can see, the remainder of the read is just polyA tail. You can hard-trim reads using reformat.sh from BBMap or any other trimming tool.
I got 10X data that was sequenced on a NextSeq. The library was prepared with the Chromium Next GEM Single Cell 3' GEM, Library Kit v3.1, but the R1 reads have 27 bases (for example, CCTTTCAGTCGCATCGGAACCCACTGC). An entry from the whitelist (3M-Feb_2018_V3.txt) looks like AAACCCAAGAAACACT. I also tried the version 2 settings and running without a whitelist, but it still does not work. Any suggestions as to what may be wrong with the barcode specifications and read length?
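For the over-long Read 1 case discussed above (150 bp reads from a 10X v2 library), a hard trim to CB length + UMI length (16 + 10 = 26 bases for v2) could be sketched with plain awk rather than a dedicated trimming tool. The function name trim_r1 is a placeholder invented for this example, and the sketch assumes an uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: keep only the first N bases (and qualities) of each read.
# Reads a plain 4-line-per-record FASTQ on stdin; N = CB length + UMI length.
trim_r1() {
  # Even-numbered lines are sequence and quality; header and "+" lines pass through.
  awk -v n="$1" 'NR % 2 == 0 { print substr($0, 1, n); next } { print }'
}

# Demonstration: one made-up 30-base record trimmed to 26 bases (10X v2).
printf '@read1\nAAACCTGAGAAACCAGCGTACGTACGAAAA\n+\nIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\n' | trim_r1 26
```

On real data this would typically sit in a pipe, for example zcat R1.fastq.gz | trim_r1 26 | gzip > R1.trimmed.fastq.gz (file names are placeholders). As far as I can tell from the STAR manual, --soloBarcodeReadLength 0 disables the barcode read length check, which may make trimming unnecessary.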
Looks like your sequencing facility sequenced R1 one base pair short (v3 expects 16 + 12 = 28 bases). I guess you can try specifying --soloBarcodeReadLength 27.
Unfortunately, https://singlecell.usegalaxy.eu/ will not let me fine-tune such parameters.
Looks like you will have to run this on the command line in that case.
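Before choosing a value for --soloBarcodeReadLength on the command line, it may help to confirm how long the R1 reads actually are. The following is a minimal sketch: the function name r1_lengths is invented for this example, and it assumes an uncompressed FASTQ with four lines per record.

```shell
# Hypothetical helper: tally Read 1 sequence lengths from a plain FASTQ on stdin.
# Prints one "length count" pair per line, sorted by length.
r1_lengths() {
  awk 'NR % 4 == 2 { count[length($0)]++ }
       END { for (len in count) print len, count[len] }' | sort -n
}

# Demonstration with the Read 1 sequence quoted in the thread.
r1=CCTTTCAGTCGCATCGGAACCCACTGC
qual=$(printf '%s' "$r1" | tr 'ACGT' 'IIII')   # dummy qualities of equal length
printf '@read1\n%s\n+\n%s\n' "$r1" "$qual" | r1_lengths
```

If nearly all reads report a length one below CB length + UMI length (28 for 10X v3), that supports the "one base short" explanation given above.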
Kevin Blighe, would you be able to help me with which whitelist to use for the Chromium 5’ Next GEM single cell kit from 10X? I can't find information on what to use for processing data from this assay.
Barcode lists are included in the cellranger software package, which you can download for free (you will need to provide an email address). Specific file names and their locations inside the software package are noted here: https://kb.10xgenomics.com/hc/en-us/articles/115004506263-What-is-a-barcode-whitelist