Forward and reverse single cell data analysis
6 weeks ago
elb

Hi, I have to analyze published scRNA-Seq data. The data are 10X Chromium libraries sequenced on a NovaSeq 6000. For each sample I have forward reads (R1) from lane 1 and lane 2 (two files) and reverse reads (R2) from lane 1 and lane 2 (two more files), so 4 .fastq files per sample in total. I usually analyze 10X scRNA-Seq data with Cellranger in single strand, starting from demultiplexing and ending with alignment. This is the first time I have forward and reverse fastq files across multiple lanes, and I don't know how to analyze them. Can anyone give me some hints on the analysis I need to do? Suggestions for tutorials, tools, or papers relevant to this task are more than welcome.

Are you still planning to use cellranger?

No, it is the usual practice in the lab, but it is not mandatory.

6 weeks ago
ATpoint

Just use cat to merge the R1s and the R2s respectively.

cat R1_lane1.fastq.gz R1_lane2.fastq.gz > R1.fastq.gz

Same for R2.
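As a minimal sketch of the merge for both reads, assuming two lanes per sample: the file names and the toy FASTQ records below are hypothetical stand-ins, created in a scratch directory so the example runs on its own. The important detail is that the lanes are concatenated in the same order for R1 and R2, so that read pairs stay in sync.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Toy stand-ins for the real per-lane files (hypothetical names, 2 records each).
for read in R1 R2; do
  for lane in lane1 lane2; do
    printf '@%s_%s_a\nACGT\n+\nFFFF\n@%s_%s_b\nTGCA\n+\nFFFF\n' \
      "$read" "$lane" "$read" "$lane" | gzip > "${read}_${lane}.fastq.gz"
  done
done

# Concatenate lanes in the SAME order for R1 and R2 so mates stay paired.
for read in R1 R2; do
  cat "${read}_lane1.fastq.gz" "${read}_lane2.fastq.gz" > "${read}.fastq.gz"
done

# Sanity check: both merged files should hold the same number of lines.
r1_lines=$(( $(gzip -dc R1.fastq.gz | wc -l) ))
r2_lines=$(( $(gzip -dc R2.fastq.gz | wc -l) ))
echo "R1 lines: ${r1_lines}, R2 lines: ${r2_lines}"
```

On real data you would skip the toy-file step and point the `cat` loop at your own per-lane files; comparing the line counts afterwards is a cheap way to catch a missing or truncated lane.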

See the answer from swbarnes2 below; apparently CellRanger is picky about file names. I am not a CellRanger user. I prefer lightweight approaches such as Alevin for scRNA-seq quantification, and for this (and probably most other applications) concatenating with cat is fine. A CellRanger-like approach that is faster and more generic (= less picky) would be STARsolo.

I think you meant "zcat"

cat works as well.

No, zcat would decompress the stream, which is not what we want here. cat can concatenate two gzip-compressed files into a single valid gzip-compressed file.

echo 123 | gzip > foo.txt.gz
echo 456 | gzip > bar.txt.gz
cat foo.txt.gz bar.txt.gz | gzip -dc
123
456

That's great. Good to know and thanks for the tip.

That will not work. Cellranger is fussy about file names; it won't read that file in.

#til -- another reason to use Alevin over CellRanger :-P

6 weeks ago
swbarnes2

Don't cat the files. Cellranger is smart: it will pick up files with the same sample name from all lanes.
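A rough sketch of what that can look like in practice, if the downloaded files do not already follow the naming Cellranger expects. This assumes the bcl2fastq-style pattern `<sample>_S1_L00<lane>_<read>_001.fastq.gz` described in the 10x documentation; the sample name, input file names, and reference path are all hypothetical, and the toy files exist only so the example runs on its own. Symlinks are used so the original files stay untouched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory with toy stand-ins for the downloaded files (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
mkdir raw fastqs
for read in R1 R2; do
  for lane in 1 2; do
    printf '@r\nACGT\n+\nFFFF\n' | gzip > "raw/${read}_lane${lane}.fastq.gz"
  done
done

sample=mysample   # hypothetical sample name

# Link each file under a bcl2fastq-style name, one file per lane and read.
for read in R1 R2; do
  for lane in 1 2; do
    ln -s "$PWD/raw/${read}_lane${lane}.fastq.gz" \
          "fastqs/${sample}_S1_L00${lane}_${read}_001.fastq.gz"
  done
done

ls fastqs
n_links=$(( $(ls fastqs | wc -l) ))

# The run itself would then look roughly like this (not executed here;
# the reference path is a placeholder and flags may differ between versions):
# cellranger count --id="${sample}_run" --transcriptome=/path/to/refdata \
#                  --fastqs="$PWD/fastqs" --sample="$sample"
```

With all lanes of a sample in one directory under the same sample name, there is no need to merge anything by hand.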