Question: How to align 10X R1 and R2 fastqs?
7 months ago by
sambunga0940 wrote:

Hi guys, I am new to bioinformatics, so I'll keep my question simple. I have downloaded a single-cell FASTQ dataset from 10X. It has R1 and R2 files for lane 1 and lane 2. How can I align them and combine them into one FASTQ file?

Example:

Input:

neurons_mouse_L001_R1.fastq

neurons_mouse_L002_R1.fastq

neurons_mouse_L001_R2.fastq

neurons_mouse_L002_R2.fastq

Expected output:

neurons_mouse.fastq

ADD COMMENTlink modified 7 months ago by Zhixue10 • written 7 months ago by sambunga0940

What kind of 10x data is this?

ADD REPLYlink written 7 months ago by genomax78k

This is mouse brain single cell data. https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/neuron_1k_v2

ADD REPLYlink written 7 months ago by sambunga0940

28 bp read 1 (16 bp Chromium barcode and 12 bp UMI) = R1, 91 bp read 2 (transcript) = R2, and an 8 bp I7 sample barcode.

What are you planning to do next? The same sample was run on two lanes, so the files can be concatenated.

ADD REPLYlink modified 7 months ago • written 7 months ago by genomax78k

I'm planning to build a matrix output so I can analyze it with Seurat. Can I cat R1 and R2 together? Could you please explain?

Thank you so much!

ADD REPLYlink written 7 months ago by sambunga0940

The link you posted above also has the analyzed data, so unless you want to recreate the analysis, you could just get the result files from there directly.

ADD REPLYlink written 7 months ago by genomax78k

Please first get a background in (sc)RNA-seq analysis by reading tutorials, e.g.

http://research.fhcrc.org/content/dam/stripe/sun/software/scRNAseq/scRNAseq.html

https://bioconductor.org/packages/devel/workflows/vignettes/simpleSingleCell/inst/doc/intro.html

https://davetang.org/muse/2018/08/09/getting-started-with-cell-ranger/

There are many more available, just search the web.

Standard tools for alignment or quantification are CellRanger and Alevin; the latter is a more recent development that relies on the quantification strategy of salmon plus a couple of other cool features (see the docs). There is extensive documentation available for both.

ADD REPLYlink modified 7 months ago • written 7 months ago by ATpoint30k

You can align the paired-end data directly (as two lanes, or merged into one if that is appropriate for your data), or transform it into a single-end file (which might lose the pairing information), like

@read1/1
xxx
xxx
xxx
......
@read1/2
xxx
xxx
xxx

Use commands like:

zcat sample_1.fq.gz | awk '{if(NR%4==1) print $0"/1"; else print $0}' > sample_onefile.fq
zcat sample_2.fq.gz | awk '{if(NR%4==1) print $0"/2"; else print $0}' >> sample_onefile.fq

If it is single-end, the order of reads is not important.

ADD REPLYlink modified 7 months ago • written 7 months ago by Zhixue10

I will move this to a comment as it does not really answer the 10X-related question. With 10X data you should never merge the paired files into one, as R1 contains the barcode/UMI information, which is processed separately from the R2 (cDNA) sequence. If you want to merge files (this is called an interleaved fastq), better use something like seqtk, which does it more conveniently. You also should not append /1 or /2, as (to my knowledge) most aligners expect identical read names for the two mates.
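
To illustrate the point about read names, here is a minimal sketch of a check that R1 and R2 list the same reads in the same order. It assumes uncompressed four-line FASTQ records whose header lines contain no tabs; the helper name same_read_names is made up for this example and is not part of any tool.

```shell
# Hypothetical helper: exit status 0 only if both files carry the same
# read names in the same order.
same_read_names() {
    # paste puts line N of the first file next to line N of the second,
    # separated by a tab.
    paste "$1" "$2" | awk -F'\t' '
        NR % 4 == 1 {                 # header line of each 4-line record
            split($1, a, " ")         # a[1] = read name without the comment
            split($2, b, " ")
            if (a[1] != b[1]) { bad = 1; exit }
        }
        END { exit bad + 0 }'
}
```

For example, `same_read_names sample_R1.fastq sample_R2.fastq && echo "names match"` (file names are placeholders). Only the part of the header before the first space is compared, since the comment after it differs between the mates.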

ADD REPLYlink written 7 months ago by ATpoint30k
7 months ago by
swbarnes27.5k
United States
swbarnes27.5k wrote:

I have downloaded a single-cell FASTQ dataset from 10X. It has R1 and R2 files for lane 1 and lane 2. How can I align them and combine them into one FASTQ file?

You don't make one fastq file. You can and should concatenate the data from two lanes into one R1 and one R2 file, but cellranger takes both of them separately as input. Did you look at the guides for using cellranger on the 10XGenomics website?
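
The concatenation could be sketched roughly as follows, assuming uncompressed files named as in the question (e.g. neurons_mouse_L001_R1.fastq); the function name merge_lanes is made up for illustration.

```shell
# Hypothetical helper: concatenate the per-lane files of one read in lane order.
# $1 = sample prefix (e.g. neurons_mouse), $2 = read (R1 or R2)
merge_lanes() {
    # The glob expands in sorted order (L001 before L002), so R1 and R2
    # are merged in the same lane order and their reads stay in sync.
    cat "${1}"_L[0-9][0-9][0-9]_"${2}".fastq > "${1}_${2}.fastq"
}
```

Running `merge_lanes neurons_mouse R1` and then `merge_lanes neurons_mouse R2` would give neurons_mouse_R1.fastq and neurons_mouse_R2.fastq, which stay separate. Gzipped lane files can be concatenated with cat in the same way, since concatenated gzip streams form a valid gzip file.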

ADD COMMENTlink modified 7 months ago • written 7 months ago by swbarnes27.5k