Question

bcl2fastq results in Poly-N in R1

4.2 years ago

Assa Yeroslaviz ★ 1.8k

Hi, we're working on scRNA-Seq samples where R1 is 16 bases long and holds the cellular and molecular barcodes, while R2 is 150 bases long and contains the genomic sequence.

This data set looks problematic in several ways. We think we have sequenced into the adapter, because FastQC shows an over-representation of poly-A stretches in R2 and a drop in quality, as one can see in the first attachment.

[Attachment: R2 polyA tail]

For that reason we would probably hard-trim the samples (maybe after quality trimming) before the quality drop. What we don't understand is why our R1 shows only stretches of N.

[Attachment: PolyN-Stretches]

Does anyone have an explanation for this kind of behavior? Has anyone seen something like this before? bcl2fastq did not report any errors at all, and the R1 fastq is just a long list of reads consisting of poly-N stretches.

Thanks, Assa

bcl2fastq paired-end RNA-Seq scRNA-Seq single-cell • 1.7k views

updated 4.2 years ago by GenoMax 141k • written 4.2 years ago by Assa Yeroslaviz ★ 1.8k

Do you have access to the library QC? (pre-sequencing)

4.2 years ago by Asaf 10k

Yes, but the QC is good.

4.2 years ago by Assa Yeroslaviz ★ 1.8k

score 3 · Accepted Answer · 2020-02-20

4.2 years ago

GenoMax 141k

You need to run bcl2fastq with the option --mask-short-adapter-reads 0 to prevent the short first read from being masked. Reads shorter than about 25 bp are normally masked with Ns, which is why a 16-base R1 comes out as poly-N even though bcl2fastq reports no errors.
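To make the suggestion concrete, here is a minimal Python sketch of how the demultiplexing could be re-run and checked. It is only an illustration: the run folder, output folder, and sample sheet paths are hypothetical placeholders, and the helper names build_bcl2fastq_cmd and fraction_all_n are made up for this example. The only setting taken from the answer itself is --mask-short-adapter-reads 0; the other flags are the standard bcl2fastq input/output options.

```python
from typing import Iterable, List


def build_bcl2fastq_cmd(run_dir: str, out_dir: str, sample_sheet: str) -> List[str]:
    """Assemble a bcl2fastq command line with short-read masking disabled.

    All three paths are placeholders supplied by the caller.
    """
    return [
        "bcl2fastq",
        "--runfolder-dir", run_dir,
        "--output-dir", out_dir,
        "--sample-sheet", sample_sheet,
        # The setting recommended in the answer: do not mask short reads.
        "--mask-short-adapter-reads", "0",
    ]


def fraction_all_n(fastq_lines: Iterable[str]) -> float:
    """Return the fraction of reads whose sequence consists only of N.

    Assumes plain 4-line FASTQ records (header, sequence, '+', qualities).
    """
    total = 0
    all_n = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the second line of each record
            continue
        seq = line.strip().upper()
        total += 1
        if seq and set(seq) == {"N"}:
            all_n += 1
    return all_n / total if total else 0.0
```

A possible way to use it: pass the list from build_bcl2fastq_cmd("/path/to/run", "/path/to/out", "/path/to/SampleSheet.csv") to subprocess.run(..., check=True), then open the new R1 file with gzip.open(path, "rt") and feed it to fraction_all_n. If the masking was the cause, the fraction should drop from close to 1.0 to close to 0.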