Question

Opinion on FastQC output for HiSeq 4000 PE sequencing run


5.7 years ago

quokka ▴ 10

Hello,

I recently had four 400bp insert plant DNA libraries sequenced (2x150bp) using one HiSeq 4000 lane.

I've attached FastQC outputs for R1 and R2 of one of these libraries.

It seems like there are some issues with low-flow sections on the flow cell(?). R2 reads are noticeably lower quality than the R1 reads for all libraries.

~94% of reads from the flow cell remain after deduplication with BBMap (clumpify).

~62% of deduplicated reads remain after quality trimming with Trimmomatic (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:100).
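
To make those trimming settings concrete, here is a rough Python sketch of what the four steps do to a single read's quality scores. This is not Trimmomatic's code: the function name `trim_read` and the example quality values are made up for illustration, and this version simply cuts at the start of the first failing window, which may not match exactly where the real tool cuts.

```python
def trim_read(quals, leading=3, trailing=3, window=4, min_mean=20, min_len=100):
    """Return (start, end) of the kept region, or None if the read is dropped.

    quals is a list of per-base Phred scores. Simplified approximation of
    LEADING / TRAILING / SLIDINGWINDOW / MINLEN applied in that order.
    """
    # LEADING: skip bases below the threshold at the 5' end
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1

    # TRAILING: skip bases below the threshold at the 3' end
    end = len(quals)
    while end > start and quals[end - 1] < trailing:
        end -= 1

    # SLIDINGWINDOW: scan 5' to 3' and cut where the window mean drops too low
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_mean:
            end = i
            break

    # MINLEN: drop reads that end up too short
    return (start, end) if end - start >= min_len else None


# Hypothetical 150 bp reads: quality falls off after 120 bp vs. after 90 bp
late_dropoff = [35] * 120 + [10] * 30
early_dropoff = [35] * 90 + [10] * 60
print(trim_read(late_dropoff))   # kept, but shortened
print(trim_read(early_dropoff))  # dropped by the 100 bp minimum
```

With MINLEN:100 on 150 bp reads, any read whose quality falls below the window threshold before roughly position 100 is discarded outright, so a tail-heavy quality dropoff in R2 can remove a large share of pairs.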

There seems to be some remaining adapter in some of the library reads.

Overall I've ended up with 235,011,160 read pairs from the whole lane after deduplicating and trimming.
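
As a back-of-the-envelope check, the retention figures can be combined to estimate how many raw read pairs the lane produced. This assumes the ~94% and ~62% figures apply to read pairs across the whole lane, which is only approximately true since both are rounded; the variable names are just for illustration.

```python
# Figures quoted in the post
final_pairs = 235_011_160   # read pairs left after dedup + trimming
dedup_kept = 0.94           # ~94% survive deduplication
trim_kept = 0.62            # ~62% of those survive quality trimming

# Fraction of the raw lane output that survives both steps
overall_kept = dedup_kept * trim_kept

# Implied raw yield for the lane (an estimate, given the rounded inputs)
raw_pairs_estimate = final_pairs / overall_kept

print(f"Overall retention: {overall_kept:.1%}")
print(f"Estimated raw read pairs: {raw_pairs_estimate / 1e6:.0f} million")
```

So roughly 58% of the raw pairs survive, which implies on the order of 400 million raw read pairs for the lane.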

Other info: library preparation was PCR-free from physically sheared DNA; inserts were sized by gel purification; libraries were dual indexed.

My questions are:

Is this quality/quantity typical from a commercial provider using this platform?
Is the sequence bias at the 5' end of the reads in these libraries typical for a PCR-free library generated from physically sheared DNA?

Any additional comments appreciated.

Thanks in advance

R1

R2

next-gen HiSeq 4000 PCR-free Illumina bias • 1.8k views


score 3 · Answer 1 · 2018-07-30


5.7 years ago

igor 13k

Based on FastQC, these libraries look fine. These are long reads, so it's normal to notice a quality dropoff toward the end of R2.

I think you will find these posts very helpful:



Ok. Thanks Igor.

My first time with 150bp reads (and HiSeq 4000), so it's good to get an idea of what's normal. Appreciate your insights.

I was a bit curious about the sequence bias at the 5' end because our library preparation didn't involve the use of transposases or random priming - nevertheless, I guess this could be because shearing and adapter ligation are somewhat sequence dependent.
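
For anyone wanting to look at the 5' bias outside of FastQC, the per-position base composition can be tallied directly from the read sequences. The sketch below is only an illustration of the idea behind FastQC's "Per base sequence content" plot, not its implementation; the function name `per_position_composition` and the toy reads are invented for the example.

```python
from collections import Counter


def per_position_composition(reads, n_positions=15):
    """Fraction of A/C/G/T at each of the first n_positions of the reads."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in reads:
        # Only the 5' end is of interest here; shorter reads contribute less
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1

    composition = []
    for c in counts:
        total = sum(c.values())
        # Ambiguous bases (e.g. N) count toward the total but are not reported
        composition.append({b: c[b] / total for b in "ACGT"} if total else {})
    return composition


# Toy example; in practice the sequences would come from the FASTQ file
toy_reads = ["ACGT", "ACGA", "TCGT", "ACGT"]
for pos, fractions in enumerate(per_position_composition(toy_reads, 4), start=1):
    print(pos, fractions)
```

If the first few positions deviate from the genome-wide base composition and then flatten out, that would be consistent with sequence preferences in fragmentation or ligation rather than a problem with the run.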

5.7 years ago by quokka ▴ 10