Question

Is the sequence quality good enough?

0

6.2 years ago

BioinfGuru ★ 2.0k

Hi all,

Please take a look at these sequence quality histograms from FastQC.

Sample 1:

Sample 2:

This is WGS data sequenced on an Illumina HiSeq 4000. We intend to call SNPs and indels, and possibly structural variants. In the future we may even use the data set for imputation.

I have 4 options. I'm not really experienced enough to make the call myself, so I'd like some informed opinions:

1. Perform another size selection step to narrow the spread in the library pool so the HiSeq4000 can accommodate without read2 quality dropping as it did in the first run. We have QC’ed the library following a second round of sizing and it does look much better in terms of suitability for the HiSeq4000. However, 10X do not recommend this due to the fear of losing diversity in the library.
2. Run the library again on the HiSeq4000 with adjusted loading to improve overall yield. The likelihood here is that the read2 issue will continue.
3. Run the library on the NextSeq500. This is an unknown but it is believed this could accommodate the size of the library better than the HiSeq4000. The data yield would be lower.
4. Just use the data as is - the sequencing quality is still quite good - maybe consider trimming, but how much should be trimmed?
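
On the "how much should be trimmed?" part of the last option, one rough way to get a number is to recompute the per-cycle mean quality that FastQC plots and find the first cycle where it drops below a chosen threshold. The sketch below is only an illustration: it assumes uncompressed 4-line FASTQ records with Phred+33 encoding, and the function names and the Q20 cutoff are my own choices, not a recommendation from anyone in this thread.

```python
from typing import Iterable, Iterator, List


def read_quality_lines(fastq_path: str) -> Iterator[str]:
    """Yield the quality line of each record (assumes 4 lines per record)."""
    with open(fastq_path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 3:
                yield line.rstrip("\n")


def mean_quality_per_position(quality_strings: Iterable[str],
                              offset: int = 33) -> List[float]:
    """Mean Phred score at each read position (offset 33 is an assumption)."""
    totals: List[int] = []
    counts: List[int] = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def suggest_trim_length(mean_quals: List[float],
                        min_mean_q: float = 20.0) -> int:
    """Number of leading cycles to keep: index of the first cycle whose
    mean quality is below min_mean_q, or the full length if none is."""
    for i, q in enumerate(mean_quals):
        if q < min_mean_q:
            return i
    return len(mean_quals)


# Toy example with two made-up quality strings ('I' = Q40, '5' = Q20, '#' = Q2).
toy_means = mean_quality_per_position(["IIII5", "III5#"])
print(toy_means, suggest_trim_length(toy_means))
```

For real data you would pass `read_quality_lines("sample_R2.fastq")` (a hypothetical file name) instead of the toy strings. This is a blunt fixed-length cut; dedicated trimmers work per read, and many variant callers already take base qualities into account, so light or no trimming may be enough.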

Appreciate any input from experienced eyes.

EDIT: I'm expecting around 20X coverage (150 bp read length, paired end, 250M reads per sample (125M per fq), 3 Gb genome)
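
As a quick sanity check on the expected depth, mean coverage can be estimated as total sequenced bases divided by genome size. The sketch below just does that arithmetic with the figures quoted in the edit; the function name is illustrative, and it ignores duplicates, unmapped reads and any trimming, so the real usable depth would be somewhat lower.

```python
def expected_coverage(total_reads: int, read_length_bp: int,
                      genome_size_bp: float) -> float:
    """Mean depth = (number of reads * read length) / genome size."""
    return total_reads * read_length_bp / genome_size_bp


# 250M reads in total (125M per FASTQ file), 150 bp each, 3 Gb genome.
as_reads = expected_coverage(250_000_000, 150, 3e9)

# Alternative reading: 250M read pairs, i.e. 500M reads in total.
as_pairs = expected_coverage(500_000_000, 150, 3e9)

print(f"250M reads: {as_reads:.1f}X, 250M pairs: {as_pairs:.1f}X")
```

Taken literally, 250M reads of 150 bp gives about 12.5X rather than 20X, so it may be worth confirming whether the 250M figure counts reads or read pairs.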

Thanks, Kenneth.

WGS HISEQ4000 QC • 2.1k views

ADD COMMENT • link 6.2 years ago by BioinfGuru ★ 2.0k

3

The quality plots look as expected to me. I would be tempted to go ahead with the data as it is.

ADD REPLY • link 6.2 years ago by lieven.sterck 15k

0

Exactly what I was thinking - it could be better, but it really isn't that bad.

ADD REPLY • link 6.2 years ago by BioinfGuru ★ 2.0k

3

Before thinking of spending more money and effort on sequencing, you should always go ahead and analyze the data you have in hand. As others have said, this is not looking too shabby.

ADD REPLY • link 6.2 years ago by GenoMax 145k

1

Yes, I've started the pipeline for the data. I should have the BAMs tomorrow and the VCFs on Friday. Thank you.

ADD REPLY • link 6.2 years ago by BioinfGuru ★ 2.0k

1

How to add images to a Biostars post

ADD REPLY • link 6.2 years ago by WouterDeCoster 47k

0

Edited ... thank you

ADD REPLY • link 6.2 years ago by BioinfGuru ★ 2.0k

1

Sequencing depth will be another important consideration for the analyses you mentioned (especially SNP calling). If you have a lot of depth, then I think the quality looks fine.

ADD REPLY • link 6.2 years ago by goodez ▴ 640