Question: Is the sequence quality good enough?
6 months ago by YaGalbi (1.4k), Biocomputing, MRC Harwell Institute, Oxford, UK, wrote:

Hi all,

Please take a look at these sequence quality histograms from FastQC.

Sample 1: [image: FastQC sequence quality plot, Sample 1]

Sample 2: [image: FastQC sequence quality plot, Sample 2]

This is WGS data sequenced on an Illumina HiSeq4000. We intend to call SNPs and indels, and possibly structural variants. In the future we may even use the data set for imputation.

I have 4 options. I'm not really experienced enough to make the call, so I'd like some informed opinions:

  1. Perform another size selection step to narrow the spread in the library pool so the HiSeq4000 can accommodate without read2 quality dropping as it did in the first run. We have QC’ed the library following a second round of sizing and it does look much better in terms of suitability for the HiSeq4000. However, 10X do not recommend this due to the fear of losing diversity in the library.
  2. Run the library again on the HiSeq4000 with adjusted loading to improve overall yield. The likelihood here is that the read2 issue will continue.
  3. Run the library on the NextSeq500. This is an unknown but it is believed this could accommodate the size of the library better than the HiSeq4000. The data yield would be lower.
  4. Just use the data as is - the sequencing quality is still quite good - maybe consider trimming, but how much should be trimmed?
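
On the trimming question in option 4, here is a minimal sketch of what trailing quality trimming does to a single read. It assumes Phred+33 quality encoding, and the threshold of 20 is purely illustrative, not a recommendation; `trim_trailing` is a hypothetical helper written for this example. In practice a dedicated trimmer would be used, since it also has to keep read pairs in sync.

```python
from typing import Tuple


def trim_trailing(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove bases from the 3' end while their Phred score is below min_q.

    Assumes Phred+33 encoding (offset=33); min_q=20 is an illustrative default.
    """
    end = len(seq)
    # Walk backwards from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40, '#' encodes Q2, '!' encodes Q0 under Phred+33.
trimmed_seq, trimmed_qual = trim_trailing("ACGTACGT", "IIIIII#!")
print(trimmed_seq, trimmed_qual)
```

Only the low-quality tail is removed; a low-quality base in the middle of a read is left alone by this approach.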

Appreciate any input from experienced eyes.

EDIT: I'm expecting around 20X coverage (150 bp read length, paired end, 250M reads per sample (125M per FASTQ file), 3 Gb genome).
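
The expected coverage can be checked with the usual estimate: number of reads times read length, divided by genome size. The sketch below assumes that "250M reads" counts individual reads (125M in each of the two FASTQ files) and that the genome is 3 Gb. Under that reading the figures give about 12.5X; if 250M instead counted read pairs, the same formula gives 25X. `expected_coverage` is a hypothetical helper for illustration.

```python
def expected_coverage(total_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Mean depth estimate: total sequenced bases divided by genome size."""
    return total_reads * read_length_bp / genome_size_bp


total_reads = 250_000_000      # assumption: individual reads, R1 + R2 combined
read_length_bp = 150
genome_size_bp = 3_000_000_000

coverage = expected_coverage(total_reads, read_length_bp, genome_size_bp)
print(f"{coverage:.1f}X")

# Alternative reading: 250M read pairs, i.e. 500M individual reads.
coverage_if_pairs = expected_coverage(2 * total_reads, read_length_bp, genome_size_bp)
print(f"{coverage_if_pairs:.1f}X")
```

This is a raw estimate; duplicates, unmapped reads and any trimming will lower the usable depth.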

Thanks, Kenneth.

Tags: hiseq4000, qc, wgs

The quality plots look as expected to me. I would be tempted to go ahead with the data as it is.

written 6 months ago by lieven.sterck (3.3k)

Exactly what I was thinking - it could be better, but it really isn't that bad.

written 6 months ago by YaGalbi (1.4k)

Before spending more money and effort on sequencing, you should always go ahead and analyze the data you have in hand. As others have said, this is not looking too shabby.

modified 6 months ago • written 6 months ago by genomax (59k)

Yes, I've started the pipeline for the data. I should have the BAMs tomorrow and the VCFs on Friday. Thank you.

written 6 months ago by YaGalbi (1.4k)

How to add images to a Biostars post

written 6 months ago by WouterDeCoster (35k)

Edited ... thank you

written 6 months ago by YaGalbi (1.4k)

Sequencing depth will be another important consideration for the things you mentioned (especially SNP calling). If you have a lot of depth, then I think the quality looks fine.

written 6 months ago by goodez (420)