Question: Is the sequence quality good enough?
Asked 14 months ago by YaGalbi (Biocomputing, MRC Harwell Institute, Oxford, UK)

Hi all,

Please take a look at these sequence quality histograms from FastQC.

Sample 1: [FastQC sequence quality plot for Sample 1]

Sample 2: [FastQC sequence quality plot for Sample 2]

This is WGS data sequenced on an Illumina HiSeq 4000. We intend to call SNPs and indels, and possibly structural variants. In the future we may even use the data set for imputation.

I have 4 options. I'm not really experienced enough to make the call, so I'd like some informed opinions:

  1. Perform another size selection step to narrow the spread in the library pool so the HiSeq 4000 can accommodate it without read 2 quality dropping as it did in the first run. We have QC'ed the library following a second round of sizing and it does look much better in terms of suitability for the HiSeq 4000. However, 10X do not recommend this for fear of losing diversity in the library.
  2. Run the library again on the HiSeq 4000 with adjusted loading to improve overall yield. The read 2 issue is likely to continue.
  3. Run the library on the NextSeq 500. This is an unknown, but it is believed this could accommodate the size of the library better than the HiSeq 4000. The data yield would be lower.
  4. Just use the data as is. The sequencing quality is still quite good. Maybe consider trimming, but how much should be trimmed?
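
On the trimming question in option 4, one common approach is to trim by base quality rather than by a fixed length, so that only the low-quality tail of each read is removed. The sketch below is a minimal illustration of that idea, not the method of any particular trimming tool. The function name `trim_trailing`, the Q20 threshold and the example read are all assumptions made for illustration; Phred+33 encoding is assumed.

```python
from typing import Tuple


def trim_trailing(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Drop bases from the 3' end while their Phred score is below min_q.

    min_q=20 is an illustrative threshold, not a recommendation.
    offset=33 assumes Phred+33 quality encoding.
    """
    end = len(seq)
    # Walk backwards from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical read: 'I' encodes Q40, '#' encodes Q2, '!' encodes Q0 in Phred+33.
seq = "ACGTACGTAC"
qual = "IIIIIIII#!"
trimmed_seq, trimmed_qual = trim_trailing(seq, qual)
print(trimmed_seq, trimmed_qual)
```

With this kind of rule the amount trimmed varies per read, so reads whose quality holds up to the end are left untouched.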

I'd appreciate any input from experienced eyes.

EDIT: I'm expecting around 20X coverage (150 bp read length, paired-end, 250M reads per sample (125M per FASTQ file), 3 Gb genome).
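
As a rough check of the figures in the EDIT above, mean coverage can be estimated as total sequenced bases divided by genome size. The sketch below is only that arithmetic; the function name `expected_coverage` is illustrative, and it ignores duplicates, unmapped reads and any trimming, all of which lower the usable depth.

```python
def expected_coverage(n_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Mean depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


READ_LENGTH = 150   # bp, from the post
GENOME_SIZE = 3e9   # bp, from the post

# Interpretation 1: 250M reads in total (125M in each FASTQ file).
as_reads = expected_coverage(250_000_000, READ_LENGTH, GENOME_SIZE)

# Interpretation 2 (assumption): 250M read pairs, i.e. 500M reads in total.
as_pairs = expected_coverage(2 * 250_000_000, READ_LENGTH, GENOME_SIZE)

print(f"250M reads total: {as_reads:.1f}X")
print(f"250M read pairs:  {as_pairs:.1f}X")
```

Taken at face value, 250M reads in total gives about 12.5X, while 250M read pairs would give about 25X, so it may be worth confirming which count the 250M refers to.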

Thanks, Kenneth.

Tags: hiseq4000, qc, wgs

The quality plots look as expected to me. I would be tempted to go ahead with the data as it is.

written 14 months ago by lieven.sterck

Exactly what I was thinking. It could be better, but it really isn't that bad.

written 14 months ago by YaGalbi

Before thinking of spending more money or effort on sequencing, you should always go ahead and analyze the data you have in hand. As others have said, this is not looking too shabby.

modified 14 months ago • written 14 months ago by genomax

Yes, I've started the pipeline for the data. I should have the BAMs tomorrow and the VCFs on Friday. Thank you.

written 14 months ago by YaGalbi

How to add images to a Biostars post

written 14 months ago by WouterDeCoster

Edited ... thank you

written 14 months ago by YaGalbi

Sequencing depth will be another important consideration for the things you mentioned (especially SNP calling). If you have a lot of depth, then I think the quality looks fine.

written 14 months ago by goodez