Question: FASTQC and PacBio reads
3.5 years ago by
United States
I'm working with a group that did some PacBio sequencing to aid in the assembly of some bacterial genomes.


The first priority for this group is to assess the quality of the read data. One of the formats the read data were returned in was FASTQ. We already have a pipeline for FastQC, so I first tried running the read files through it. However, FastQC keeps crashing with a Java out-of-memory (heap space) error. I have two questions:


One, how do I increase the java memory allocation for fastqc?


Two, even though the PacBio reads are in FASTQ format, should I be using FastQC at all? Is there a better program? This is my first time handling PacBio data, so I'm sorry if these are very basic questions.

3.5 years ago by
Walnut Creek, USA
I don't think FastQC is appropriate for PacBio data. In my opinion, none of the charts would be useful even if it did run.

Hi Brian,

So, could you please recommend a pipeline or package for assessing the quality of PacBio reads? I also started with FastQC, and indeed it didn't provide much useful information.



The SMRT Pipe utilities should be of use for this, though I don't use them, so I'm not sure which ones are appropriate. I recommend you look there first. If you want an empirical analysis of the error rates, I suggest you map the filtered subreads against a reference or assembly using the BBMap package. For example:

in=reads.fastq ref=reference.fasta maxlen=2000bp minlen=100bp mhist=mhist.txt idhist=idhist.txt indelhist=indelhist.txt qhist=qhist.txt qahist=qahist.txt bhist=bhist.txt covhist=covhist.txt

In addition to the histograms, the stderr output from the process gives a summary of overall read quality as insertion, deletion, substitution, and match rates.
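If you only want a quick first look at the reads before mapping anything, a small streaming script avoids the memory problem entirely. The following is a minimal sketch, not part of SMRT Pipe or BBMap: the function names are made up for illustration, and it assumes a plain 4-line-per-record FASTQ (optionally gzipped) with Phred+33 quality encoding. For PacBio data the length figures are probably more informative than the quality values.

```python
import gzip
import sys


def read_lengths_and_quals(path):
    """Yield (length, mean_phred) per record of a 4-line FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if header == "":          # end of file
                break
            if header.strip() == "":  # tolerate stray blank lines
                continue
            seq = handle.readline().strip()
            handle.readline()         # '+' separator line, ignored
            qual = handle.readline().strip()
            # Assumption: Phred+33 encoding (ASCII code minus 33).
            mean_q = sum(ord(c) - 33 for c in qual) / len(qual) if qual else 0.0
            yield len(seq), mean_q


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(path):
    """Collect basic per-file statistics; only the read lengths are kept in memory."""
    lengths = []
    q_sum = 0.0
    for length, mean_q in read_lengths_and_quals(path):
        lengths.append(length)
        q_sum += mean_q
    n = len(lengths)
    return {
        "reads": n,
        "bases": sum(lengths),
        "mean_length": sum(lengths) / n if n else 0.0,
        "n50": n50(lengths),
        # Average of the per-read mean Phred values, not a true error rate.
        "mean_read_quality": q_sum / n if n else 0.0,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for key, value in summarize(sys.argv[1]).items():
        print(f"{key}\t{value}")
```

Saved under a name of your choice and run with a FASTQ path as its only argument, it prints one tab-separated statistic per line. It does not measure error rates; for that you still need the mapping-based approach above.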

3.5 years ago by
-Xmx is the JVM option that sets the maximum heap size, i.e. the memory available to Java programs (for example, -Xmx4g for 4 GB).
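To show where the flag goes, here is a minimal sketch of building a Java command line from a pipeline script. The helper name, `tool.jar`, and `reads.fastq` are placeholders, not real FastQC paths. The one firm rule is that JVM options such as -Xmx must come before -jar, because anything after the jar file is passed to the program itself.

```python
import shlex


def java_command(jar_path, max_heap, program_args=()):
    """Return an argument list that runs a jar with a maximum heap size.

    max_heap uses the JVM's own syntax, e.g. "4g" or "2048m".
    """
    # JVM options go before -jar; arguments after the jar belong to the program.
    return ["java", f"-Xmx{max_heap}", "-jar", jar_path, *program_args]


# Hypothetical jar and input names, for illustration only.
cmd = java_command("tool.jar", "4g", ["reads.fastq"])
print(shlex.join(cmd))  # prints: java -Xmx4g -jar tool.jar reads.fastq
```

Note that FastQC is normally started through its own wrapper script rather than by calling java directly. As far as I know that wrapper sizes the heap from the number of threads requested, so check `fastqc --help` for your version before hard-coding anything.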

3.5 years ago by
United States
Please start from the bax.h5 files instead of the FASTQ files. You can find the documents here and here. Also look into the HGAP assembly method/protocol.

Hope that helps.

