Question: What are the ".bax.h5" files generated by PacBio long-read sequencing?
4.0 years ago by
France / Toulouse / GeT-Plage
Rox1.2k wrote:

Hi everyone !

I'm new to PacBio long-read sequencing, and I've read a lot about what exactly the raw files produced by this type of sequencer contain. I understand that the two different types (.bas.h5 and .bax.h5) refer to what each file contains (sequence, quality values, information about the chemistry used...).

But as a beginner, I still don't understand how to convert these .bax.h5 files into a subreads.fastq file. I also don't know exactly what to give to an assembly pipeline (I'm using Falcon): should I give it a fastq file or a .bax.h5 file?

I have the same problem with the polishing step in Quiver, which requires the quality information contained in the original files.

I really need some explanation about this, so any advice would be appreciated!



sequencing assembly • 3.3k views
4.0 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

These are HDF5 files: you can extract them to FASTQ using (googling...) see
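
To illustrate what such an extraction boils down to, here is a minimal Python sketch. It assumes the base calls, the per-base Phred quality values and the subread (insert) coordinates have already been read out of the HDF5 container. The function names, the example values and the movie/hole/start_end read naming are illustrative assumptions, not the output of any particular PacBio tool.

```python
def phred_to_ascii(quals, offset=33):
    """Encode integer Phred scores as a FASTQ quality string (Phred+33)."""
    # Cap at 93 so every character stays within printable ASCII (up to '~').
    return "".join(chr(min(int(q), 93) + offset) for q in quals)


def subread_records(movie, hole, bases, quals, regions):
    """Yield one FASTQ record per (start, end) region of a raw read.

    `regions` is a hypothetical list of half-open insert coordinates,
    i.e. the stretches of the raw read that lie between adapters.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    for start, end in regions:
        name = f"{movie}/{hole}/{start}_{end}"
        seq = bases[start:end]
        qual = phred_to_ascii(quals[start:end])
        # A FASTQ record is four lines: header, sequence, '+', qualities.
        yield f"@{name}\n{seq}\n+\n{qual}\n"


# Made-up example: one raw read with two insert regions.
example_bases = "ACGTACGTAC"
example_quals = [0, 10, 20, 30, 40, 5, 5, 5, 5, 5]
example_regions = [(0, 4), (6, 10)]

for record in subread_records("movie", 42, example_bases,
                              example_quals, example_regions):
    print(record, end="")
```

In practice a dedicated converter should do this job, since it also handles read filtering and the exact layout of the HDF5 file. The sketch only shows why a subreads.fastq can carry the sequence and base qualities but not the additional per-base information stored in the original files.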


Thanks for your answer! I was looking for such a tool but didn't find it... Do you also know which file should be used for genome assembly? The HDF5 files or the fastq file?

written 4.0 years ago by Rox1.2k

Multiple options. Current recommendation seems to be canu (I think you have plenty of coverage if I remember other threads you have posted).

modified 4.0 years ago • written 4.0 years ago by genomax85k