Which PacBio outputs to use for de novo assembly?
2.9 years ago
tbeignier • 0

Hello, I was given all the output files from whole-genome sequencing of a bacterium. There are a lot of them, and I don't understand which ones I should use for a de novo assembly:

• 3 bax.h5
• 1 bas.h5
• 1 mcd.h5

I was thinking of using SMRT Link for the analysis, but I don't know which of these files I should use. Thanks in advance.

PacBio Novo assembly Outputs assembly • 1.2k views

There are other options such as Flye (https://github.com/fenderglass/Flye) and Canu (https://github.com/marbl/canu) that may be better than SMRT Link, and they can use FASTQ/FASTA files.
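To make the Flye suggestion concrete, here is a minimal sketch that only builds the command line for raw PacBio reads and prints it. The input name `subreads.fastq`, the output directory `flye_out`, the genome size of `5m`, and the thread count are all assumptions for illustration; adjust them to your organism and machine.

```python
import shlex


def flye_command(reads, out_dir, genome_size="5m", threads=4):
    """Build a Flye argument list for raw (uncorrected) PacBio reads.

    All default values here are placeholders, not recommendations.
    """
    return [
        "flye",
        "--pacbio-raw", reads,          # raw PacBio subreads (FASTA or FASTQ)
        "--out-dir", out_dir,           # directory Flye writes results to
        "--genome-size", genome_size,   # rough estimate, e.g. 5m for many bacteria
        "--threads", str(threads),
    ]


# Hypothetical file and directory names, for illustration only.
cmd = flye_command("subreads.fastq", "flye_out")
print(shlex.join(cmd))
```

If the printed command looks right, it could be run from the shell directly, or from Python with `subprocess.run(cmd, check=True)`, provided Flye is installed and on the `PATH`.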

2.9 years ago

Most (if not all) long-read assemblers you will come across require the files called subreads. Depending on the tool, they need to be in .fasta or .fastq format.
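If a tool wants .fasta and you only have .fastq subreads, the conversion is simple. A minimal sketch using only the Python standard library, assuming the common four-lines-per-record FASTQ layout (no wrapped sequences); the read names in the demo are made up:

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy reads from a 4-line-per-record FASTQ stream to FASTA.

    Returns the number of reads written. Quality lines are dropped.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break                      # end of input
        if not header.strip():
            continue                   # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        sequence = fastq_handle.readline()
        fastq_handle.readline()        # '+' separator line, ignored
        fastq_handle.readline()        # quality line, ignored
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


# Tiny in-memory demo with hypothetical reads.
demo_fastq = "@read1\nACGT\n+\n!!!!\n@read2\nGGCC\n+\n!!!!\n"
demo_out = io.StringIO()
n_reads = fastq_to_fasta(io.StringIO(demo_fastq), demo_out)
print(n_reads)
print(demo_out.getvalue(), end="")
```

For real files, pass handles from `open("subreads.fastq")` and `open("subreads.fasta", "w")` instead of the `io.StringIO` objects.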

You mean the HGAP assembler of SMRT Link? That apparently needs BAM files as input (see also here). You should have those as well, by the way; they are part of the default output of the PacBio machines/protocol. However, you can easily convert the FASTQ to BAM.

Keep in mind, though, that this is merely a technical issue: at least for recent PacBio data, the quality values in the FASTQ files no longer carry any meaning, as they are no longer used in the context of PacBio.


Yes, it's HGAP. The thing is that the files the file manager asks for don't correspond to what I have (or I didn't understand something --').


Since you are not importing from a SMRT Link server, you will have to choose the local file system and upload all the files?


It only asks for Barcodes (Fasta or Zip) OR References (Fasta or Zip).


I no longer have access to a SMRT Link install, but looking at the manual (page 30), it looks like you will need to get the XML files from your provider, since your install is not linked to the instrument. This sounds like RS II data (not Sequel)?
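On the RS II versus Sequel question: RS II runs are delivered as `bax.h5`/`bas.h5` files, while Sequel runs are delivered as `subreads.bam`. A rough sketch of that rule of thumb, based on file suffixes only; the function name and the file names in the demo are hypothetical:

```python
def guess_pacbio_platform(filenames):
    """Guess the instrument family from file suffixes alone (rule of thumb)."""
    names = [name.lower() for name in filenames]
    if any(name.endswith((".bax.h5", ".bas.h5")) for name in names):
        return "RS II"      # HDF5-based output
    if any(name.endswith(".subreads.bam") for name in names):
        return "Sequel"     # BAM-based output
    return "unknown"


# Hypothetical names mirroring the file types listed in the question.
delivered = [
    "cell1.1.bax.h5",
    "cell1.2.bax.h5",
    "cell1.3.bax.h5",
    "cell1.bas.h5",
    "cell1.mcd.h5",
]
print(guess_pacbio_platform(delivered))
```

This does not inspect file contents, so treat the result as a hint to confirm with the sequencing provider.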

You may want to move ahead with Flye/Canu for now while you wait to get the right data from the sequencing provider.


I have no hands-on experience with the SMRT Link toolbox, but I would suggest going for the local file system option (instead of the SMRT Link server one).

the SMRT link toolbox is a huge box with many different types of analysis in