I am involved in the genome project of an insect species. The genome size of our species is estimated to be 500 Mb, and I have 100x Illumina short-read data and 50x PacBio Sequel long-read data.
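For reference, here is a rough sketch of the raw data volume these numbers imply, assuming coverage is simply total sequenced bases divided by the estimated genome size. The function name `expected_yield_bp` is just an illustrative helper, not part of any tool.

```python
# Rough data-volume estimate: total bases = genome size x coverage.
# Assumes the 500 Mb genome size estimate is accurate.
GENOME_SIZE_BP = 500_000_000  # estimated genome size (500 Mb)


def expected_yield_bp(genome_size_bp: int, coverage: float) -> int:
    """Return the total number of sequenced bases implied by a coverage depth."""
    return int(genome_size_bp * coverage)


illumina_bp = expected_yield_bp(GENOME_SIZE_BP, 100)  # 100x Illumina
pacbio_bp = expected_yield_bp(GENOME_SIZE_BP, 50)     # 50x PacBio Sequel

print(f"Illumina: {illumina_bp / 1e9:.0f} Gb")
print(f"PacBio:   {pacbio_bp / 1e9:.0f} Gb")
```

So the dataset is roughly 50 Gb of short reads and 25 Gb of long reads.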
I have two questions:
- I plan to take a hybrid assembly strategy, and I thought the ALPACA pipeline (https://github.com/VicugnaPacos/ALPACA) would suit our situation.
However, I realized that ALPACA uses ALLPATHS-LG internally, and we don’t have the fragment library that ALLPATHS-LG requires.
Is there a better alternative pipeline?
- I am completely new to PacBio data. Can I use the subreads.bam files directly for assembly, or do I need to run quality control steps first?
Any comments would be appreciated.