Hello everyone,
I’m a master’s student working on a chloroplast genome assembly project. The sequencing data were generated about 4–5 years ago. Unfortunately, the company never provided us with the Illumina raw data — I only have PacBio reads available now.
I’d like to ask:
- Is it feasible to assemble a complete chloroplast genome using only PacBio data?
- Will the absence of Illumina reads significantly affect assembly quality or downstream analyses (such as gene annotation and comparative genomics)?
- Would such a project still be considered substantial enough for a master’s thesis?
For context:
- I’m relatively new to bioinformatics.
- My lab mainly focuses on classical taxonomy, so I don’t have many local peers familiar with genome assembly.
- The dataset is from a plant species (chloroplast genome expected ~150 kb).
Any advice on strategy, software suggestions, or similar experiences would be greatly appreciated.
Thank you very much in advance!
Can you find related chloroplast genomes (there must be some in NCBI) and try aligning the data you have against them?
It may be possible to do the assembly with what you have. shelkmike may have some suggestions.
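In case it helps, here is a rough sketch of how that alignment step could be used to pull out the chloroplast-derived reads before assembling them. It assumes you align with minimap2 (my suggestion, not something you have to use) and write its default PAF output to a file. The file names, the function name `chloroplast_read_ids`, and the 0.5 cutoff are all made up for illustration, so adjust them to your data.

```python
# Assumed alignment command (run separately; file names are hypothetical):
#   minimap2 -x map-pb related_cp.fasta pacbio_reads.fastq > hits.paf
# Use -x map-hifi instead if the reads are PacBio HiFi.

def chloroplast_read_ids(paf_lines, min_query_fraction=0.5):
    """Return the IDs of reads whose best hit spans at least
    min_query_fraction of the read length.

    PAF is tab-separated; the first four columns are
    query name, query length, query start, query end.
    """
    best_fraction = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # a valid PAF record has 12 mandatory columns
            continue
        name = fields[0]
        query_len = int(fields[1])
        query_start = int(fields[2])
        query_end = int(fields[3])
        if query_len == 0:
            continue
        fraction = (query_end - query_start) / query_len
        # A read can hit the reference more than once (for example in the
        # inverted repeats), so keep the longest hit per read.
        best_fraction[name] = max(best_fraction.get(name, 0.0), fraction)
    return {
        name
        for name, fraction in best_fraction.items()
        if fraction >= min_query_fraction
    }


if __name__ == "__main__":
    import os

    # "hits.paf" is the hypothetical output of the minimap2 command above.
    if os.path.exists("hits.paf"):
        with open("hits.paf") as handle:
            ids = chloroplast_read_ids(handle)
        print(f"{len(ids)} reads look chloroplast-derived")
```

The resulting read IDs could then be used to subset the original read file, and that subset passed to a long-read assembler. The cutoff is a judgment call: a looser value keeps more divergent reads but also lets in more nuclear or mitochondrial sequence that happens to resemble chloroplast DNA.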