Question: Can 167 contigs in my genome be stitched into one FASTA sequence?
17 months ago by
Is it fine if I stitch all 167 contigs from my genome into a single FASTA file?

Are there any tools that can be used for this purpose?

I have a bacterial strain with 167 contigs and I want to stitch them together under a single FASTA header.

Did you assemble this genome yourself? Can you give more details about the sequencing and assembly? In my experience with bacterial genomes, good coverage (roughly 20-30x up to 100x of Illumina MiSeq reads) should give you a good assembly without so many contigs. One common cause of fragmented assemblies is contamination: we believed we were sequencing a single, isolated strain, when in fact there was a second strain in the culture. You can use tools such as blobtools to investigate your assembly.
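For reference, the coverage figures above are just total sequenced bases divided by genome size. A minimal sketch of that arithmetic in Python; the function name and the example numbers are made up for illustration and are not taken from your data:

```python
def estimated_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average sequencing depth = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Hypothetical run: 1,000,000 reads of 250 bp on a 5 Mb bacterial genome.
print(estimated_coverage(1_000_000, 250, 5_000_000))  # 50.0
```

This assumes every read has the same length and ignores trimming and unmapped reads, so treat the result as a rough upper bound.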

The genome I was referring to was downloaded from NCBI. I asked about stitching because some tools, like PHASTER, require a complete genome (not contigs or scaffolds) to detect prophages and annotate their positions in a circular genome.

17 months ago by
Bergen, Norway
It's not illegal, but that was possibly not what you meant ;) For most other use-cases it is definitely not fine. I'll just give you some hints:

  • You can no longer tell which sequence belongs to which contig
  • You don't know the natural order of contigs
  • You are creating artificial chimeric sequences at the transitions
  • Sequences can come from different replicons
  • ....

  • You can try to scaffold the contigs if there are additional long-range sequencing libraries (fosmid, PacBio, ...); SSPACE is one program for this

  • You can keep the contigs as they are in a single multi-fasta file
  • If you still think you need to do this, you need to re-think the reason. There is probably a problem with your approach.
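To make the first and third points concrete, here is a rough Python sketch of what "stitching" would have to do at minimum so the contig boundaries are not lost: put a run of N between contigs and keep a coordinate table. The function names, the spacer length of 100 and the 1-based coordinates are my own assumptions for illustration, not a standard. The contigs are joined in file order, which is arbitrary and not the biological order, so the result is still not a complete genome.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data found before the first header")
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def join_with_spacer(
    records: Iterable[Tuple[str, str]], spacer_len: int = 100
) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Join contigs in input order with runs of N between them.

    Returns the joined sequence and a list of (contig, start, end)
    with 1-based inclusive coordinates in the joined sequence.
    """
    parts: List[str] = []
    coords: List[Tuple[str, int, int]] = []
    pos = 0  # number of bases written so far
    for i, (name, seq) in enumerate(records):
        if i > 0:
            parts.append("N" * spacer_len)
            pos += spacer_len
        coords.append((name, pos + 1, pos + len(seq)))
        parts.append(seq)
        pos += len(seq)
    return "".join(parts), coords


if __name__ == "__main__":
    # Tiny made-up example, not real data.
    demo = [">contig_1", "ACGT", ">contig_2", "GG"]
    joined, table = join_with_spacer(read_fasta(demo), spacer_len=3)
    print(joined)
    for name, start, end in table:
        print(name, start, end)
```

Any hit that overlaps a spacer, or sits right next to one, is an artifact of the join, and the table cannot tell you which contigs belong to which replicon.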
