Question

Beginner help with sequencing data

5 months ago

camhabib ▴ 10

Sorry for the extremely basic nature of this question, but I'm a bit lost here despite spending a couple of days reading and troubleshooting.

I recently submitted a bacterial strain for WGS via Nanopore sequencing. What I got back was a single FASTA file containing all my reads, and a folder with contigs (4 total, 2 of them >1M bases, 2 <10k bases). When I try to align the contigs to my reference genome using something like Geneious or SnapGene, I get either an error saying that they could not be aligned, or very poor coverage of the genome (<10%). When I search the contigs for <30 bp snippets of reference genome sequence, and vice versa, I find that the genome is almost completely covered by the contigs. I've found a number of tools, but most require different file types (BAM being one) or aren't designed to take contigs, just raw reads.

Essentially, what I'm trying to achieve is a list of base changes in my strain vs the reference genome, and if they correspond to an amino acid change. Any help in working towards this goal would be very much appreciated.
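To make the second half of that goal concrete, here is a minimal Python sketch of deciding whether a single base change alters the encoded amino acid. It assumes the variant position is already known relative to the start of a forward-strand coding sequence, uses the standard codon assignments, and ignores indels and reverse-strand genes. The function name `classify_substitution` and the example sequence are hypothetical, made up purely for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_cds, pos, alt_base):
    """Classify a single-base substitution in a forward-strand CDS.

    ref_cds  -- reference coding sequence, starting at the first codon
    pos      -- 0-based position of the changed base within ref_cds
    alt_base -- the base observed in the sequenced strain
    """
    offset = pos % 3                      # position of the change within its codon
    codon_start = pos - offset
    ref_codon = ref_cds[codon_start:codon_start + 3].upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    effect = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_codon, alt_codon, ref_aa, alt_aa, effect


# Hypothetical 9-base CDS encoding Met-Ala-Lys.
cds = "ATGGCTAAA"
print(classify_substitution(cds, 5, "C"))  # GCT -> GCC, both alanine
print(classify_substitution(cds, 6, "G"))  # AAA -> GAA, lysine to glutamate
```

In practice the positions and alternate bases would come from a variant caller's output and the gene coordinates from the reference annotation; dedicated annotation tools handle strand, indels, and overlapping genes, which this sketch does not.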

analysis snv nanopore sequencing • 387 views

updated 5 months ago by GenoMax 141k • written 5 months ago by camhabib ▴ 10

score 1 · Answer 1 · 2023-11-01

"What I got back was a single FASTA file containing all my reads"

You have this tagged as nanopore. Did they not give you the original FASTQ format reads? It would be easier to do the SNP calling by starting with the raw FASTQ data to generate the alignment BAM files (you should be able to do that in Geneious). If the assembly you received is good, you may be able to align the contigs to the reference with the same aligner you would use for the reads (minimap2), but this will need to be done on the command line (if you are familiar with UNIX/Linux).
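As a rough sketch of what the minimap2 step could look like, the Python snippet below builds the command line and, if minimap2 is installed, writes the alignment to a SAM file. The file names are hypothetical placeholders. The `asm5` preset is an assumption that the assembly is very close to the reference; for raw Nanopore reads the `map-ont` preset would be the usual choice instead.

```python
import shutil
import subprocess


def minimap2_command(reference, query, preset="asm5"):
    """Return the minimap2 argument list for SAM output (-a) with a preset (-x)."""
    return ["minimap2", "-a", "-x", preset, reference, query]


def align(reference, query, out_sam, preset="asm5"):
    """Run minimap2 and write its SAM output to out_sam."""
    if shutil.which("minimap2") is None:
        raise RuntimeError("minimap2 was not found on PATH")
    with open(out_sam, "w") as out:
        subprocess.run(
            minimap2_command(reference, query, preset),
            stdout=out,
            check=True,
        )


# Print the commands that would be run (file names are placeholders).
print(" ".join(minimap2_command("reference.fasta", "contigs.fasta")))
print(" ".join(minimap2_command("reference.fasta", "reads.fastq", preset="map-ont")))
```

The resulting SAM file can then be sorted and converted to BAM (for example with samtools) so that it can be loaded into tools that expect BAM input.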