Conversion to FASTA and sequence analysis
5.4 years ago

Dear all, I am a beginner in sequencing analysis, so I apologize if my question is poorly phrased. I have sequenced some bacterial genomes and obtained FASTQ files. To process these files I used Illumina BaseSpace apps: Prokka, and BWA (to align the reads to the reference genome), which gave me a .bam file. From the .bam file I used samtools to obtain a FASTA file, but it contains multiple sequences (contigs). Can I generate a single sequence? I need to search the resulting sequence for a specific region to find SNPs or indels.

Apart from the question above, is this pipeline correct, or do I need to modify it?

I am a microbiologist, and until now I have only used GUI software to obtain this information, never this kind of program.

Thank you very much.

sequencing • 733 views

There are a few misunderstandings in your post.

Prokka is not part of alignment or assembly; it only annotates the finished genome. Have you actually run an assembly step (using SPAdes, Velvet, SOAP, etc.)? Do any of these names seem familiar?

The short answer is that with Illumina data only (like you have), it's highly unlikely you'll get a single sequence for any but the shortest, simplest genomes. You would need to do a hybrid assembly with a long-read technology to 'close' or finish the genome. Without that, or manual primer walking of the gaps, a multi-FASTA (or multi-GenBank, if annotated) is as good as it gets.

That might be perfectly fine for finding some of your mutations of interest though.
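To illustrate why a multi-FASTA can still be enough, here is a minimal Python sketch that reads the contigs and reports which one contains a short sequence of interest, checking both strands. It is only a rough starting point, not a variant-calling method: the function names, the file name `contigs.fasta`, and the query sequence are all made up for the example. Exact matching will also miss a region that itself carries a SNP or indel, so in practice you would search with a conserved flanking sequence.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into {contig_id: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the header text up to the first whitespace as the contig ID.
            fields = line[1:].split()
            name = fields[0] if fields else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string made of A, C, G, T and N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_region(contigs: Dict[str, str], query: str) -> List[Tuple[str, int, str]]:
    """Return (contig_id, 0-based start, strand) for each exact match of query."""
    query = query.upper()
    patterns = [("+", query)]
    rc = reverse_complement(query)
    if rc != query:  # avoid reporting palindromic queries twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        for name, seq in contigs.items():
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, start, strand))
                start = seq.find(pattern, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical file name and query; replace with your own.
    with open("contigs.fasta") as handle:
        contigs = parse_fasta(handle)
    print(len(contigs), "contigs")
    for hit in find_region(contigs, "TTGACC"):
        print(hit)
```

If the region turns up inside a single contig, the fact that the assembly is fragmented elsewhere does not matter for that particular locus.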
