Question

Assembling Sequences Without The Fastq File

2

12.9 years ago

Saad Khan ▴ 440

I have been given a FASTA file of short reads obtained from some next-generation sequencing method; I have not been provided with the FASTQ file. What would be the best tool or approach for assembling these sequences, either de novo or against a reference? I was looking at the BioPerl HOWTO "Short-read assemblies with BWA", but it seems to require FASTQ files too. Please let me know the best way to get the best possible assembly.

assembly • 3.8k views

ADD COMMENT • link updated 12.9 years ago by Michael Schubert ★ 7.1k • written 12.9 years ago by Saad Khan ▴ 440

0

BWA maps reads to a reference, and you should be able to do that by just adding fake qualities, as Pierre suggests, possibly with quality decreasing towards the end of the reads. Many de novo assemblers (typically those using de Bruijn graph assembly) ignore quality anyway, but I haven't been able to get very good results from them.
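To illustrate the "decreasing quality towards the end" idea, here is a minimal Python sketch that builds a fake quality string for one read. The function name `ramp_quality_string`, the linear ramp, and the default scores (40 falling to 10) are assumptions made for illustration; nothing in this thread prescribes them. It assumes Phred+33 (Sanger) encoding.

```python
def ramp_quality_string(read_length, start_q=40, end_q=10, offset=33):
    """Return a fake quality string that falls linearly from start_q to end_q.

    Scores are Phred values; offset=33 assumes Sanger/Phred+33 encoding.
    All defaults are illustrative assumptions, not measured qualities.
    """
    if read_length <= 0:
        return ""
    if read_length == 1:
        return chr(start_q + offset)
    # Size of the drop between neighbouring bases.
    step = (start_q - end_q) / (read_length - 1)
    return "".join(
        chr(round(start_q - i * step) + offset) for i in range(read_length)
    )
```

The returned string has one character per base, so it can be used as the fourth line of a FASTQ record for a read of the same length.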

ADD REPLY • link 12.9 years ago by Ketil 4.1k

Ram · Answer 1 · 2011-05-31

3

12.9 years ago

Pierre Lindenbaum 161k

An idea: if you don't have the qualities of your reads(!), you could create a dummy FASTQ file with the same average quality for all bases and then use standard software to assemble the genome?
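A minimal sketch of the dummy-FASTQ idea in plain Python, with no external libraries. The function names `read_fasta` and `fasta_to_fastq`, the default score of 30, and the Phred+33 encoding are assumptions chosen for illustration; pick whatever constant your downstream tool accepts.

```python
def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence may be wrapped over several lines; collect the pieces.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, quality=30, offset=33):
    """Write every FASTA record as FASTQ with one constant fake quality.

    quality is a Phred score; offset=33 assumes Sanger/Phred+33 encoding.
    Returns the number of records written.
    """
    qchar = chr(quality + offset)
    count = 0
    for name, seq in read_fasta(fasta_handle):
        fastq_handle.write(f"@{name}\n{seq}\n+\n{qchar * len(seq)}\n")
        count += 1
    return count


# Example use (file names are placeholders):
# with open("reads.fasta") as src, open("reads.fastq", "w") as dst:
#     fasta_to_fastq(src, dst)
```

The qualities carry no real information, so any quality-aware step downstream (trimming, variant calling) should be treated with caution.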

ADD COMMENT • link updated 4.6 years ago by Ram 43k • written 12.9 years ago by Pierre Lindenbaum 161k

0

So one way I can see is to first make a QUAL file with a numerical value of 50 for each nucleotide, and then merge the sequence and quality files into FASTQ using this code: http://www.bioperl.org/wiki/Merging_separate_sequence_and_quality_files_to_FASTQ
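For the first step, a QUAL file is just FASTA-style headers followed by space-separated integer scores, one score per base. The sketch below formats one such record; the helper name `qual_record` and the wrapping at 20 scores per line are assumptions for illustration, and the score of 50 is simply the value proposed above.

```python
def qual_record(name, seq_length, score=50, per_line=20):
    """Format one QUAL record: a '>' header plus seq_length copies of score.

    Scores are written as space-separated integers, wrapped at per_line
    values per line (the wrap width is an arbitrary choice).
    """
    scores = [str(score)] * seq_length
    lines = [
        " ".join(scores[i:i + per_line])
        for i in range(0, seq_length, per_line)
    ]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

Calling this once per read, with the same name and length as the FASTA record, gives a QUAL file that lines up with the sequence file for merging. It may be worth checking that the chosen score is within the range your aligner or assembler accepts.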

ADD REPLY • link 12.9 years ago by Saad Khan ▴ 440

Ram · Answer 2 · 2011-05-31

1

12.9 years ago

Michael Schubert ★ 7.1k

Another idea: contact your supplier and ask for the fastq file.

If that's not an option, MIRA can do assemblies on fasta files. Velvet too, if I recall correctly.

ADD COMMENT • link 12.9 years ago by Michael Schubert ★ 7.1k

0

Btw, I'm aware that BWA aligns to reference genomes and that MIRA and Velvet are primarily de novo assemblers. But you haven't told us whether you plan to align reads to a reference, align contigs to a reference, or have a reference at all ;-)

ADD REPLY • link 12.9 years ago by Michael Schubert ★ 7.1k

0

I do plan to use a reference genome. What if I do what Pierre suggested: make a QUAL file with a numerical value of 50 for each nucleotide and then merge the sequence and quality files into FASTQ using the BioPerl code (possibly decreasing the quality at the end)? Which would be better: using MIRA, or using hypothetical qualities?

ADD REPLY • link 12.9 years ago by Saad Khan ▴ 440

0

Actually, I am assembling a baculovirus; many genomes from the family Baculoviridae are available at NCBI. Are there specific assemblers for prokaryotes or other small genomes? Currently I am trying my hand with Minimus, which is part of the AMOS package. Please also let me know what I can do with the assembled genome.

ADD REPLY • link 12.9 years ago by Saad Khan ▴ 440

0

http://en.wikipedia.org/wiki/Sequence_assembly#Available_assemblers

ADD REPLY • link updated 4.6 years ago by Ram 43k • written 12.9 years ago by Michael Schubert ★ 7.1k

0

Can you please provide me a link to a good tutorial on Velvet? There is one in "Current Protocols in Bioinformatics", but I don't have access to it. If you have it, please mail me at skm770@gmail.com