Assembling Sequences Without The Fastq File
12.9 years ago
Saad Khan ▴ 440

I have been given a FASTA file of short reads obtained from a next-generation sequencing run, but I have not been provided with the FASTQ file. What would be the best tool or approach for assembling these sequences, either de novo or against a reference? I was looking at the BioPerl HOWTO "Short-read assemblies with BWA", but it seems to require FASTQ files too. Please let me know the best way to get the best possible assembly.

assembly • 3.8k views

BWA maps reads to a reference, and you should be able to do that by adding fake qualities, as Pierre suggests, possibly with quality decreasing towards the end of the reads. Many de novo assemblers ignore quality anyway (typically those using de Bruijn graph assembly), but I haven't been able to get very good results from them.
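To make the "decreasing quality towards the end" idea concrete, here is a minimal Python sketch that builds a fake quality string falling linearly along a read. The function name `declining_quality` and the start and end scores of 40 and 20 are illustrative assumptions, not values from any tool; the only convention relied on is the standard Phred+33 (Sanger) FASTQ encoding.

```python
def declining_quality(length, start=40, end=20, offset=33):
    """Return a fake Phred quality string that falls linearly from
    `start` at the first base to `end` at the last base.

    `start`, `end` and the function name are hypothetical choices for
    illustration; `offset=33` is the Sanger/Phred+33 FASTQ encoding.
    """
    if length <= 0:
        return ""
    if length == 1:
        return chr(start + offset)
    # Per-base drop in quality so the last base lands exactly on `end`.
    step = (start - end) / (length - 1)
    return "".join(chr(round(start - i * step) + offset) for i in range(length))


if __name__ == "__main__":
    # One made-up 5-base read with qualities 40, 35, 30, 25, 20.
    print(declining_quality(5))
```

Whether a declining profile actually helps depends on how much the aligner uses base qualities, so treat the slope as a guess rather than a calibrated error model.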

12.9 years ago

An idea: if you don't have the qualities of your reads(!), you could create a dummy FASTQ file with the same average quality for every base and then use the standard software to assemble the genome.


So one way I can see is to first make a QUAL file with a numerical value of 50 for each nucleotide, and then merge the sequence and quality files into FASTQ using this code: http://www.bioperl.org/wiki/Merging_separate_sequence_and_quality_files_to_FASTQ
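As an alternative to building a separate QUAL file and merging, the same result can be sketched in a few lines of plain Python that go from FASTA to FASTQ directly. This is not the BioPerl script linked above; the helper names `read_fasta` and `fasta_to_fastq` are hypothetical, and the constant score of 50 is simply the value proposed in this thread, written in Phred+33 (Sanger) encoding, where it becomes the character `S`.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle.
    Sequences wrapped over several lines are joined into one string."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, phred=50, offset=33):
    """Write every FASTA record as a four-line FASTQ record with one
    constant, made-up quality score. Returns the number of records.

    phred=50 is the value suggested in the thread (an assumption, not a
    measured quality); offset=33 is the Sanger/Phred+33 encoding."""
    qual_char = chr(phred + offset)
    count = 0
    for name, seq in read_fasta(fasta_handle):
        fastq_handle.write(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo with two made-up reads.
    demo = io.StringIO(">read1\nACGT\nAC\n>read2\nGGCC\n")
    out = io.StringIO()
    fasta_to_fastq(demo, out)
    print(out.getvalue(), end="")
```

For real files, open the input and output with `open("reads.fa")` and `open("reads.fq", "w")` (file names assumed) and pass the handles in. A score of 50 is valid in Sanger FASTQ but higher than most instruments report, so a lower constant such as 30 or 40 may be a safer choice if a downstream tool rejects it.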

12.9 years ago

Another idea: contact your supplier and ask for the fastq file.

If that's not an option, MIRA can do assemblies from FASTA files. Velvet too, if I recall correctly.


Btw, I'm aware that BWA aligns to reference genomes and that MIRA and Velvet are primarily de novo assemblers. But you haven't told us whether you plan to align reads to a reference, contigs to a reference, or whether you have a reference at all ;-)


I do plan to use a reference genome. What if I do what Pierre suggested: make a QUAL file with a numerical value of 50 for each nucleotide and then merge the sequence and quality files into FASTQ using the BioPerl code (possibly decreasing the quality towards the end of the reads)? Which would be better: using MIRA on the FASTA file, or using hypothetical qualities?


Actually, I am assembling a baculovirus; many genomes belonging to the family Baculoviridae are available at NCBI. Are there specific assemblers for prokaryotes or other small genomes? Currently I am trying my hand at Minimus, which is part of the AMOS package. Please also let me know what I can do with the assembled genome.


Can you please point me to a good tutorial on Velvet? There is one in "Current Protocols in Bioinformatics", but I don't have access to it. If you have it, please mail me at skm770@gmail.com


Velvet has excellent documentation on its website. Just look it up.
