Question: Looking for a free, command-line quality score tool for Sanger sequencing
hfan22 wrote (4 months ago):

Dear all,

I got some 16S (forward and reverse) Sanger sequencing data (from an ABI 3730xl DNA Analyzer) from our on-campus sequencing facility. The files came in ABI format, but quality scores had not been applied. When I asked for quality scores, the staff at the sequencing facility used KB Basecaller but warned me that its scores are inflated. They recommended Staden, which gives quality scores close enough to true Phred scores, but it is not command-line based and therefore not good for batch processing. Phred is command-line based but it is not free. It seems that the Phred score calculation is patented (https://www.google.ch/patents/US6681186); maybe that is why it is hard to find a tool?

Ideally, I would like to have quality scores applied to those trace files so I can start the trimming and merging process. I have only worked with next-generation sequencing data and was always given fastq files, so I was also wondering whether it is normal to be provided with trace files without quality scores applied.

Any advice is appreciated.


"Phred is command-line based but it is not free."

Only if you are a commercial user.

Out of curiosity, how many files do you have, and is a command-line tool a must? You may already have access, via your institution, to one of the several commercial programs that can handle .ab1 files (e.g. DNASTAR, Vector NTI, Sequencher).

written 4 months ago by genomax

Thank you h.mon! Yes, it's free for academic use. I should have read the Phred page more carefully (http://www.phrap.org/consed/consed.html#howToGet). Is there anything I could do to minimize future misunderstanding, like editing or deleting my original post?

I don't have a lot of samples, 50-ish, but I prefer not to do things manually.

written 4 months ago by hfan22

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used for providing a solution to the question asked.

written 4 months ago by WouterDeCoster

Sorry, WouterDeCoster. I will follow the instructions next time.

written 4 months ago by hfan22
h.mon (Brazil) wrote (4 months ago):

You can use Staden from the command line, with the -nowin flag. Digging through my old files, I found this:

pregap4 -nowin -config ~/seqs/pregap.conf -fofn $NAME".files" > $NAME".output"

If I remember correctly, I ran pregap4 once with the Tk GUI, configured it as needed, and saved the configuration. After that, I could run the command line above, editing the conf file as needed.
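For around 50 samples, the command above could be wrapped in a small loop. The following is only a sketch under assumptions that are not in the original answer: one directory of .ab1 traces per sample, a file of filenames with one path per line, and the helper names write_fofn and run_pregap plus the PREGAP4 and PREGAP_CONF variables, all of which are made up for illustration. Only the -nowin, -config and -fofn flags come from the command quoted above.

```shell
#!/bin/sh
# Sketch only: the directory layout and the helper names are hypothetical.

# write_fofn DIR NAME
# List every .ab1 trace in DIR in NAME.files, one path per line
# (assumed here to be the layout pregap4 expects for -fofn).
write_fofn() {
    dir=$1
    name=$2
    : > "$name.files"                      # start from an empty list
    for trace in "$dir"/*.ab1; do
        [ -e "$trace" ] || continue        # skip the unmatched glob itself
        printf '%s\n' "$trace" >> "$name.files"
    done
}

# run_pregap NAME
# Run the command from the answer for one sample. PREGAP4 and PREGAP_CONF
# are hypothetical overrides for the binary and the saved configuration.
run_pregap() {
    name=$1
    ${PREGAP4:-pregap4} -nowin \
        -config "${PREGAP_CONF:-$HOME/seqs/pregap.conf}" \
        -fofn "$name.files" > "$name.output"
}

# Example loop, assuming one directory per sample under ./traces:
# for dir in traces/*/; do
#     name=$(basename "$dir")
#     write_fofn "$dir" "$name"
#     run_pregap "$name"
# done
```

Keeping the file list and the pregap4 call in separate functions makes it possible to inspect each NAME.files before anything is run, which seems worthwhile given that the saved conf file decides what pregap4 actually does.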

Keep in mind that, while Staden base-calling is reasonable, Phred beats it by a somewhat large margin - and Phred is free for academic use.

edit: besides, I do not think the claim that KB Basecaller produces inflated scores is correct, either from my experience or from the literature: A direct comparison of the KB™ Basecaller and phred for identifying the bases from DNA sequencing using chain termination chemistry.
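For context on what "inflated" would mean here, this is the standard Phred-scale relation between a quality score and the estimated probability that a base call is wrong. The symbols Q and P_e are just labels chosen for this note.

```latex
Q = -10 \log_{10} P_e
\qquad\Longleftrightarrow\qquad
P_e = 10^{-Q/10}

% Examples:
%   Q = 20  =>  P_e = 10^{-2} = 0.01    (1 error in 100 calls)
%   Q = 30  =>  P_e = 10^{-3} = 0.001   (1 error in 1000 calls)
```

A base caller's scores would be inflated if the error rate actually observed among bases assigned a given Q were higher than the corresponding P_e.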
