Exporting Raw Trace Data
6
7
11.5 years ago
Gregory ▴ 90

Hi all,

Is there any software out there that can export raw traces from .ab1 or .scf files as coordinates? By this I mean extracting a table of amplitude at each horizontal pixel (raw X-Y data) so that I can reconstruct the trace in Excel or some other graphing program. Thanks in advance for your insight.

Gregory

sequencing • 8.5k views
8
11.5 years ago
Neilfws 49k

As answered by lh3, io_lib is probably the best option. Applied Biosystems are one of those companies that like to deal in proprietary formats that no-one else can read easily.

You may also be able to use abiview from the EMBOSS suite, using the -graph data option:

abiview -graph data -outseq myseq.fa myseq.ab1


When I run this on a sample .ab1 found on my hard drive, it creates 4 ASCII files, named abiviewN.dat, where N is 1-4. They seem to contain raw coordinates (a few sample lines):

403.333344      6.000000
403.416656      5.000000
403.500000      4.000000
403.583344      3.000000
403.666656      2.000000
403.750000      2.000000
403.833344      3.000000
403.916656      6.000000
404.000000      8.000000
404.083344      10.000000


I'm not clear why 4 files are generated. However, once you work it out, it should be easy enough to parse out the data for plotting in other packages.
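For plotting elsewhere, the four files can be merged into a single table. Below is a minimal Python sketch, assuming each file holds whitespace-separated X-Y pairs like the sample above and that any other line (a header, say) can simply be skipped. The channel labels passed in are placeholders; adjust them once you know which file holds which base.

```python
import csv
from pathlib import Path


def read_xy(path):
    """Return (x, y) float pairs from a text file, skipping non-numeric lines."""
    points = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # not an X-Y pair (blank line, header, ...)
        try:
            points.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # two fields, but not numbers
    return points


def merge_to_csv(paths_by_label, out_path):
    """Write one long-format CSV (channel, x, y) from several X-Y files."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["channel", "x", "y"])
        for label, path in paths_by_label.items():
            for x, y in read_xy(path):
                writer.writerow([label, x, y])


# Example call (the labels are assumptions, not something abiview reports):
# merge_to_csv({"file1": "abiview1.dat", "file2": "abiview2.dat",
#               "file3": "abiview3.dat", "file4": "abiview4.dat"}, "trace.csv")
```

The long format (one row per point, with a channel column) avoids assuming that all four files share the same X values; Excel or any plotting package can then filter or pivot by channel.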

1

Your abiview suggestion is fantastic. The Windows port (mEMBOSS) even lets me do it right at the Windows command prompt. The four files correspond to each of the dye terminators (for A, C, G, and T). Thanks again!

0

Good to hear. I thought the files might be A,C,G,T but was a little confused by their content - glad you worked it out.

0

@Gregory: How do we interpret the output from abiview above? What order are the 4 data files in (GATC or ACGT)? Any assistance or links to resources for this would be useful.

0

The order is G,A,T,C by default. You can change the selection using the -bases option.

0

Anyone able to give any more information on the file format of these four files output from abiview? I've searched online and can find nothing. I compared the content of these four files with a trace, but found nothing helpful. Do we even know if the content of these four files is "correct"?

6
11.5 years ago
lh3 33k

io_lib. When you compile it, you will find a few programs in the progs/ directory. Probably one of those "*_dump" executables is for you.

If you want to do serious things, learning to use io_lib is recommended. The APIs are quite straightforward.

3
11.5 years ago
Malcolm.Cook ★ 1.3k

Hmmm.... I don't think the emboss abiview fits the bill as requested, but I'll leave that up to the OP. In any case....

If you are after raw intensity at each sequencer scan, you can generate a table like this:

[?]

using perl modules:

• Bio::Trace::ABIF - "Perl extension for reading and parsing ABIF (Applied Biosystems, Inc. Format) files"
• Array::Transpose - "Transposes a 2-Dimensional Array"

like this:

# install the perl modules:
> sudo cpan Bio::Trace::ABIF Array::Transpose

# perl one-liners (which arguably should be in a script):
> perl -MBio::Trace::ABIF -MArray::Transpose -e 'BEGIN{$abif=Bio::Trace::ABIF->new(); $,="\t"; $\="\n"}; $abif->open_abif($ARGV[0]) or die "can not open"; print @$_ foreach transpose([map {[$_,$abif->raw_trace($_)]} qw(A T G C)])' my.ab1 > my_ab1_trace.tab
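
For readers who prefer Python, a roughly equivalent table can be built with Biopython (a swapped-in tool, not the Perl approach above). This is only a sketch under assumptions: that Biopython's `abi` parser exposes the file's tags under `record.annotations["abif_raw"]`, that `DATA1` to `DATA4` hold the raw channels (and `DATA9` to `DATA12` the analyzed ones), and that `FWO_1` gives the base order of those channels. The helper names are made up for illustration, and `my.ab1` is the same placeholder file name as above.

```python
def trace_rows(abif_raw, first_channel=1):
    """Build a header row plus one row of four intensities per scan.

    abif_raw: mapping of ABIF tag names to values.
    first_channel: 1 for DATA1..DATA4 (assumed raw),
                   9 for DATA9..DATA12 (assumed analyzed).
    """
    order = abif_raw["FWO_1"]
    if isinstance(order, bytes):
        order = order.decode("ascii")
    channels = [abif_raw[f"DATA{first_channel + i}"] for i in range(4)]
    header = list(order)
    # zip(*channels) turns four per-channel sequences into per-scan rows
    return [header] + [list(scan) for scan in zip(*channels)]


def read_abif_raw(path):
    """Load the tag dictionary from an .ab1 file (requires Biopython)."""
    from Bio import SeqIO  # imported here so trace_rows works without Biopython
    record = SeqIO.read(path, "abi")
    return record.annotations["abif_raw"]


# Example call, printing a tab-separated table:
# for row in trace_rows(read_abif_raw("my.ab1")):
#     print(*row, sep="\t")
```

Note that the table has one row per sequencer scan, not one per called base, so it will be many times longer than the sequence itself.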


--Malcolm

0

Your solution is considerably more elegant and efficient than running abiview and manually merging the four output files, to be sure. All the more reason to learn perl...

0

Sir, thank you for this, but it's generating values for 16000 bases even though my sequence is no longer than 805 bp.

0

Hello, I tried to use it, but had problems understanding it.

I posted here for some help: C: How to get ambiguous sequence from ABIF file with perl ?

But I think you are the best suited to help me. Can you please? =D

2
11.2 years ago
Mac Cowell ▴ 20

If you want to roll your own .abi file parser, this technical examination of the format by Clark Tibbetts could be helpful, if a little dated (circa 1995):

"Raw Data File Formats and the Digital and Analog Raw Data Streams of the ABI PRISM™ 377 DNA Sequencer." Here is the PostScript original and an HTML version by Google.

0

I'd just like to add to that. There is an official file format specification: http://www6.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
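
As a starting point for a hand-rolled parser, the directory layout can be read with Python's `struct` module. This is a minimal sketch based on my reading of the ABIF specification, assuming big-endian fields, a 28-byte directory entry starting at byte 6 of the file, and that items of 4 bytes or fewer are stored inside the offset field itself. The function names are made up for illustration, and only 16-bit integer arrays (element type 4) are decoded.

```python
import struct

# name, number, element type, element size, element count,
# data size, data offset, data handle (28 bytes, big-endian)
DIR_ENTRY = struct.Struct(">4sihhiiii")


def read_directory(blob):
    """Return (version, entries) from the raw bytes of an ABIF file.

    entries maps (tag name, tag number) to (element type, count, raw bytes).
    """
    if blob[:4] != b"ABIF":
        raise ValueError("not an ABIF file")
    (version,) = struct.unpack_from(">h", blob, 4)
    # The header entry describes the directory itself.
    _, _, _, entry_size, count, _, dir_offset, _ = DIR_ENTRY.unpack_from(blob, 6)
    entries = {}
    for i in range(count):
        start = dir_offset + i * entry_size
        name, number, etype, _, n, size, offset, _ = DIR_ENTRY.unpack_from(blob, start)
        # Small items live in the offset field (byte 20 of the entry).
        data_start = start + 20 if size <= 4 else offset
        raw = blob[data_start:data_start + size]
        entries[(name.decode("ascii"), number)] = (etype, n, raw)
    return version, entries


def read_shorts(entries, name, number):
    """Decode one tag as a list of signed 16-bit integers."""
    etype, n, raw = entries[(name, number)]
    if etype != 4:
        raise ValueError("tag is not a 16-bit integer array")
    return list(struct.unpack(f">{n}h", raw))


# Example call (file name is a placeholder):
# version, entries = read_directory(open("my.ab1", "rb").read())
# channel_1 = read_shorts(entries, "DATA", 1)
```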

1
11.5 years ago

There are a lot of tools (even free ones) for manually checking traces; you would most likely be better off using one of those than trying to do your analysis in Excel.

Probably the easiest way is to go with the Sequence Scanner Software directly from Applied Biosystems (for Windows, free). It can print, edit, and export your chromatograms.

There is also a list of available tools for this at the University of Michigan's Website.

0

Thanks for your suggestions. The software that I looked into, including AB's Sequence Scanner Software, exports traces only in graphical form (jpg, pdf, etc). My reason for wanting the raw coordinates actually has little to do with base calling; it is to calculate peak areas. I agree that few people would actually want the coordinates, but I'm really hoping that something out there would offer this feature.
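
Once the coordinates are available as X-Y pairs, a peak area can be approximated with the trapezoidal rule. The following Python sketch assumes the points are sorted by X, that you choose the peak boundaries yourself (`x_start` and `x_end` are hypothetical parameters, as is the flat `baseline`), and that a simple numerical integral is good enough for the purpose.

```python
def peak_area(points, x_start, x_end, baseline=0.0):
    """Trapezoidal area under (x, y) points with x_start <= x <= x_end.

    points must be sorted by x; baseline is subtracted from every y.
    """
    window = [(x, y - baseline) for x, y in points if x_start <= x <= x_end]
    area = 0.0
    # Sum the trapezoid between each pair of neighbouring points.
    for (x0, y0), (x1, y1) in zip(window, window[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Example call on a tiny triangular peak:
# peak_area([(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)], 0.0, 2.0)
```

Overlapping peaks or a drifting baseline would need something more careful than a fixed window and a constant offset.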

0
10.3 years ago
Dryice • 0

I have the same question, as I could not succeed in exporting raw ABI data to Excel. My goal is to calculate the area under each peak, so any more suggestions are appreciated.