Question: Converting Illumina Data From 'Gerald' To Sam
9.1 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum124k wrote:

Hi all, I've been given an external hard drive containing a set of files mapped with some Illumina tools (GERALD?).

tail -3 s_3_2_export.txt
HWUSI-EAS454    14    3    120    16534    21510    0    2    TGTNNNNNNTNNTAACNNTNNNGGNNCNNTNNNCNNNNNCNNNNNNNCNNNANNGCNTNNTNCANNNCNCNNNTNC    BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB    QC                                            N
HWUSI-EAS454    14    3    120    17941    21505    0    2    ACTNNNNNNCNNGCTANNANNNNCNNNNNTNNGANNNNNTNNNNNNNCNNNNNNATNCNNTNNTNNNCNCNNNCNA    BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB    QC                                            N
HWUSI-EAS454    14    3    120    18005    21512    0    2    CTANNNNNNGNNTTCTNNTNNNNANNNNNCNNTANNNNNCNNNNNNNGNNNNNNCANCNNGNNCNNNCNANNNCNC    BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB    QC

Where can I find a description of those columns?

How can I convert it to SAM/BAM?

Thanks.

illumina sam conversion • 2.7k views

Just to add, some people call this format the "export" format, as the filenames tend to end with "_export".

written 9.1 years ago by Bio_X2Y
9.1 years ago by
Aaronquinlan11k
United States
Aaronquinlan11k wrote:

In the misc/ directory of the samtools distribution, there is a script called "export2sam.pl" which should do the trick. See this thread.
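One step any such conversion has to handle is re-encoding the base qualities, since SAM stores them as ASCII code = quality + 33. The following is only a minimal Python sketch of that single step, assuming the export file uses an offset of 64 (which may not hold for every Pipeline version). The function name is made up for illustration, and this is not a description of how export2sam.pl itself is implemented.

```python
def illumina64_to_sanger33(qual: str) -> str:
    """Re-encode a quality string from ASCII offset 64 to ASCII offset 33.

    Assumption: the input uses 'ASCII code = quality + 64'.
    """
    converted = []
    for ch in qual:
        q = ord(ch) - 64  # recover the numeric quality value
        if q < 0:
            raise ValueError(f"character {ch!r} is below the offset-64 range")
        converted.append(chr(q + 33))  # SAM-style offset
    return "".join(converted)


if __name__ == "__main__":
    # 'B' is quality 2 under offset 64, which becomes '#' under offset 33.
    print(illumina64_to_sanger33("BBBB"))
```

A dedicated converter also has to build the flag, position and CIGAR fields, so this sketch is not a substitute for the script mentioned above.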

9.1 years ago by
Bio_X2Y3.7k
Ireland
Bio_X2Y3.7k wrote:

Aaron's post covers a tool for the conversion. To answer the other part of your question, here are the column descriptions from the GA Pipeline 1.4 documentation provided to us by Illumina (I don't think it's a public document). The format might be different for other versions of the Pipeline.

  • Machine (as parsed from run folder name)
  • Run Number (as parsed from run folder name)
  • Lane
  • Tile
  • X Coordinate of cluster
  • Y Coordinate of cluster
  • Index String (blank for a non-indexed run)
  • Read Number ("1" or "2" for paired end, blank for a single end)
  • Read
  • Quality String - in symbolic ASCII format (ASCII character code = quality value + 64)
  • Match Chromosome - name of chromosome match was to OR code indicating why no match was done
  • Match Contig (blank if no match found) - gives contig name if there is a match and the match chromosome is split into contigs
  • Match Position (always with respect to forward strand, numbering starts at 1)
  • Match Strand ("F" for forward or "R" for reverse, blank if no match)
  • Match Descriptor - concise description of alignment. A numeral denotes a run of matching bases, a letter denotes substitution of a nucleotide, so e.g. for a 35 base read, "35" denotes an exact match and "32C2" denotes substitution of a "C" at the 33rd position
  • Single Read Alignment Score - alignment score of single read match (if a paired read, gives alignment score of read if it were to be treated as a single read)
  • Paired Read Alignment Score - alignment score of read pair (alignment score of a paired read and its partner, taken as a pair. Blank for a single read run)
  • Partner Chromosome - not blank only if read is paired and its partner aligns to another chromosome, in which case it gives the name of the chromosome
  • Partner Contig - not blank only if read is paired and its partner aligns to another chromosome and that partner is split into contigs
  • Partner Offset - if a paired read's partner hits to the same chromosome (as it will in the vast majority of cases) and contig (if the chromosome is split into contigs) then this number added to Match Position gives the alignment position of the read's partner
  • Partner Strand - which strand did the paired read's partner hit to ("F" for forward or "R" for reverse, blank if no match)
  • Filtering. Did the read pass quality filtering? "Y" for yes, "N" for no
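To make the column list above easier to work with, here is a rough Python sketch that splits one export line into named fields and decodes a Match Descriptor. It assumes the file is tab-separated with exactly the 22 columns listed, which may not hold for other Pipeline versions. The field names and function names are my own labels, and the descriptor decoder handles only the two token types described above (runs of matches and single-letter substitutions).

```python
import re
from typing import Dict, List, Tuple

# Labels for the 22 columns, in the order listed above (names are illustrative).
EXPORT_COLUMNS = [
    "machine", "run_number", "lane", "tile", "x", "y", "index_string",
    "read_number", "read", "quality", "match_chromosome", "match_contig",
    "match_position", "match_strand", "match_descriptor",
    "single_read_score", "paired_read_score", "partner_chromosome",
    "partner_contig", "partner_offset", "partner_strand", "filtering",
]


def parse_export_line(line: str) -> Dict[str, str]:
    """Split one export line into a dict keyed by column label.

    Assumption: columns are tab-separated and blank columns are kept as "".
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(EXPORT_COLUMNS):
        raise ValueError(
            f"expected {len(EXPORT_COLUMNS)} columns, got {len(fields)}"
        )
    return dict(zip(EXPORT_COLUMNS, fields))


def descriptor_mismatches(descriptor: str) -> List[Tuple[int, str]]:
    """Return (1-based read position, letter) for each substitution."""
    if re.fullmatch(r"(?:\d+|[A-Za-z])*", descriptor) is None:
        raise ValueError(f"unsupported match descriptor: {descriptor!r}")
    mismatches = []
    pos = 0
    for token in re.findall(r"\d+|[A-Za-z]", descriptor):
        if token.isdigit():
            pos += int(token)  # a run of matching bases
        else:
            pos += 1  # a single substituted base
            mismatches.append((pos, token))
    return mismatches


if __name__ == "__main__":
    # Based on the record in the question; read and quality are truncated here.
    example = "\t".join(
        ["HWUSI-EAS454", "14", "3", "120", "16534", "21510", "0", "2",
         "TGTNNNNNNT", "BBBBBBBBBB", "QC"] + [""] * 10 + ["N"]
    )
    record = parse_export_line(example)
    print(record["match_chromosome"], record["filtering"])
    print(descriptor_mismatches("32C2"))
```

In the example record the Match Chromosome column holds "QC" rather than a chromosome name, i.e. one of the codes indicating why no match was done, so the remaining alignment columns are blank.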