Question

Ion Torrent BAM and FASTQ

0


3.0 years ago

Rajitha • 0

Hi all,

I'm new to Ion Torrent technology and need your help to figure this out. The Ion GeneStudio™ S5 System generates BAM files as its primary output. When I need a particular FASTQ file, I can use the FileExporter plugin to get it. My questions are:

Does the system generate a FASTQ file when it generates the BAM file from the WELLS file? (Illumina produces FASTQ files as the raw data, and we align them later.)
Is the generated BAM file used by the FileExporter plugin to generate the FASTQ?
Can anyone explain the process that turns the WELLS file into the BAM file?

Thanks a lot.

NGS TorrentBaseCaller FASTQ BAM IonExporter • 4.1k views

ADD COMMENT • link updated 3.0 years ago by 5heikki 11k • written 3.0 years ago by Rajitha • 0

1


FileExporter converts BAM files to FASTQ files.

ADD REPLY • link 3.0 years ago by 5heikki 11k

0


Thank you for the reply. I have gone through the link and found the function

bam2fastq_command(BAMName, FASTQName)

which converts the BAM into a FASTQ file.

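def bam2fastq_command(BAMName, FASTQName):
    # ion.picardPath is defined elsewhere in the plugin; it is passed to "java -jar"
    com = "java -Xmx8g -jar %s SamToFastq" % ion.picardPath
    com += " I=%s" % BAMName    # input BAM
    com += " F=%s" % FASTQName  # output FASTQ
    return com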

Hence the FileExporter plugin uses Picard (its SamToFastq tool) for this conversion.
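
For anyone who wants to run the same conversion outside the plugin, here is a minimal sketch that builds the call as an argument list instead of one string, so file names containing spaces survive intact. The helper name `bam2fastq_args`, the `picard_jar` path and the example file names are hypothetical; only the `SamToFastq`, `I=` and `F=` parts come from the snippet above.

```python
import shlex


def bam2fastq_args(picard_jar, bam_name, fastq_name, max_heap="8g"):
    """Build a Picard SamToFastq call as an argument list (hypothetical helper)."""
    return [
        "java", f"-Xmx{max_heap}", "-jar", picard_jar, "SamToFastq",
        f"I={bam_name}",    # input BAM
        f"F={fastq_name}",  # output FASTQ
    ]


# The jar path and file names below are placeholders, not real locations.
args = bam2fastq_args("/path/to/picard.jar", "my sample.bam", "my sample.fastq")
print(shlex.join(args))  # shell-quoted form, for inspection only

# To actually run it (requires Java and Picard to be installed):
# import subprocess
# subprocess.run(args, check=True)
```

Passing the list to `subprocess.run` avoids going through a shell at all, which is why no manual quoting is needed.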

ADD REPLY • link 3.0 years ago by Rajitha • 0

0


You can convert a BAM file to a FASTQ file with samtools:

samtools bam2fq input.bam > output.fastq
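
To illustrate what a BAM-to-FASTQ conversion does to each record, here is a minimal pure-Python sketch that works on SAM text lines (for example, the output of `samtools view input.bam`). It is not how samtools is implemented; the function name and the two example reads are made up. It keeps only the read name, sequence and qualities, and restores the original orientation of reads stored as reverse-strand alignments.

```python
# Complement table used when a read is stored reverse-complemented.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line):
    """Convert one SAM text record into a four-line FASTQ record (sketch)."""
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]
    if flag & 0x10:
        # Flag 0x10: SEQ is stored reverse-complemented, so undo that.
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    # Everything after column 11 (the optional tags) is dropped here.
    return f"@{name}\n{seq}\n+\n{qual}\n"


# Made-up example records: one unaligned read, one reverse-strand alignment.
unaligned = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGTT\tIIIIH\tRG:Z:run1"
reverse = "read2\t16\tchr1\t100\t60\t5M\t*\t0\t0\tAACGT\tHIIII"

print(sam_line_to_fastq(unaligned), end="")
print(sam_line_to_fastq(reverse), end="")
```

Note that the `RG:Z:run1` tag on the first read does not appear in the FASTQ output.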

ADD REPLY • link 3.0 years ago by MSRS ▴ 590

0


Thank you for the information. I can convert the BAM into FASTQ, but I really need to know what happens inside the FileExporter plugin and the Torrent BaseCaller.

ADD REPLY • link 3.0 years ago by Rajitha • 0

score 2 · Answer 1 · 2021-08-17

The reason tools create BAM files instead of FASTQ is that each alignment in a BAM file can carry additional information via the so-called "tags".

For Illumina reads, the metadata for each read is crammed into the read name (well number, spot number, date, etc.).

That being said, it is evidently absurd to use a Sequence Alignment(!) Format to store non-aligned reads, but hey, when it comes to bioinformatics, absurdity is the new normal.

In your BAM file, each read is tagged with metadata in the form TAG:TYPE:value; the documentation for the sequencer describes what each tag means.
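
As a rough sketch of what reading those tags looks like, the snippet below parses the optional fields of one SAM text line (everything after the 11 mandatory columns) into a dictionary. The function name, the example record and the `XY` tag are made up for illustration; the real tag names in your file are the ones listed in the sequencer's documentation.

```python
def parse_tags(sam_line):
    """Return the optional TAG:TYPE:value fields of a SAM line as a dict (sketch)."""
    tags = {}
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        # Split on the first two colons only, since values may contain ':'.
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":      # integer
            value = int(value)
        elif type_code == "f":    # float
            value = float(value)
        # Other types (Z string, A character, B array, H hex) stay as text here.
        tags[tag] = value
    return tags


# Made-up record: RG is a standard tag, XY is a placeholder.
record = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:run1\tXY:i:7\tCO:Z:a:b"
print(parse_tags(record))
```

In practice you would get such lines from `samtools view`, or use a BAM library instead of splitting text by hand.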

Long story short, when you convert to FASTQ, some metadata may be lost. If you need to filter your data in ways not offered by a tool, you can probably do it by parsing the output.
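
A minimal sketch of that kind of filtering, assuming the records are available as SAM text lines and that the tag of interest is a string (type Z) tag. The function name, read names and read-group values are invented for the example.

```python
def filter_by_tag(sam_lines, tag, wanted):
    """Yield SAM records whose string tag `tag` equals `wanted` (sketch)."""
    needle = f"{tag}:Z:{wanted}"
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        # Optional fields start after the 11 mandatory columns.
        if needle in line.rstrip("\n").split("\t")[11:]:
            yield line


# Made-up input: one header line and two unaligned reads.
records = [
    "@HD\tVN:1.6",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:runA",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII\tRG:Z:runB",
]
kept = [line.split("\t")[0] for line in filter_by_tag(records, "RG", "runA")]
print(kept)
```

Filtering on the BAM/SAM side first, and converting to FASTQ afterwards, keeps the tags available for as long as you need them.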