NGS Sequencing Outputs
3.6 years ago
speycast • 0

Hello,

I'm new to bioinformatics and would like to know which output formats come directly off Illumina sequencers. For paired-end (PE) reads, what format is the output in after sequencing? Is it BCL, and from BCL does it go to fasta --> fastq --> SAM --> BAM --> VCF? Is this correct? Also, at which step does demux happen? Does demux refer to demultiplexing the fastq file to get Q30 scores?

Thanks very much for your help and time in advance!

NGS illumina fastq fasta bam • 1.2k views
3.6 years ago
GenoMax 141k

The usable output (for an end user) from an Illumina sequencer is gzip-compressed fastq files. They are produced by processing the base calls (BCL files) with a program called bcl2fastq (soon to be succeeded by a new program called bclconvert), which is supplied by Illumina. Sequencing can sample only one end of the library fragments being sequenced (single-end sequencing), or it can sample both ends (paired-end sequencing).
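To make the file format concrete, here is a minimal Python sketch of reading a gzip-compressed fastq file, where each record is four lines (header, sequence, `+` separator, quality string). The function name `read_fastq` and the two reads are made up for illustration, and the sketch assumes well-formed, unwrapped records; for real work an established parser is a safer choice.

```python
import gzip
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a FASTQ text handle.

    Assumes the common 4-lines-per-record layout with no line wrapping.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'

# Build a tiny gzip-compressed FASTQ in memory (made-up reads, not real data).
raw = b"@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nII#I\n"
compressed = gzip.compress(raw)

# gzip.open accepts a file object as well as a path such as "sample_R1.fastq.gz".
with gzip.open(io.BytesIO(compressed), "rt") as fh:
    records = list(read_fastq(fh))

for name, seq, qual in records:
    print(name, seq, qual)
```

For paired-end data you would typically get two such files per sample (R1 and R2), with reads in the same order in both.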

Demultiplexing (generally) happens as part of bcl2fastq processing. When the sequenced pool contains more than one sample, each tagged with an index sequence (or a pair of indexes), bcl2fastq uses these indexes to bin the data into separate files per sample. In Illumina sequencing, index reads are never part of the actual sequence reads, and the order of sequencing is Read 1 --> Index 1 (if present) --> Index 2 (if present) --> Read 2.
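The binning idea can be sketched in a few lines of Python. This is only a toy illustration of the concept, not how bcl2fastq is implemented: the real program works on BCL files and can tolerate index mismatches, while this sketch assumes each read already comes paired with its index sequence and requires an exact match. The index sequences, sample names, and the `demultiplex` function are all hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, sample_by_index):
    """Bin reads by exact index match; unknown indexes go to 'Undetermined'."""
    bins = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = sample_by_index.get(index_seq, "Undetermined")
        bins[sample].append(read_seq)
    return dict(bins)

# Hypothetical sample sheet: index sequence -> sample name.
sample_by_index = {"ATCACG": "sampleA", "CGATGT": "sampleB"}

# Made-up (index read, sequence read) pairs from one pooled run.
reads = [
    ("ATCACG", "ACGTACGT"),
    ("CGATGT", "TTGACCAA"),
    ("ATCACG", "GGCATTAC"),
    ("NNNNNN", "ACGTNNNN"),  # index matches no sample
]

bins = demultiplex(reads, sample_by_index)
for sample, seqs in bins.items():
    print(sample, seqs)
```

Note that demultiplexing sorts reads by sample. It is unrelated to computing Q30 or other quality metrics.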

The rest of the files you mention are derived data files produced downstream: by alignment (fastq --> SAM --> BAM) and by variant analysis (BAM --> VCF). Fastq files can sometimes be converted to plain fasta format (especially if you need to BLAST them). Fastq files can also be converted to BAM files without alignment (unaligned BAM files).
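As an illustration of the fastq --> fasta direction (fasta is derived from fastq, not the other way around), here is a minimal Python sketch that keeps the header and sequence lines and drops the quality information. The function name `fastq_to_fasta` and the example reads are made up, and it assumes unwrapped 4-line records; dedicated tools do this more robustly.

```python
import io

def fastq_to_fasta(fastq_handle, fasta_handle):
    """Write each FASTQ record as a FASTA record; return the record count."""
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        seq = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator, discarded
        fastq_handle.readline()  # quality string, discarded
        # FASTA headers start with '>' instead of '@'.
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(seq.rstrip("\n") + "\n")
        count += 1
    return count

# Made-up input; real use would pass open file handles instead.
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nII#I\n"
out = io.StringIO()
n = fastq_to_fasta(io.StringIO(fastq_text), out)
fasta_text = out.getvalue()
print(fasta_text, end="")
```

Because the quality scores are discarded, this conversion cannot be reversed.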
