Question: If I Order WGS From A Vendor (e.g. Illumina), What Files Do I Get Back From Them?
6.9 years ago by
United States
user56290 wrote:

I am new to bioinformatics, so this may be a silly question.

We would like to do pharmacogenomics clinical decision support. Instead of genotyping, we want to use whole genome sequencing. I am assuming we will send 8 patient samples for analysis (ideally we would send just one, but the pricing is such that we have to send a batch). There are lots of formats, and it is hard to know where to start, what we will have to do in-house, and what we can get out of the box (going only to a sequencing vendor, not an interpretation vendor).

My questions are:

  1. In what format can we expect to get the raw data? (Would that be FASTQ or BAM/SAM?)

  2. Does one get a *.vcf file from the vendor right away, or will we have to generate it ourselves? Against which reference genome would it be done: would that be the Homo sapiens GRCh37 primary assembly?

modified 4.3 years ago by Biostar • written 6.9 years ago by user56290
6.9 years ago by
Leonor Palmeira3.7k
Liège, Belgium
Leonor Palmeira3.7k wrote:

The short answer would be:

  • for an Illumina sequencing run, you should end up with a .fastq file containing the read sequences and their qualities.
  • you might get a BAM/SAM file if you ask (and pay) your sequencing provider to map these reads to a reference genome.
  • .vcf files are variant call files (SNPs and small indels) produced by a variant caller such as GATK. This step might also be done by your sequencing provider if you ask/pay for it.
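To make the three file types above concrete, here is a minimal Python sketch that guesses which one a delivered file is from its first bytes. The function name `sniff_format` is hypothetical, and the checks are simplified assumptions (a real FASTQ, SAM, or VCF validator would look at far more than the first few lines):

```python
import zlib


def sniff_format(head: bytes) -> str:
    """Guess FASTQ / SAM / BAM / VCF from the first bytes of a file.

    `head` is whatever was read from the start of the file, e.g. the
    first 64 KiB. This is a rough heuristic, not a validator.
    """
    if head[:2] == b"\x1f\x8b":  # gzip magic number (BAM is BGZF, a gzip variant)
        # Decompress only the bytes we were given; 16 + MAX_WBITS selects gzip framing.
        head = zlib.decompressobj(16 + zlib.MAX_WBITS).decompress(head)
        if head[:4] == b"BAM\x01":  # magic string at the start of BAM data
            return "BAM"
    lines = head.decode("ascii", errors="replace").splitlines()
    if not lines:
        return "unknown"
    if lines[0].startswith("##fileformat=VCF"):
        return "VCF"
    # SAM header lines start with a two-letter record type followed by a tab.
    if lines[0][:3] in ("@HD", "@SQ", "@RG", "@PG", "@CO") and "\t" in lines[0]:
        return "SAM"
    # FASTQ records: '@name', sequence, '+' separator, qualities.
    if lines[0].startswith("@") and len(lines) >= 3 and lines[2].startswith("+"):
        return "FASTQ"
    return "unknown"
```

You would call it as `sniff_format(open(path, "rb").read(65536))`. Gzipped FASTQ or VCF files are handled by the same decompression step, since only the BAM magic string is treated specially.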

That being said, I believe a good sequencing provider should sell you neither just the raw output of a sequencing machine nor the output of a mysterious processing pipeline. If that is all you are offered, you should seek a more professional sequencing provider. They should, for instance, help you with the following steps:

  • building your experimental plan (do you need 1 patient sample, or 8 patient samples?). Once you get the data, how will you process this information?
  • determining the analyses that need to be done on your data so that it is interpretable given your level of expertise (by this I mean not giving you the raw data if you have no idea of, or experience with, what should be done next).
  • deciding whether these analyses should be done by them, by you, or by another company.

All of this will help you make the most of the money/time/energy you will be spending on this genotyping project.

written 6.9 years ago by Leonor Palmeira

Definitely spend time looking at different vendors and sequencing centres to see what they give you and what support services they provide. If you are new to bioinformatics you aren't going to want to go with the cheapest provider because, trust me, the level of support just won't be there. WGS is a pretty fast-moving field even for an experienced bioinformatician, and figuring out good ways of approaching your data takes a bit of work. You need to lay out a plan before you send anything out for sequencing. Good sequencing centres have tremendous experience in experimental design and often provide good "first pass" bioinformatics support. I've done a lot of work with Genome Quebec's Innovation Centre in Montreal (McGill University) and they are fantastic. As part of my PhD I worked in a lab doing lots of de novo genome sequencing of microbial eukaryotes; the cheaper providers may be more trouble than they are worth, and you will probably not get any bioinformatics support from them.

written 6.9 years ago by Dan Gaston
6.9 years ago by
United States
lh331k wrote:

It depends on the vendor. Illumina gives you a BAM aligned by Eland2 and variant calls by CASAVA. You do not get raw reads in fastq. I forget the format of the variants. They did not use the reference genome we prefer, so in the end we had to convert BAM to fastq and redo the alignment ourselves (we would much prefer fastq for this reason). If I am right, alignment and variant calling do not add extra cost in addition to sequencing.
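As a rough illustration of the "convert BAM to fastq and redo the alignment" step, the sketch below only builds the shell pipelines as strings; it does not run them. It assumes paired-end reads, samtools and bwa installed, and a reference already indexed with `bwa index`. The choice of `bwa mem`, the helper name `realign_commands`, and the output file names are assumptions for illustration, not necessarily what was used at the time:

```python
import shlex


def realign_commands(bam, reference, prefix, threads=4):
    """Return two shell pipelines: BAM -> paired FASTQ, then FASTQ -> sorted BAM."""
    q = shlex.quote  # protect paths containing spaces or shell metacharacters
    r1, r2 = f"{prefix}_R1.fastq", f"{prefix}_R2.fastq"
    # Group reads by name first so mates end up adjacent, then split into R1/R2.
    extract = (
        f"samtools collate -u -O {q(bam)} | "
        f"samtools fastq -1 {q(r1)} -2 {q(r2)} -0 /dev/null -s /dev/null -n -"
    )
    # Align against the reference you prefer and coordinate-sort the result.
    align = (
        f"bwa mem -t {threads} {q(reference)} {q(r1)} {q(r2)} | "
        f"samtools sort -o {q(prefix + '.realigned.bam')} -"
    )
    return [extract, align]


if __name__ == "__main__":
    for cmd in realign_commands("vendor.bam", "GRCh37.fa", "sample1"):
        print(cmd)
```

Returning the commands instead of executing them makes it easy to review them, or paste them into a job script, before committing compute time to a whole genome.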

BGI gives you alignments using an aligner and a reference genome of your choice on request; at least they can run bwa and/or soap2 for you. They did not give me variant calls, but I guess you can request them. BGI did not ask us to pay more for the alignment at that time (things may have changed, you know). I do not know if variant calling adds cost.

Yes, sending samples in a batch helps to reduce the price.

written 6.9 years ago by lh3