Hi all, I am preparing a course on NGS: there will be seven students for 4 hours and I want them to play with some NGS data. No programming skills are required.
Here is what I plan to do:
- short intro to NGS
- structure of a (small) FASTQ file
- map it with BWA on public Galaxy http://main.g2.bx.psu.edu/
- index the genome and map the fastqs with MAQ
- index the genome and map the fastqs with BWA
- structure of a SAM file
- GATK recalibration (?)
- call the SNPs with samtools pileup and generate a VCF
- explore a BAM file with samtools tview
- find the rs## at UCSC (table browser or mysql)
- predict the consequences of a set of SNPs with PolyPhen-2 (btw, is there a way to generate a random FASTQ file with a set of 'forced' mutations?)
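On the 'forced' mutations question, here is a minimal Python sketch of one way it could be done. It is not an established tool. The function names (`mutate`, `simulate_fastq`) and the defaults are made up for illustration. It assumes substitutions only, forward-strand single-end reads, no sequencing errors, and a constant quality string in Sanger (Phred+33) encoding.

```python
import random
import sys


def mutate(reference, snps):
    """Return a copy of `reference` with forced substitutions.

    `snps` maps a 0-based position to the base to put there (hypothetical format).
    """
    seq = list(reference)
    for pos, base in snps.items():
        seq[pos] = base
    return "".join(seq)


def simulate_fastq(reference, snps, n_reads=100, read_len=36, seed=42):
    """Yield 4-line FASTQ records sampled uniformly from the mutated reference."""
    rng = random.Random(seed)
    mutated = mutate(reference, snps)
    for i in range(n_reads):
        start = rng.randint(0, len(mutated) - read_len)
        read = mutated[start:start + read_len]
        qual = "I" * read_len  # 'I' is Phred 40 in Phred+33 encoding
        # Record layout: @name, sequence, '+', quality string
        yield "@sim_%d_pos%d\n%s\n+\n%s\n" % (i, start + 1, read, qual)


if __name__ == "__main__":
    # Toy reference made of random bases; a real course would use a small genome.
    rng = random.Random(1)
    reference = "".join(rng.choice("ACGT") for _ in range(500))
    # Force a base that differs from the reference at two arbitrary positions.
    snps = {p: ("A" if reference[p] != "A" else "C") for p in (100, 250)}
    for record in simulate_fastq(reference, snps, n_reads=3):
        sys.stdout.write(record)
```

The output also shows the four-line FASTQ layout, and once the reads are mapped, the students should find the forced positions again in the pileup.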
My other ideas:
- running something in the cloud: do you know if there is a way to run something for free on Amazon ? what kind of analysis could I run ?
- storing something (the VCF ?) in a database (mysql ? sqlite3 ?) and using rails to display the data
- generating the tool for using a webservice (SOAP/REST...): what service could I use for this course ?
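For the database idea, here is a minimal sketch of loading the fixed VCF columns into SQLite with Python's standard `sqlite3` module. The table name `variant`, the column names and the `load_vcf` helper are hypothetical choices. It assumes a tab-separated VCF with at least the eight fixed columns and ignores the per-sample genotype columns.

```python
import sqlite3


def load_vcf(lines, conn):
    """Insert the eight fixed VCF columns of each record into table `variant`.

    Header lines (starting with '#') and blank lines are skipped.
    Returns the number of records inserted.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS variant(
               chrom TEXT, pos INTEGER, id TEXT, ref TEXT, alt TEXT,
               qual REAL, filter_status TEXT, info TEXT)"""
    )
    rows = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        qual = None if f[5] == "." else float(f[5])  # '.' means missing in VCF
        rows.append((f[0], int(f[1]), f[2], f[3], f[4], qual, f[6], f[7]))
    conn.executemany("INSERT INTO variant VALUES (?,?,?,?,?,?,?,?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Two invented toy records, only to show the call; not real variants.
    toy = [
        "##fileformat=VCFv4.0\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "chr1\t100\t.\tA\tG\t50\tPASS\tDP=10\n",
        "chr2\t200\t.\tC\tT\t.\tPASS\tDP=5\n",
    ]
    conn = sqlite3.connect(":memory:")
    print(load_vcf(toy, conn), "records loaded")
    for row in conn.execute("SELECT chrom, pos, ref, alt FROM variant ORDER BY chrom, pos"):
        print(row)
```

SQLite needs no server, which suits a 4-hour session. The same table could later be read from rails or any other front end.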
Any other suggestion ? What would you like to see during this course ?
I'll accept the answer with the highest number of votes next week.
EDIT: the course should give them the opportunity to see what the work of someone working with NGS looks like and to get some experience with real data. I don't know their skills but, AFAIK, they are supposed to have some programming courses later.
My only experience is with the analysis of "exome capture" data, i.e. SNP calling.
Update: I posted my slides on slideshare: http://www.slideshare.net/lindenb/20101210-ngscourse