7.1 years ago by
Santiago de Compostela, Spain
I find Zev's comment very important, as it's not that rare to find people coming to you with NGS data, eyes wide open, even sweating from the challenge they're facing, and asking: "now what?". The high-throughput genotyping field has already defined some very interesting approaches for extracting association and linkage knowledge from this amount of data, but one of the most interesting strengths of NGS may be its variant discovery capability, which allows us to work with really rare variants, but in very large numbers. First you'll have to think about what question you want to ask of your data, and then you'll have to work out how to put that question to a computer. In fact, the question should have been defined before deciding to go into NGS, but that'd be another story.
If you are talking about how to deal with .gff3 variants (are we talking about SOLiD LifeScope's?), the best suggestion I can think of is to annotate them (with ANNOVAR, for instance), which will let you handle them later as tabulated files with enriched information. And if you want to extract knowledge from all those tabulated files (plus any newly generated ones) at once, instead of going through the information sample by sample, then yes, you will definitely need to create a tool to process them. If it's just for simple operations like combining, merging, overlapping,... scripting would do. If you want to go beyond that and draw conclusions from statistical inference, then you'll certainly have to think about getting your hands dirty with R.
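Just to illustrate the "scripting would do" part: a minimal Python sketch that combines several per-sample tab-delimited variant tables (ANNOVAR-style `Chr, Start, End, Ref, Alt, ...` columns are assumed here; the file names and exact column layout are hypothetical, so adjust them to your actual output) and counts in how many samples each variant shows up:

```python
# Sketch: combine per-sample tab-delimited variant tables and count
# how many samples carry each variant. The column order (Chr, Start,
# End, Ref, Alt) is an assumption based on typical ANNOVAR output.
import csv
from collections import defaultdict


def load_variants(path):
    """Yield (chrom, start, ref, alt) keys from one tab-delimited file."""
    with open(path, newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) < 5:
                continue  # skip malformed or empty lines
            chrom, start, ref, alt = row[0], row[1], row[3], row[4]
            yield (chrom, start, ref, alt)


def count_across_samples(paths):
    """Map each variant key to the number of samples it was seen in."""
    seen_in = defaultdict(set)
    for path in paths:
        # set() so a variant listed twice in one file counts once
        for key in set(load_variants(path)):
            seen_in[key].add(path)
    return {key: len(samples) for key, samples in seen_in.items()}
```

From there, filtering for variants shared by all samples (or private to one) is a one-line dictionary comprehension; anything more statistical is where R takes over.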