Question: How Complex Is It To Analyze NGS Data?
6.4 years ago, geek_y (10k) wrote:


I would like to know how complex it is to analyze NGS data. Is it possible to learn NGS data analysis from online resources, or should one learn under the guidance of an expert? How can I pick up the core concepts of NGS data analysis? How should parameters be configured when using open-source tools (assembly, alignment, statistics, etc.)? I have a master's degree in bioinformatics, with Unix, Perl and basic core Java skills. Any advice is appreciated.

ngs bioinformatics • 4.2k views
modified 3.4 years ago • written 6.4 years ago by geek_y (10k)

This is a complex question. It depends on many factors: technology, experimental design, computing resources, organism, ... In some cases it is really straightforward; in others it is a pain in the b**. You can learn both ways, from online resources and from expert advice, but neither guarantees that you will become an expert ;)

written 6.4 years ago by JC (9.3k)

Woah, Biostars' bot must have just put this to the top of the front page, and I was thinking "But Goutham can analyze bioinformatic data. Matter o' fact I thought he was pretty good at it. Why is he asking this?"

You've clearly learned a hell of a lot in the last few years man. Congratulations to you! :)

modified 3.4 years ago • written 3.4 years ago by John (12k)

Thanks for the appreciation. It's all about the passion to learn something that really interests you. People on Biostars definitely helped a lot.

I asked this question when I was facing a dilemma: stay in a well-paying corporate job that I was not really interested in, or move to a research group that does a lot of genomics but pays less. It was a risky decision for me. I moved to the research institute, and I am now happy to be a Marie Curie fellow.

modified 3.4 years ago • written 3.4 years ago by geek_y (10k)

Dude, maybe accept both answers so the bot stops bumping the post? Nostalgia is great, but I guess we need to give the bot a sense of closure.

written 3.4 years ago by RamRS (25k)

Exact same thought in my head. @Geek_y has grown A LOT! I'm so happy and proud!

written 3.4 years ago by RamRS (25k)
6.4 years ago, Pierre Lindenbaum (125k), France/Nantes/Institut du Thorax - INSERM UMR1087, wrote:
  • download bwa, samtools and a reference genome.
  • generate a random set of reads using samtools/misc/wgsim
  • index the genome
  • align the reads and generate a sam output.
  • describe each column of the sam
  • generate a vcf from the sam using samtools/mpileup
  • describe each column of the vcf
  • use ensembl/vep to predict the consequences of the variations.
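
For the two "describe each column" steps, a minimal Python sketch along these lines may help. It assumes tab-separated SAM and VCF records with the standard mandatory columns (11 for SAM, 8 for VCF); the function names and the example records are made up for illustration and are not output from the tools listed above.

```python
# Sketch only: names the mandatory columns of a SAM alignment line and
# the fixed columns of a VCF data line. Header lines (starting with "@"
# in SAM or "#" in VCF) are not handled here.

SAM_COLUMNS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
               "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_sam_line(line):
    """Return a dict keyed by the 11 mandatory SAM column names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(SAM_COLUMNS):
        raise ValueError("a SAM alignment line needs at least 11 columns")
    record = dict(zip(SAM_COLUMNS, fields[:11]))
    # These five columns are integers in the SAM format.
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    # Anything after column 11 is an optional TAG:TYPE:VALUE field.
    record["TAGS"] = fields[11:]
    return record


def parse_vcf_line(line):
    """Return a dict keyed by the 8 fixed VCF column names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("a VCF data line needs at least 8 columns")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    # ALT may hold several comma-separated alleles.
    record["ALT"] = record["ALT"].split(",")
    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info = {}
    if record["INFO"] != ".":
        for item in record["INFO"].split(";"):
            key, _, value = item.partition("=")
            info[key] = value if value else True
    record["INFO"] = info
    # FORMAT and per-sample columns, if present, are kept unparsed.
    record["SAMPLES"] = fields[8:]
    return record


if __name__ == "__main__":
    # Made-up example records, not real tool output.
    sam = "read1\t99\tchr1\t100\t60\t8M\t=\t300\t208\tACGTACGT\tIIIIIIII\tNM:i:0"
    vcf = "chr1\t104\t.\tA\tG\t50\tPASS\tDP=10"
    for name, value in parse_sam_line(sam).items():
        print(name, value, sep="\t")
    for name, value in parse_vcf_line(vcf).items():
        print(name, value, sep="\t")
```

Printing one record this way, column by column, next to the SAM and VCF specifications is a quick way to work through those two steps.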
written 6.4 years ago by Pierre Lindenbaum (125k)

This is the super-simplified version....

written 6.4 years ago by Sean Davis (25k)
6.4 years ago, Alex Paciorkowski (3.4k), Rochester, NY USA, wrote:

If you have a master's degree in bioinformatics with Unix, Perl and core Java skills, you can do this. How to get the core concepts? As with anything else: read, go to talks, ask questions. There are many good sources of information here (search is your friend) and elsewhere online. I would recommend spending at least some time with someone who has worked with these data types, be it RNA-Seq or DNA, on real projects. There is still enough art and craft in this corner of science that learning some of the ropes from a mentor will save you trouble down the road.

Also, I can't emphasize enough the importance of working on projects with sound experimental design, where NGS is applied appropriately. I see projects that never really go anywhere for basically these reasons: the experimental hypotheses were under-formulated or really a stretch, the experiment was underpowered, or the sequencing approach used was not going to give an answer (single-end reads when paired-end should have been used). Some of these things will be out of your control, and some will be down to luck. But they can all cause problems for your analysis and lead to the impression that analyzing these types of data is "hard". On the other hand, there are times when the experimental design is sharp, the capture and sequencing go without a hitch, and the analysis hits no bumps in the road; then, as JC says above, it is as straightforward as it can get.

Finally, I think it's important to get hands-on experience with every stage of the analysis pipeline, from initial QC, cleanup, trimming, etc., all the way down to dealing with the called variants and annotation. Enjoy!
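
As a rough illustration of the initial QC and trimming stage mentioned above, here is a minimal Python sketch. It assumes FASTQ quality strings in Phred+33 encoding and uses a deliberately naive rule (drop trailing bases below a threshold); the function names and the default threshold of 20 are assumptions for the example, not the behaviour of any particular trimming tool.

```python
# Sketch only: decode Phred+33 qualities and trim low-quality bases
# from the 3' end of a single read.

def phred33_scores(quality):
    """Convert a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in quality]


def mean_quality(quality):
    """Mean Phred score of a read, or 0.0 for an empty read."""
    scores = phred33_scores(quality)
    return sum(scores) / len(scores) if scores else 0.0


def trim_3prime(seq, quality, min_q=20):
    """Remove trailing bases whose score is below min_q.

    Trimming stops at the first base (scanning from the 3' end) that
    meets the threshold, so low-quality bases inside the read are kept.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    scores = phred33_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


if __name__ == "__main__":
    # Made-up read: six high-quality bases followed by two poor ones.
    seq, qual = trim_3prime("ACGTACGT", "IIIIII##")
    print(seq, qual, mean_quality(qual))
```

Real trimmers use more careful rules (sliding windows, adapter removal), so this is mainly useful for understanding what the quality line in a FASTQ record encodes.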

written 6.4 years ago by Alex Paciorkowski (3.4k)