Bioconductor/R for NGS Analysis
3
1
12.8 years ago
Travis ★ 2.8k

R is something I have never used but have always been aware of. I have seen it mentioned numerous times for the analysis of next gen sequencing data, but I always thought of it as a 'stats' language.

My question is: where does it fit in among the plethora of NGS analysis programs out there? I mean things like Bowtie, Tophat, Cufflinks, GATK, SAMTools, etc. Is it commonly or rarely used? Who uses it, who doesn't, and why?

Thanks in advance for educating me.

next-gen sequencing bioconductor r • 6.6k views
9
12.8 years ago

This is a good question, but not one that is easy to answer. I suggest you do some reading and some experimentation. Decide what question(s) you need to answer and then work from there. There is not really a "one-size-fits-all" approach to next-gen data analysis, unfortunately.

If you have never used R and are working in the broad area of bioinformatics, I would say that it is a "must be aware of" kind of tool.

7
12.8 years ago

First and foremost, you need to distinguish between data-producing and data-analysis tools. Bowtie, Tophat, etc. are tools that generate data by combining several other data sources, for example sequencing reads and a reference genome. Their outputs are usually large files that need further processing.

R is a generic data analysis platform that is best suited to making sense of (summarizing and visualizing) the information contained in the datasets produced by the tools mentioned above. Its outputs are usually tables, plots and statistical results that have a direct biological interpretation.

Of course there may be some overlap between the two concepts, though I think the distinction above is a good generic description of the differences between these tools.
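To make the split concrete, here is a minimal sketch of the kind of downstream summarizing step described above. It is written in Python only to keep the example self-contained; it is not how any particular R or Bioconductor package does it. The SAM-style records are hand-written stand-ins for aligner output, and the function name `mapped_reads_per_reference` is hypothetical.

```python
from collections import Counter

# Hypothetical, hand-written SAM-style records standing in for the large
# alignment file that a tool such as Bowtie would produce.
SAM_TEXT = "\n".join([
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:800",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tchr1\t250\t60\t4M\t*\t0\t0\tTTGA\tIIII",
    "read3\t0\tchr2\t40\t30\t4M\t*\t0\t0\tGGCA\tIIII",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tCCCC\tIIII",
])

UNMAPPED_FLAG = 0x4  # SAM flag bit meaning "this read is unmapped"


def mapped_reads_per_reference(sam_text):
    """Count mapped reads per reference sequence (RNAME, the third column)."""
    counts = Counter()
    for line in sam_text.splitlines():
        # Skip blank lines and header lines, which start with '@'.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            continue  # ignore reads the aligner could not place
        counts[fields[2]] += 1
    return counts


counts = mapped_reads_per_reference(SAM_TEXT)
for ref, n in sorted(counts.items()):
    print(f"{ref}\t{n}")
```

The aligner's job ends at the alignment records; the small per-chromosome table printed at the end is the sort of summary you would then plot or test statistically in an analysis environment such as R.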

0

I think it is the level of overlap that has perhaps confused me!

3
12.8 years ago

I would go one step further than Istvan did and distinguish between tools for data production, data analysis and biological interpretation, with R/Bioconductor serving primarily for data analysis. It is true that R/Bioconductor packages can also be used for biological interpretation, since there are packages for gene annotation, identifier resolution and, for instance, gene set enrichment or gene ontology analysis. That makes sense, because biological interpretation also involves a lot of statistics. But for biological interpretation and data integration, especially with visualization support, you would probably prefer dedicated tools like our PathVisio (there are many other pathway tools, but I like this one) and Cytoscape.

