Synopsis: High-throughput sequencing (HT-Seq or HTS), also known as next-generation sequencing (NGS), opens a wide range of opportunities for genome research. Unfortunately, many existing bioinformatics tools do not scale well to large datasets of tens of millions of sequences generated by technologies such as Illumina/Solexa, Roche/454, ABI/SOLiD and Helicos. The Bioconductor project fills this gap by providing a rapidly growing suite of well-designed R packages for analyzing traditional and HT-Seq datasets. These 'BioC-Seq' packages make it possible to analyze such sequences quickly. The speed-up comes from memory-efficient string containers and from delegating the time-consuming computations to external code implemented in compiled languages (e.g. C/C++). Together, these packages form a novel framework that lets researchers build efficient pipelines while performing complex analyses in a high-level data analysis and programming environment.
Tutorial: High Throughput Sequence Analysis With R And Bioconductor
8.2 years ago by
Istvan Albert ♦♦ 85k
University Park, USA
8.2 years ago by
Leonor Palmeira ♦ 3.7k
Leonor Palmeira ♦ 3.7k wrote:
I would also add the functionality of seqinR (I am one of the developers, but it is still a great tool :-)):
- to query major databases from within R
- to manipulate FASTA files and alignment files
- to compute all sorts of statistics on sequences
See here for the manual (which I also find quite entertaining for a software manual!).
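As a rough illustration of the file-handling and statistics points above, here is a minimal sketch using seqinR. The two toy sequences and the temporary file paths are made up for the example and are not from the post; querying remote databases is left out because it needs a network connection.

```r
library(seqinr)

# Write a tiny two-record FASTA file to a temporary path (toy data, assumed for illustration).
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ATGCGCGTTA", ">seq2", "GGGCCCATAT"), fasta_path)

# read.fasta() returns a named list of character vectors, one base per element.
seqs <- read.fasta(file = fasta_path, seqtype = "DNA")

# Per-sequence length and GC fraction.
lengths_bp <- sapply(seqs, length)
gc_frac <- sapply(seqs, GC)

# Dinucleotide counts for the first sequence (a table with 16 entries).
dinuc <- count(seqs[["seq1"]], wordsize = 2)

# Write the sequences back out to a new FASTA file.
out_path <- tempfile(fileext = ".fasta")
write.fasta(sequences = seqs, names = names(seqs), file.out = out_path)

print(lengths_bp)
print(gc_frac)
```

The same pattern should carry over to real data by pointing `fasta_path` at an existing file; `count()` also accepts other `wordsize` values for longer k-mers.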