Question: Which Are The Typical Papers On Hadoop Applications In Genome Analysis?
8.0 years ago by
Fayue1015 • 200 wrote:

I am quite interested in this area, since Hadoop is becoming more and more popular. I believe there are some excellent papers in this domain; can you recommend some? Thanks.

genome analysis • 1.7k views

Have you done your own literature search yet?

written 8.0 years ago by Neilfws • 49k

Yes. Here is the paragraph I have written so far; do you have anything to add?

Considering the steadily dropping cost of sequencing technologies, it is anticipated that by mid-2013 we will enter an era of sequencing one genome at a cost of $1,000 or below. At that time, we will need to analyze and interpret whole-genome data for personalized medicine. Currently, many preparations for genome analysis using big data technologies are under way. Hadoop-BAM [Niemenmaa et al., 2012], specifically designed for sequence alignment of NGS data, provides a library for directly manipulating aligned NGS data stored in BAM (Binary Alignment Map) files. Eoulsan [Jourdren et al., 2012] provides a cloud computation framework covering the analysis of high-throughput sequence data, from upstream quality control to downstream differential expression detection. CEO [Wang et al., 2010b] and eCEO [Wang et al., 2011] focus mainly on dividing the exponential number of test combinations into distributed computing tasks in the cloud. Wang et al. [2012] further extend this work by providing a general framework for combinatorial data analysis.

written 8.0 years ago by Fayue1015 • 200

Just checking because sometimes when people write "I believe there are some excellent papers", it suggests that they have not bothered to look at them :) It's good to indicate that you have done some research when asking for recommendations.

written 8.0 years ago by Neilfws • 49k
8.0 years ago by
Sukhdeep Singh • 10k wrote:

Have you seen Crossbow, mentioned in the post "Analyzing Human Genomes with Hadoop"?

Crossbow is a scalable software pipeline for whole genome resequencing analysis. It combines Bowtie, an ultrafast and memory-efficient short read aligner, and SOAPsnp, an accurate genotyper. These tools are combined in an automatic, parallel pipeline that runs in the cloud (Elastic MapReduce in this case), on a local Hadoop cluster, or on a single computer, exploiting multiple computers and CPUs wherever possible. The pipeline can analyze over 35x coverage of a human genome in one day on a 10-node local cluster, or in 3 hours for about $85 using a 40-node, 320-core cluster rented from Amazon Web Services.
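To put the quoted cloud figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers from the description above (about $85, 3 hours, 40 nodes, 320 cores); the variable names are made up for illustration, and because the cost and runtime are approximate, the derived rates are estimates rather than actual Amazon pricing.

```python
# Figures quoted in the Crossbow description above (cost and time are approximate).
total_cost_usd = 85.0
wall_clock_hours = 3.0
nodes = 40
cores = 320

# Derived quantities; names are illustrative, not part of Crossbow.
cores_per_node = cores // nodes                   # cores on each rented node
node_hours = nodes * wall_clock_hours             # total node time consumed
core_hours = cores * wall_clock_hours             # total core time consumed
cost_per_node_hour = total_cost_usd / node_hours  # implied price per node-hour
cost_per_core_hour = total_cost_usd / core_hours  # implied price per core-hour

print(f"cores per node:     {cores_per_node}")
print(f"node-hours used:    {node_hours:.0f}")
print(f"core-hours used:    {core_hours:.0f}")
print(f"cost per node-hour: ${cost_per_node_hour:.2f}")
print(f"cost per core-hour: ${cost_per_core_hour:.3f}")
```

Under these assumptions the run works out to 8 cores per node and roughly $0.71 per node-hour.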


Thanks, very useful.

written 8.0 years ago by Fayue1015 • 200