Question: What are the various procedures involved in DNA mapping?
2.6 years ago by
plabanbiswas9610 wrote:

Hello,

I am new to the field of bioinformatics. I have selected DNA mapping using Hadoop as my project, and I will be using Java for it (with Maven for the build). I need to know the steps involved in DNA mapping (which file formats are required as input, how to map the reads to a reference genome, and what the output file is) and which libraries are available for my project (I have come across samtools and hadoop-bam).

I have come across htsjdk-SAM, FASTQ, BAM, etc., but I can't figure out which to use as input for mapping or how to do the mapping.

Thank You,

sam fastq sequencing java genome • 756 views
modified 2.6 years ago • written 2.6 years ago by plabanbiswas9610

I would begin by reading this: "How to map billions of short reads onto genomes". Then I would look up how to use a specific aligner. I recommend bwa-mem. Here is an overview of the bwa commands. The input to an aligner is a FASTQ file of reads, and the output is a SAM/BAM file.
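Since the project is in Java, here is a minimal sketch of what the aligner's input looks like from the program's side. It reads simple four-line FASTQ records (header starting with `@`, bases, a `+` separator line, and a quality string of the same length as the bases). The class `FastqSketch` and its `Read` type are hypothetical names made up for illustration; they are not part of htsjdk or hadoop-bam, and the sketch assumes sequences are not wrapped across several lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not an htsjdk or hadoop-bam class.
final class FastqSketch {

    /** One sequencing read: identifier, bases, and per-base quality string. */
    static final class Read {
        final String id;
        final String bases;
        final String qualities;

        Read(String id, String bases, String qualities) {
            this.id = id;
            this.bases = bases;
            this.qualities = qualities;
        }
    }

    /** Parses four-line FASTQ records; assumes unwrapped sequence lines. */
    static List<Read> parse(List<String> lines) {
        List<Read> reads = new ArrayList<>();
        int i = 0;
        while (i < lines.size()) {
            if (lines.get(i).isEmpty()) { // tolerate blank lines between records
                i++;
                continue;
            }
            if (i + 3 >= lines.size()) {
                throw new IllegalArgumentException("Truncated FASTQ record at line " + (i + 1));
            }
            String header = lines.get(i);
            String bases = lines.get(i + 1);
            String separator = lines.get(i + 2);
            String qualities = lines.get(i + 3);
            if (!header.startsWith("@") || !separator.startsWith("+")
                    || bases.length() != qualities.length()) {
                throw new IllegalArgumentException("Malformed FASTQ record at line " + (i + 1));
            }
            reads.add(new Read(header.substring(1), bases, qualities));
            i += 4;
        }
        return reads;
    }

    public static void main(String[] args) {
        List<String> example = Arrays.asList("@read1", "ACGT", "+", "IIII");
        for (Read r : parse(example)) {
            System.out.println(r.id + ": " + r.bases.length() + " bases");
        }
    }
}
```

For real data you would more likely rely on an existing reader (htsjdk and hadoop-bam both deal with these formats) than on a hand-written parser; the sketch is only meant to show the record structure an aligner consumes.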

written 2.6 years ago by IP590

So, as far as I understand, the DNA sequence is stored in FASTQ format, it is then mapped against a reference genome, which produces a SAM file, and the SAM file contains the coordinates of the mapped DNA. Right?

written 2.6 years ago by plabanbiswas9610

Exactly. The SAM format also carries a lot of other information about the alignment, such as the mapping quality.
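To make the coordinates and mapping quality concrete, here is a small Java sketch that pulls them out of one SAM alignment line. In the SAM format each alignment line has eleven mandatory tab-separated fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL); POS is 1-based and flag bit 0x4 marks an unmapped read. The class `SamLineSketch` is a hypothetical name for illustration, not an htsjdk API, and the caller is assumed to skip header lines (those starting with `@`).

```java
// Hypothetical helper for illustration only; htsjdk provides a full SAM/BAM reader.
final class SamLineSketch {

    /** The subset of SAM fields that describe where a read mapped. */
    static final class Alignment {
        final String readName;     // QNAME
        final int flag;            // FLAG (bitwise)
        final String reference;    // RNAME, e.g. a chromosome name
        final int position;        // POS, 1-based leftmost mapping position
        final int mappingQuality;  // MAPQ
        final String cigar;        // CIGAR string

        Alignment(String readName, int flag, String reference,
                  int position, int mappingQuality, String cigar) {
            this.readName = readName;
            this.flag = flag;
            this.reference = reference;
            this.position = position;
            this.mappingQuality = mappingQuality;
            this.cigar = cigar;
        }

        /** Flag bit 0x4 set means the read is unmapped. */
        boolean isMapped() {
            return (flag & 0x4) == 0;
        }
    }

    /** Parses one alignment line; header lines must be filtered out beforehand. */
    static Alignment parse(String line) {
        String[] f = line.split("\t");
        if (f.length < 11) {
            throw new IllegalArgumentException("Expected at least 11 SAM fields, got " + f.length);
        }
        return new Alignment(f[0], Integer.parseInt(f[1]), f[2],
                Integer.parseInt(f[3]), Integer.parseInt(f[4]), f[5]);
    }

    public static void main(String[] args) {
        // Made-up example record, not output from a real aligner run.
        String line = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII";
        Alignment a = parse(line);
        System.out.println(a.readName + " -> " + a.reference + ":" + a.position
                + " (MAPQ " + a.mappingQuality + ", mapped=" + a.isMapped() + ")");
    }
}
```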

written 2.6 years ago by IP590

Thanks for the information and swift reply :-)

written 2.6 years ago by plabanbiswas9610