Question: Best Way To Map SRA Data To 100 kb DNA Fragment
Asked 10.1 years ago by Alex (Theodosius Dobzhansky Center for Genome Bioinformatics):

I have several 100 kb genome regions and several publicly available SRA experiments. I want to check whether any of the SRA reads can be mapped to the given genome regions.

What is the best way to do it?

Answered 10.1 years ago by Bio_X2Y (Ireland):

I would say there is no "best way" to do this - this stuff is all still relatively new, so tools/techniques are still evolving.

At a high level, you need to use a short read aligner to map your reads (in your case, from SRA) to a reference (which can contain multiple sequences - in this case, your 100kb genome regions).
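A minimal sketch of that workflow, assuming Bowtie 2 as the aligner (chosen only for illustration; BWA or another aligner would follow the same pattern with its own commands). The file names `regions.fa`, `regions_idx`, `reads.fastq` and `hits.sam` are hypothetical placeholders. The helper functions only assemble the command lines; nothing is executed here.

```python
import shlex


def build_index_cmd(reference_fasta, index_prefix):
    """Command that indexes the reference regions (FASTA) with bowtie2-build."""
    return ["bowtie2-build", reference_fasta, index_prefix]


def align_cmd(index_prefix, fastq_files, sam_out):
    """Command that aligns single-end (1 file) or paired-end (2 files) reads.

    --no-unal keeps reads that failed to align out of the SAM output,
    which keeps the file small when only a few reads hit the regions.
    """
    cmd = ["bowtie2", "--no-unal", "-x", index_prefix]
    if len(fastq_files) == 1:
        cmd += ["-U", fastq_files[0]]
    elif len(fastq_files) == 2:
        cmd += ["-1", fastq_files[0], "-2", fastq_files[1]]
    else:
        raise ValueError("expected one or two FASTQ files")
    return cmd + ["-S", sam_out]


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    for cmd in (
        build_index_cmd("regions.fa", "regions_idx"),
        align_cmd("regions_idx", ["reads.fastq"], "hits.sam"),
    ):
        print(shlex.join(cmd))
```

To actually run a step, each list can be passed to `subprocess.run(cmd, check=True)`, provided the aligner is installed and the input files exist.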

In more detail:

  • Choose an aligner that suits your needs. Many exist, each with pros/cons. Bowtie and BWA are popular examples. Ideally, you should become familiar with your reads before choosing an aligner, since different aligners are optimised for different things. For example, you should know the sequencing platform used to generate your reads (e.g. Illumina), the lengths of your reads (e.g. small 40bp, or large 200bp), and whether your reads are single-end or paired-end.

  • Build your reference into some kind of index that your chosen aligner can support. Typically, each aligner will also come with a tool for doing this. These tools generally expect your reference sequences (in this case, your genome regions) to be in FASTA format. Read the documentation of your chosen aligner for more details.

  • Run your aligner on the SRA data, using the index you just built. As far as I am aware, SRA provides mechanisms for you to acquire data in a number of different formats. The FASTQ format is probably the most widely supported format for aligners. Aligners tend to have a lot of parameters - again, you should read the documentation to help you understand them.

  • The output of the aligner depends on the aligner you choose. Many aligners output in the SAM format these days. Many tools exist for exploring the output (google might help here), or you can become familiar with the format from the documentation, and explore the results yourself with command line tools / scripts.
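As one way to explore SAM output with a small script, the sketch below counts mapped alignment records per reference sequence, which answers the original question of whether any reads hit each region. It assumes plain-text SAM (not BAM); the region names and the records in `example_sam` are made up for illustration.

```python
from collections import Counter


def count_mapped_reads(sam_lines):
    """Count mapped alignment records per reference sequence in SAM text.

    Note: this counts records, so a read with secondary or supplementary
    alignments contributes more than once.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        rname = fields[2]       # column 3: reference sequence name
        if flag & 0x4 or rname == "*":
            continue  # 0x4 means the read is unmapped
        counts[rname] += 1
    return counts


# Made-up example records (tab-separated SAM columns).
example_sam = [
    "@SQ\tSN:region1\tLN:100000",
    "@SQ\tSN:region2\tLN:100000",
    "read1\t0\tregion1\t1200\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tregion1\t5300\t42\t4M\t*\t0\t0\tTTGA\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
    "read4\t0\tregion2\t88\t30\t4M\t*\t0\t0\tCATG\tIIII",
]

if __name__ == "__main__":
    print(dict(count_mapped_reads(example_sam)))
    # prints {'region1': 2, 'region2': 1}
```

For a real file, the function can be fed an open file handle, e.g. `count_mapped_reads(open("hits.sam"))`, where `hits.sam` is a hypothetical output name.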

Good luck!

Answered 10.1 years ago by Daniel Standage (Davis, California, USA):

Pierre asked a question a few days ago about using short read aligners for SNP calling. Even if you don't want to do SNP calling, it provides a good overview of what's out there (check out the wikipedia link).
