Question

Best Way To Map Sra Data To 100Kb Dna Fragment

13.4 years ago

Alex ★ 1.5k

I have several 100kb genome regions and several publicly available SRA experiments. I want to check whether any of the SRA reads can be mapped to the given genome regions.

What is the best way to do it?

short alignment mapping • 2.8k views

updated 13.4 years ago by Daniel Standage 4.1k • written 13.4 years ago by Alex ★ 1.5k

score 2 · Answer 1 · 2010-11-10

I would say there is no "best way" to do this - this stuff is all still relatively new, so tools/techniques are still evolving.

At a high level, you need to use a short read aligner to map your reads (in your case, from SRA) to a reference (which can contain multiple sequences - in this case, your 100kb genome regions).

In more detail:

1. Choose an aligner that suits your needs. Many exist, each with pros and cons. Bowtie and BWA are popular examples. Ideally, you should become familiar with your reads before choosing an aligner, since different aligners are optimised for different things. For example, you should know the sequencing platform used to generate your reads (e.g. Illumina), the lengths of your reads (e.g. short, around 40 bp, or longer, around 200 bp), and whether your reads are single-end or paired-end.
2. Build your reference into an index that your chosen aligner supports. Typically, each aligner comes with its own tool for doing this. These tools generally expect your reference sequences (in this case, your genome regions) to be in FASTA format. Read the documentation of your chosen aligner for more details.
3. Run your aligner on the SRA data, using the index you just built. As far as I am aware, SRA provides mechanisms for you to acquire data in a number of different formats. FASTQ is probably the format most widely supported by aligners. Aligners tend to have a lot of parameters - again, you should read the documentation to help you understand them.
4. Inspect the output, which depends on the aligner you choose. Many aligners output SAM format these days. Many tools exist for exploring the output (Google might help here), or you can become familiar with the format from the documentation and explore the results yourself with command line tools / scripts.
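As a rough illustration of the steps above, here is a minimal Python sketch. It assumes BWA as the aligner (one of the two examples mentioned, not a recommendation) and uses `bwa index` and `bwa mem`, which are subcommands of current BWA releases. The function names and the file names (`regions.fa`, `reads_1.fastq`, ...) are hypothetical, and the reads are assumed to have already been downloaded from SRA as FASTQ. The first function only builds the command lines; the second counts mapped reads per region from SAM text, using the SAM FLAG field to skip unmapped, secondary, and supplementary records.

```python
from collections import Counter

# SAM FLAG bits (from the SAM specification)
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def build_bwa_commands(reference_fasta, fastq_files):
    """Return the BWA index and alignment commands as argument lists.

    Nothing is executed here. File names are supplied by the caller
    (hypothetical in this sketch).
    """
    if not 1 <= len(fastq_files) <= 2:
        raise ValueError("expected one FASTQ (single-end) or two (paired-end)")
    index_cmd = ["bwa", "index", reference_fasta]
    # bwa mem writes SAM to standard output; redirect it to a file when running
    align_cmd = ["bwa", "mem", reference_fasta, *fastq_files]
    return [index_cmd, align_cmd]


def count_mapped_reads(sam_lines, min_mapq=0):
    """Count mapped reads per reference sequence (RNAME) in SAM text.

    Header lines, unmapped reads, and secondary/supplementary records
    are skipped, as are records with MAPQ below min_mapq.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: FLAG
        rname = fields[2]       # column 3: reference sequence name
        mapq = int(fields[4])   # column 5: mapping quality
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        if mapq < min_mapq:
            continue
        counts[rname] += 1
    return counts
```

To actually run the commands, each list can be passed to `subprocess.run(cmd, check=True)`, with the standard output of the alignment step redirected to a `.sam` file; that file can then be opened and handed to `count_mapped_reads`. For paired-end data each mate is counted as a separate read.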

Good luck!

Ram · Answer 2 · 2010-11-10

13.4 years ago

Daniel Standage 4.1k

Pierre asked a question a few days ago about using short read aligners for SNP calling. Even if you don't want to do SNP calling, that thread provides a good overview of what's out there (check out the Wikipedia link).

updated 4.7 years ago by Ram 43k • written 13.4 years ago by Daniel Standage 4.1k