Efficient read mapping and alignment to reference set
4.0 years ago

Hi! I have an alignment/mapping question about the following dataset. I have ~1 million distinct reference sequences averaging 110 nucleotides in length, and a sequencing run of 50 million reads, where each read completely covers a reference sequence (the theoretical Illumina read length is longer than the longest possible reference). I'd like to assign every sequencing read to a single reference sequence. Of course, there may be sequencing errors, or experimental errors from steps before library prep/sequencing, so a given read might contain mismatches and indels, and might match multiple references.

I was thinking of building a bwa index with the 1 million references as artificial chromosomes, running bwa, and processing the resulting alignments. Is there anything better, faster, or easier to parse?
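
To make the "process the resulting alignments" step concrete, here is a minimal sketch of what I have in mind, assuming single-end reads with unique names and SAM output from the aligner. The function name `assign_reads` and the MAPQ cutoff of 20 are my own placeholders, not anything prescribed by bwa. It relies only on the standard SAM columns and flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary), and on the usual convention that aligners report a low MAPQ for reads that fit several references about equally well.

```python
from collections import Counter

MIN_MAPQ = 20  # assumed cutoff for a "unique" assignment; tune for your data


def assign_reads(sam_lines, min_mapq=MIN_MAPQ):
    """Return {read_name: reference_name} for confidently placed reads.

    Reads that are unmapped, or whose primary alignment has MAPQ below
    min_mapq (typically ambiguous between references), are left out.
    """
    assignments = {}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        qname, flag, rname, mapq = fields[0], int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4:
            continue  # read is unmapped
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) alignment
        if mapq < min_mapq:
            continue  # placement too uncertain to call unique
        assignments[qname] = rname
    return assignments


if __name__ == "__main__":
    # Tiny made-up records, purely for illustration.
    example = [
        "@SQ\tSN:ref1\tLN:110",
        "r1\t0\tref1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t0\tref2\t1\t0\t4M\t*\t0\t0\tACGT\tIIII",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    assigned = assign_reads(example)
    print(assigned)
    print(Counter(assigned.values()))  # reads per reference
```

With 50 million reads, this would be fed line by line from the aligner's output stream (for example `sys.stdin`) rather than from a list, so that only the assignments are held in memory.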

This and this question are somewhat similar, but not really the same.

alignment sequencing • 438 views