12.1 years ago
Jeremy Leipzig
I'm looking for some implementations of the following pattern (in any language):
- One Sanger-sized query sequence is pairwise aligned against a short (<100 kbp) reference that has gene calls in BED/GFF format.
- The alignment is parsed: substitutions and indels in the non-genic regions are reported as is.
- Mutations in the genic regions are classified as synonymous or non-synonymous substitutions, or as frameshifts.
I imagine this would be a fairly common script, so I'd like to see some existing examples and approaches before reinventing the wheel. "Use BioPerl" is not an acceptable answer.
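To make the pattern concrete, here is a rough Python sketch of the parsing and classification steps. It is only an illustration under several assumptions, not an existing tool: the pairwise alignment is already computed and passed in as two equal-length gapped strings trimmed to the aligned region, genes are given as hypothetical `(name, start, end)` tuples in 0-based half-open reference coordinates (BED-style), and every gene is on the forward strand, has no introns, and uses the standard genetic code. The function names `call_variants` and `annotate` are made up for this example.

```python
# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def call_variants(ref_aln, qry_aln, ref_start=0):
    """Walk a gapped pairwise alignment and list (kind, ref_pos, ref, alt).

    ref_start is the 0-based reference coordinate of the first aligned base.
    Runs of adjacent gap columns are merged into a single indel.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned strings must have equal length")
    variants, pos, i, n = [], ref_start, 0, len(ref_aln)
    while i < n:
        if ref_aln[i] == "-":  # gap in reference: insertion in query
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            variants.append(("ins", pos, "", qry_aln[i:j]))
            i = j
        elif qry_aln[i] == "-":  # gap in query: deletion from reference
            j = i
            while j < n and qry_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            variants.append(("del", pos, ref_aln[i:j], ""))
            pos += j - i
            i = j
        else:
            if ref_aln[i].upper() != qry_aln[i].upper():
                variants.append(("sub", pos, ref_aln[i], qry_aln[i]))
            pos += 1
            i += 1
    return variants


def annotate(variants, ref_seq, genes):
    """Attach an effect string to each variant.

    ref_seq is the ungapped reference; genes is a list of
    (name, start, end) tuples (assumed forward strand, no introns).
    """
    annotated = []
    for kind, pos, ref, alt in variants:
        gene = next((g for g in genes if g[1] <= pos < g[2]), None)
        if gene is None:
            # Non-genic variants are reported as is.
            annotated.append((kind, pos, ref, alt, "non-genic"))
            continue
        name, start, _end = gene
        if kind == "sub":
            codon_start = pos - (pos - start) % 3
            codon = ref_seq[codon_start:codon_start + 3].upper()
            k = pos - codon_start
            mutated = codon[:k] + alt.upper() + codon[k + 1:]
            ref_aa = CODON_TABLE.get(codon, "X")
            alt_aa = CODON_TABLE.get(mutated, "X")
            if ref_aa == alt_aa:
                effect = "synonymous"
            else:
                effect = f"non-synonymous {ref_aa}->{alt_aa}"
        else:
            # Indel length not divisible by 3 shifts the reading frame.
            length = len(ref) or len(alt)
            effect = "frameshift" if length % 3 else "in-frame indel"
        annotated.append((kind, pos, ref, alt, f"{name}: {effect}"))
    return annotated
```

Typical use would be `annotate(call_variants(ref_aln, qry_aln), ref_aln.replace("-", ""), genes)`. The sketch deliberately ignores several real-world cases: reverse-strand genes, spliced genes, indels that straddle a gene boundary (only the start position is checked), and multiple substitutions within one codon (each is evaluated against the reference codon independently).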