Sequence alignment using degenerate nucleotide codes in the reference sequence
6.5 years ago
ycding • 0

We sequenced the human MHC (HLA) region on chromosome 6 for a few hundred human samples. The MHC region shows extremely high sequence diversity: a single 100 bp read can contain more than 10 SNPs, so regular alignment tools such as BWA cannot handle it because there are too many mismatches per read. However, the MHC region has been well characterized by traditional Sanger sequencing, so the majority of SNPs, deletions, and insertions are already known. Therefore, if an alignment tool could use degenerate nucleotide codes in the reference sequence [that is, R (A or G), Y (C or T), K (G or T), M (A or C), and N (A, T, C, or G) in addition to plain A, T, C, and G], reads from the MHC region could be aligned well by allowing just two additional mismatches. Do you know of an alignment tool that accepts degenerate nucleotide codes in the reference sequence?

Thank you, Ding
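
To make the idea in the question concrete, here is a minimal Python sketch of degenerate-aware matching: a read base counts as a match whenever it is one of the bases allowed by the IUPAC code at that reference position. The function names `count_mismatches` and `find_hits` are made up for illustration, the search is a naive ungapped scan, and nothing here reflects how any real aligner works internally.

```python
# IUPAC nucleotide codes mapped to the set of bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(read, ref_window):
    """Count positions where the read base is not allowed by the reference code."""
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have equal length")
    return sum(
        1
        for base, code in zip(read.upper(), ref_window.upper())
        if base not in IUPAC.get(code, "")
    )


def find_hits(read, reference, max_mismatches=2):
    """Return (offset, mismatches) for every ungapped placement within the limit.

    Hypothetical helper: a brute-force scan, only meant to show the scoring rule.
    """
    hits = []
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = count_mismatches(read, window)
        if mismatches <= max_mismatches:
            hits.append((offset, mismatches))
    return hits


if __name__ == "__main__":
    # R and Y in the reference absorb the A and C in the read.
    print(find_hits("GATC", "ACGRTYAAN", max_mismatches=0))
```

With this scoring rule, known polymorphic sites no longer consume the mismatch budget, so the budget is spent only on novel variants and sequencing errors.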

alignment • 1.5k views
6.5 years ago

You can just use HISAT2. It won't accept degenerate bases in the reference, but you can add the known variants to its graph index instead.
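
One way to get from a degenerate reference to something a graph index can use is to split it into a plain A/C/G/T sequence plus a list of single-base variants. The Python sketch below does that under several assumptions: the helper names `expand_degenerate` and `format_snp_lines` are hypothetical, the first allele of each code is arbitrarily chosen as the reference base, and the tab-separated layout (ID, type, chromosome, 0-based position, alternative base) follows my understanding of the SNP file that `hisat2-build --snp` reads, so please check it against the HISAT2 manual before relying on it. Insertions and deletions are not handled here.

```python
# Degenerate IUPAC codes mapped to the bases they allow; plain bases map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(chrom, seq):
    """Split a degenerate reference into a plain sequence plus SNP records.

    Assumption: the first allele listed for a code becomes the reference base
    and every remaining allele becomes one single-base variant. N is expanded
    like any other code, which is only sensible if N marks a known
    four-allele site rather than an unsequenced gap.
    """
    plain = []
    snps = []
    for pos, code in enumerate(seq.upper()):
        alleles = IUPAC.get(code, code)  # unknown symbols pass through unchanged
        plain.append(alleles[0])
        for alt in alleles[1:]:
            snp_id = f"{chrom}_{pos}_{alt}"  # made-up ID scheme
            snps.append((snp_id, "single", chrom, pos, alt))
    return "".join(plain), snps


def format_snp_lines(snps):
    """Render SNP records as tab-separated lines (assumed HISAT2 SNP layout)."""
    return ["\t".join(str(field) for field in record) for record in snps]


if __name__ == "__main__":
    plain_seq, snp_records = expand_degenerate("chr6", "ARCY")
    print(plain_seq)
    print("\n".join(format_snp_lines(snp_records)))
```

The plain sequence would be written out as FASTA and the variant lines saved to a file passed to the index builder. For a real MHC reference, starting from a VCF of known HLA variants is probably more practical than hand-encoding degenerate bases.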
