Question: Trimming/masking ambiguous codon positions from nucleotide sequences
6 months ago, john.soghigian wrote:

I have a large number of FASTA files, each containing a single nucleotide sequence. All of the sequences are in frame (though not all of them contain start codons), and some contain ambiguous characters (Ns) where the identity of a base is unknown because of either poor quality or absent sequence information.

At times, a particular codon may only have a single base that was properly sequenced, such as below, a short example sequence from one such file: ...GTGCTGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAG...

I would like to trim the Ns out of all of these sequences in a "codon-sensitive" way, so that the trimming either leaves the partial codon as CNN or, ideally, removes the C along with the Ns. Removing the Ns alone is trivial for me, but I am not sure how to do it while respecting codon boundaries. If it is helpful, I already have the corresponding amino acid sequence.
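
To make the desired behaviour concrete, here is a minimal Python sketch of one possible approach, not a tested pipeline. It assumes each sequence starts in frame at its first base and that N is the only ambiguity character; the function name `trim_ambiguous_codons` and its `mask` option are made up for illustration. It walks the sequence three bases at a time and drops any codon containing an N, or replaces it with NNN when masking is requested.

```python
def trim_ambiguous_codons(seq, ambiguous="N", mask=False):
    """Drop (or mask) every codon that contains an ambiguous base.

    Assumption: seq is in frame, i.e. codon 1 starts at index 0.
    The sequence is uppercased before processing, so output is uppercase.
    A trailing partial codon (length 1 or 2) is kept if it has no
    ambiguous base, and dropped or masked otherwise.
    """
    seq = seq.upper()
    kept = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if any(base in ambiguous for base in codon):
            if mask:
                # Keep the alignment length by masking the whole codon.
                kept.append("N" * len(codon))
            # Otherwise the codon is dropped entirely.
        else:
            kept.append(codon)
    return "".join(kept)


# Shortened, hypothetical version of the example above:
# codons are GTG CTG CNN NNN AAG
example = "GTGCTGCNNNNNAAG"
print(trim_ambiguous_codons(example))             # GTGCTGAAG
print(trim_ambiguous_codons(example, mask=True))  # GTGCTGNNNNNNAAG
```

With `mask=False` the partially sequenced codon CNN is removed together with the Ns (the "ideal" case); with `mask=True` the sequence length is preserved, which may be preferable if the sequences are already aligned.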
