Question: Reverse Complement Of A Sequence Raises AssertionError: Invalid Alphabet Found
Jelena_bioinf (Zurich, Switzerland) wrote 7.4 years ago:

I read a fasta-formatted genome sequence and try to get its reverse complement:

genomeSeq = FastaIO.FastaIterator(genomeHandle, IUPACUnambiguousDNA).next()
genomeSeq.seq.reverse_complement()

but it doesn't work and I can't understand why:

File "/Users/charodeika/Dropbox/genesGelfand/scripts/genome/src/matrixCount/sigma.py", line 127, in <module>
    print genomeSeq.seq[:10].reverse_complement()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/Seq.py", line 804, in reverse_complement
    return self.complement()[::-1]
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/Seq.py", line 752, in complement
    base = Alphabet._get_base_alphabet(self.alphabet)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/Alphabet/__init__.py", line 213, in _get_base_alphabet
    "Invalid alphabet found, %s" % repr(a)
AssertionError: Invalid alphabet found, <class 'Bio.Alphabet.IUPAC.IUPACUnambiguousDNA'>

And, of course, I would like to find out how to make it work!

biopython • 2.5k views
modified 6.3 years ago • written 7.4 years ago by Jelena_bioinf
Niek De Klein (Netherlands) wrote 7.4 years ago:

I think the problem is that genome sequences contain N's at positions where the nucleotide is uncertain, whereas IUPACUnambiguousDNA only accepts A, C, T, and G. So you would want to change your script to ignore the N's (or any other non-DNA letters in the sequence).

modified 7.4 years ago • written 7.4 years ago by Niek De Klein

No, certainly not. It also fails for slices that contain no letters other than A, T, G, and C.

Reply written 7.4 years ago by Jelena_bioinf

Agreed - the alphabet letters are not relevant to this error.

Reply written 7.4 years ago by Peter
Jelena_bioinf (Zurich, Switzerland) wrote 7.4 years ago:

Well, when I added

from Bio.Alphabet import IUPAC

and changed

genomeSeq = FastaIO.FastaIterator(genomeHandle, IUPACUnambiguousDNA).next()

to

genomeSeq = FastaIO.FastaIterator(genomeHandle, IUPAC.unambiguous_dna).next()

it started working nicely. However, I would still be happy to learn why.

written 7.4 years ago by Jelena_bioinf

Without seeing the complete code (the imports) I can't be 100% sure, but I believe your error is passing an alphabet class rather than an instance of the class (an alphabet object).

Reply written 7.4 years ago by Peter
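
The class-versus-instance distinction described in the reply above can be illustrated without Biopython installed. The sketch below uses hypothetical stand-ins (`Alphabet`, `UnambiguousDNA`, `unambiguous_dna`, `get_base_alphabet`) that only mimic the shape of the check visible in the traceback. It is an assumption about how that check behaves, not Biopython's actual code.

```python
# Hypothetical stand-ins; these names are illustrative, not Biopython's API.
class Alphabet(object):
    """Minimal base class for an alphabet."""
    letters = None


class UnambiguousDNA(Alphabet):
    """Stand-in for an alphabet class such as IUPACUnambiguousDNA."""
    letters = "GATC"


# A ready-made instance, analogous to IUPAC.unambiguous_dna.
unambiguous_dna = UnambiguousDNA()


def get_base_alphabet(alphabet):
    """Return the alphabet if it is an Alphabet instance, otherwise fail.

    Assumed to resemble the assertion seen in the traceback.
    """
    assert isinstance(alphabet, Alphabet), \
        "Invalid alphabet found, %s" % repr(alphabet)
    return alphabet


# Passing the instance succeeds.
get_base_alphabet(unambiguous_dna)

# Passing the class itself fails: a class is not an instance of itself.
try:
    get_base_alphabet(UnambiguousDNA)
except AssertionError as err:
    print("AssertionError: %s" % err)
```

In the original snippet, `IUPACUnambiguousDNA` is the class (the `<class '...'>` in the error message shows this), while `IUPAC.unambiguous_dna` is an instance of it, which would explain why the second version works. Note that `Bio.Alphabet` was removed in Biopython 1.78, so this applies only to older releases.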