Remove sequences containing stop codons from multi-FASTA (DNA) files
2.3 years ago
MSRS ▴ 550

My input multiple sequence alignments contain stop codons (TAG, TGA, TAA) in the middle of a few sequences. I want to remove every sequence that contains a stop codon in the middle, while ignoring the final stop codon. Is there a Biopython script or another program that can do this? (I am an Ubuntu user.)

My input

>A
ATGGCAGCAGATTCCAACTAA
>B
ATGGCTAAAGATTCCAACTAA
>C
ATGGCATAAGATTCCAACTAA

Expected output

>A
ATGGCAGCAGATTCCAACTAA

Thanks in advance.

alignment • 1.1k views
2.3 years ago
Hugo ▴ 360

Dear Shaminur, you can use SEDA (https://www.sing-group.org/seda/). To do what you want, use the Filtering operation (https://www.sing-group.org/seda/manual/operations.html#id5) and check the option 'Remove sequences with in-frame stop codons'.
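If you would rather script it, here is a minimal sketch in plain Python. It is not how SEDA does it internally. It assumes ungapped DNA sequences that start in reading frame 0, and the names `read_fasta`, `has_internal_stop`, `filter_fasta` and the `in_frame` switch are made up for this illustration. Note that in your example, sequence B only contains TAA out of frame (GC-TAA-A), so an in-frame filter keeps both A and B; to get only A, as in your expected output, the check has to look at every position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Example input copied from the question.
EXAMPLE = """>A
ATGGCAGCAGATTCCAACTAA
>B
ATGGCTAAAGATTCCAACTAA
>C
ATGGCATAAGATTCCAACTAA
"""


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def has_internal_stop(seq, in_frame=True):
    """Return True if a stop codon occurs before the last three bases.

    in_frame=True  checks only the codons of reading frame 0.
    in_frame=False checks a triplet starting at every position.
    """
    body = seq.upper()[:-3]  # ignore the final codon
    step = 3 if in_frame else 1
    return any(
        body[i:i + 3] in STOP_CODONS
        for i in range(0, len(body) - 2, step)
    )


def filter_fasta(text, in_frame=True):
    """Return FASTA text with only the sequences lacking an internal stop."""
    kept = [
        (header, seq)
        for header, seq in read_fasta(text)
        if not has_internal_stop(seq, in_frame)
    ]
    return "".join(f">{header}\n{seq}\n" for header, seq in kept)


if __name__ == "__main__":
    # Any-frame check: matches the expected output in the question.
    print(filter_fasta(EXAMPLE, in_frame=False), end="")
```

For a real file, you could read it with `open(path).read()` and pass the text to `filter_fasta`; Biopython's `SeqIO.parse` would be a reasonable replacement for the hand-written parser.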

If you have any questions, do not hesitate to ask me.

Regards.

Thank you so much, this works perfectly 🙂

Please accept the answer if it solved your problem.
