Hi everyone. I have a multi-FASTA file that contains roughly 150k sequences. The first sequence is my reference, and the rest are sequences that carry some mutations.
I'm working with single-point mutations, and I plan to find them by comparing each sequence with the reference position by position. So I need to remove the sequences that have deletions, because the shifted positions would look like single-point mutations. Does anyone have an idea how to do that?
The deletion mutations are like this:
There is a deletion in the selected area, so the whole sequence after it is shifted. There are many mutations like that, and I have no idea how to detect and remove them.
I don't know how to align that many sequences, but if I could, that would solve my problem too.
Thanks in advance for your opinions and help.
Hello, this could be done using the Python pandas library, as data engineers usually do in the preprocessing stage of a machine-learning project. pandas is part of the Python data ecosystem and is one of the most widely used tools for data manipulation today. Feel free to set up your Python development environment and follow some of the many pandas tutorials available online. I hope this helps!
There are already utilities in the Python ecosystem for working with FASTA files, such as pyfaidx. As much as I do like pandas, I'm not following the need for it here. Question for Mustafa: are the ones with deletions shorter, and the others the same length as your reference, so you could filter on sequence length?
All of them are the same length. I think there are also some insertions in the ones with deletions. This is where my problem starts :') Wayne
So the one on the 6th line in your image ends up being the same length and wouldn't get eliminated by the length filter. But it has more differences than a single-point mutation, so why not follow the length filter with a second filtering step that collects the counts of every amino acid and eliminates the sequence if one or more of the counts differs from the reference by more than one (sketched below)? Alternatively, you don't need to align that many sequences with a multiple sequence alignment (MSA), as you seem to suggest by your statement "I don't know how to align that many sequences, but if I could, that would solve my problem too." You just do a pairwise alignment to the reference 150K times and toss out both (a) ones where the aligned length isn't the same and (b) ones where more than one amino acid mismatches among those with the same aligned length. Biopython has pairwise alignment built in.
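For the count idea, something like this is what I have in mind (just a sketch; the file name `sequences.fasta` is a placeholder, and I'm assuming the first record in the file is your reference):

```python
from collections import Counter
from Bio import SeqIO

# Placeholder file name; the first record is the reference, the rest are test sequences.
records = list(SeqIO.parse("sequences.fasta", "fasta"))
reference = str(records[0].seq)
others = [str(r.seq) for r in records[1:]]

ref_counts = Counter(reference)

def passes_count_filter(seq):
    """True if no amino-acid count differs from the reference count by more than one."""
    counts = Counter(seq)
    return all(abs(counts[aa] - ref_counts[aa]) <= 1
               for aa in set(counts) | set(ref_counts))

# Length filter first, then the count comparison as the second step.
kept = [s for s in others if len(s) == len(reference) and passes_count_filter(s)]
```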
Are you saying that I should compare the total amino acid counts of each sequence with my reference sequence? If so, wouldn't that mean losing the sequences that contain more than one single-point mutation? We can't know how many count differences a sequence would have.
EDIT: Oh... I didn't know that. This could solve my problem. I'm going to try the pairwise alignment. Thank you very much!!
I thought you were looking to limit it to single-point mutations, and that is why I suggested the counting as a second layer. I think assessing the pairwise alignment is better, though, as you could have combinations that keep the count totals within one yet have substitutions at multiple locations.
I tried to understand the logic of Biopython's built-in pairwise alignment. I prepared a test FASTA file to study the code. There are 40 sequences in the file, but I keep getting 64 different alignments with the following code:
What am I doing wrong?
I think it gives you alternative alignments, as there are usually a number of ways to make a viable alignment when gaps are involved. Also, I'm concerned about your code: you are only looking at the alignments for the last pairwise comparison? (Unless you are printing them as you go along and just not including that here?) You'd want to collect them for each pairing. One idea would be to make a list for your 40 sequences; a dictionary will help you keep track easily by assigning each a number corresponding to the 40 sequences. Here is an idea for reworking your final block:
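Something along these lines (only a sketch, since I don't have your exact code; I'm assuming Bio.pairwise2 and a placeholder file name `test.fasta` here, and the variable names are just suggestions):

```python
from Bio import SeqIO, pairwise2

# Parse the test FASTA file; the file name is a placeholder.
records = list(SeqIO.parse("test.fasta", "fasta"))
reference = str(records[0].seq)                      # first record as the reference
test_sequences = [str(r.seq) for r in records[1:]]   # the remaining test sequences

# Collect the alignments for every pairing, keyed by a 1-based sequence number.
alignments_per_sequence = {}
for seq_index, seq in enumerate(test_sequences):
    seq_number = seq_index + 1
    alignments_per_sequence[seq_number] = pairwise2.align.globalxx(reference, seq)
```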
(Note: `seq_number = seq_index + 1` means the first sequence will be numbered one. Because Python is zero-indexed, it would by default be the zero-th one if `seq_index` were used directly.)

Now examine `alignments_per_sequence`. You can look at the results for, say, the third sequence like so: `alignments_per_sequence[3]`.

For the cases where you are getting multiple alignments for a sequence, you'll have to figure out which one to use, or just take the first one. My quick assessment was that you take the top scorers and see if there is any where there are no gaps; if all of them have gaps, then you mark that sequence as one to toss out. It seems those with no gaps would also have the `end` value match the length of the reference sequence. So you have some options on what to do once you have the alignments; you'll have to assess what gets you closest to what you want when deciding to keep or toss each test sequence.

Minor note: looping on the reference sequence may cause you issues later. I'm not seeing why you aren't just taking the first sequence there.
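Something along these lines could work for the keep/toss step, building on the sketch above (again only a sketch; I'm assuming the pairwise2-style alignments, which carry `score`, `seqA`, `seqB`, and `end` fields, plus the `alignments_per_sequence` and `reference` names from that sketch):

```python
def keep_sequence(alignments, reference):
    """Keep a test sequence only if one of its top-scoring alignments
    has no gaps and spans the full reference length."""
    best_score = max(a.score for a in alignments)
    top_scorers = [a for a in alignments if a.score == best_score]
    return any("-" not in a.seqA and "-" not in a.seqB and a.end == len(reference)
               for a in top_scorers)

# Sequence numbers of the ones to keep; everything else gets tossed.
kept_numbers = [number for number, alignments in alignments_per_sequence.items()
                if keep_sequence(alignments, reference)]
```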
Just noticed here that you basically reworked that question and posted it as a new question, which is fine. However, in the future if you ask a question in a thread and then make a separate question that is a variant of that, update the thread to point at the new question so that you don't waste people's time answering in two locations.
You are right. I wasn't thinking clearly because I was in a tight situation. I sincerely apologize. Sorry for taking your time, and thank you for your help.
Is there a reason you're pushing Python in your various comments by giving generic Python advice instead of addressing the actual questions?
EDIT: I see that you've been pushing your pages on Medium wherever you can. Instead of doing that, create a Blog-type post pointing to your stuff on Medium, and help users address their actual questions.
I know about pandas, but I have no idea how to use it. Could you help me work out an approach?
Hello, could you explain again what you have and what you would like to do? It's still not completely clear to me. Thanks!
I managed to finish the code I needed; it's running right now. Thank you for asking, though. Have a nice day!