Deleted: Removing sequences that contain deletion mutations from a FASTA file (with Python)
21 months ago
M. ▴ 30

Hi everyone. I have a multi-FASTA file that contains roughly 150k sequences. The first sequence is my reference, and the rest are sequences that carry some mutations relative to it.

I'm working with single-point mutations, and I plan to find them by comparing each sequence with the reference. So I need to remove the sequences that have deletions, because a deletion shifts everything after it and the shifted bases would look like single-point mutations. Does anyone have an idea how to do that?
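To make the problem concrete, here is a minimal sketch of a naive position-by-position comparison. The function name `naive_mismatches` and the toy sequences are made up for illustration; they are not from the actual data. It shows how a single deleted base turns into a run of apparent point mutations.

```python
def naive_mismatches(ref, seq):
    """Compare two sequences position by position over their shared length.

    Returns a list of (position, ref_base, seq_base) with 1-based positions.
    No alignment is done, so any indel shifts all downstream positions.
    """
    return [
        (pos, r, s)
        for pos, (r, s) in enumerate(zip(ref, seq), start=1)
        if r != s
    ]


ref = "ACGTACGTAC"           # toy reference (hypothetical)
substituted = "ACGTTCGTAC"   # one real substitution at position 5
deleted = "ACGTCGTAC"        # base 5 of the reference deleted

print(naive_mismatches(ref, substituted))
print(len(naive_mismatches(ref, deleted)), "apparent mismatches caused by one deletion")
```

In the second call every position from the deletion onward is reported as a mismatch, which is exactly the kind of false signal that needs to be filtered out.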

The deletion mutations are like this:

[Image: screenshot of the sequences with the deleted region selected; not preserved]

There is a deletion in the selected area, so the rest of the sequence is shifted. There are many sequences like that, and I don't know how to detect and remove them.
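One simple heuristic, assuming the sequences in the file are unaligned (no `-` gap characters): a sequence with a deletion and no compensating insertion is shorter than the reference, so it can be dropped by comparing lengths. The sketch below uses only the standard library; `read_fasta`, `drop_length_changed`, and the in-memory demo records are hypothetical names and data chosen for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_length_changed(records):
    """Keep the first record (the reference) and all records of the same length."""
    records = iter(records)
    ref_header, ref_seq = next(records)
    kept = [(ref_header, ref_seq)]
    for header, seq in records:
        if len(seq) == len(ref_seq):
            kept.append((header, seq))
    return kept


# Toy stand-in for the real file; in practice this would be an opened FASTA file.
demo = io.StringIO(">ref\nACGTACGTAC\n>sub\nACGTTCGTAC\n>del\nACGTCGTAC\n")
kept = drop_length_changed(read_fasta(demo))
print([header for header, _ in kept])
```

This is only a first pass. It also removes sequences with insertions or truncated ends, and it misses a sequence in which a deletion is balanced by an insertion of the same size. If the file is already aligned, checking each sequence for `-` would be the equivalent test.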

I don't know how to align that many sequences, but if I could, that would solve my problem too.
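For illustration, aligning each sequence to the reference pairwise is enough to reveal indels, since they show up as gaps. The following is a minimal Needleman-Wunsch sketch in plain Python with arbitrary, assumed scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and toy sequences. It needs time and memory proportional to the product of the two lengths, so for 150k real sequences a dedicated aligner, or Biopython's `Bio.Align.PairwiseAligner`, would likely be the practical choice.

```python
def global_align(ref, seq, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns (aligned_ref, aligned_seq)."""
    n, m = len(ref), len(seq)
    # score[i][j] is the best score for aligning ref[:i] with seq[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == seq[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two bases
                score[i - 1][j] + gap,      # gap in seq (deletion)
                score[i][j - 1] + gap,      # gap in ref (insertion)
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_ref, out_seq = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == seq[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_ref.append(ref[i - 1])
            out_seq.append(seq[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_ref.append(ref[i - 1])
            out_seq.append("-")
            i -= 1
        else:
            out_ref.append("-")
            out_seq.append(seq[j - 1])
            j -= 1
    return "".join(reversed(out_ref)), "".join(reversed(out_seq))


def has_indel(ref, seq):
    """True if the alignment to the reference contains any gap."""
    aligned_ref, aligned_seq = global_align(ref, seq)
    return "-" in aligned_ref or "-" in aligned_seq


ref = "ACGTACGTAC"                   # toy reference (hypothetical)
print(global_align(ref, "ACGTCGTAC"))  # sequence with one deleted base
print(has_indel(ref, "ACGTTCGTAC"))    # substitution only
```

Sequences for which `has_indel` returns `True` would be dropped; for the rest, the aligned pair can be compared column by column to list the point mutations.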

Thanks in advance for your opinions and help.

python • 1.9k views