How to remove part of FASTA sequences?
6.9 years ago
bioinfo ▴ 790

I have FASTA sequences for 10 bacterial genomes (a few of them may contain both chromosome and plasmids, the rest only the chromosome). I also have another FASTA file with 20 different plasmids. My plan is to remove the plasmid part from those 10 bacterial genomes and keep only the chromosomal part in FASTA format wherever a section of a genome matches any of my 20 plasmids at 100% identity. What is the best way to do this?


sequence FASTA • 2.5k views
6.9 years ago

I would take a look at the Biostrings R package for this. You can read your FASTA sequences in, then use the `maskMotif` method to mask the plasmid sequences, and then use `injectHardMask` to replace the masked regions with a letter of your choice. This way you can still see where the plasmid sequences were, and the total length of each sequence doesn't change.
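For anyone who wants to see the idea outside of R, here is a minimal plain-Python sketch of the same mask-and-replace approach. It is not the Biostrings API; the function names (`parse_fasta`, `reverse_complement`, `mask_exact_matches`) are made up for illustration. It assumes each plasmid occurs in the genome as one contiguous, exact substring (on either strand) and that the sequences contain only A, C, G, T and N.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical helpers for illustration; not part of Biostrings or any library.

def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement (assumes only A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]

def mask_exact_matches(genome: str, motifs: Iterable[str], letter: str = "N") -> str:
    """Replace every exact occurrence of each motif (either strand) with `letter`.

    The search runs on the original genome, so overlapping hits are all masked
    and the returned sequence has the same length as the input.
    """
    masked = bytearray(genome, "ascii")
    fill = letter.encode("ascii")
    for motif in motifs:
        if not motif:
            continue
        for pattern in {motif, reverse_complement(motif)}:
            start = genome.find(pattern)
            while start != -1:
                masked[start:start + len(pattern)] = fill * len(pattern)
                start = genome.find(pattern, start + 1)
    return masked.decode("ascii")

# Toy data standing in for the real genome and plasmid FASTA files.
genomes = dict(parse_fasta([">genome1", "ATGAAACC", "GTTTAG"]))
plasmids = dict(parse_fasta([">plasmidA", "AACCG"]))

for name, seq in genomes.items():
    print(f">{name}")
    print(mask_exact_matches(seq, plasmids.values()))
```

With real data you would pass open file handles to `parse_fasta` instead of the toy lists. If you want the plasmid regions removed rather than masked, you could strip the mask letter afterwards, but only if it does not already occur in your sequences (for example, use a letter other than `N`).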

