Hello, all! I want to delete the 16S sequence from a bacterial genome in either GBK or FASTA format. I don't have access to any commercial software. Are there any open-source programs or command-line tools that can achieve this? Thanks!
Best, Ben
Using the (free) BBMap package:
If you know the 16S sequences (or those of a close relative), you can align them to the genome to produce a SAM file. For example:
mapPacBio.sh ref=genome.fasta in=16S.fasta out=mapped.sam ambig=all maxindel=20
Then you can run BBMask:
bbmask.sh in=genome.fasta out=masked.fasta sam=mapped.sam masklowentropy=false maskrepeats=false
This will mask the sequences covered by the mapped 16S sequences. You can alternatively do it using BBDuk's kmer-based masking mode:
bbduk.sh in=genome.fasta out=masked.fasta ref=16S.fasta k=200 hdist=1 kmask=N
...but I'd suggest the alignment method unless you encounter problems with it.
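Note that masking replaces the 16S regions rather than deleting them. If you want them removed entirely, a short script can strip the masked runs afterwards. This is only a minimal sketch: it assumes the masked bases are written as uppercase N (as with kmask=N above) and that the genome contains no other long N runs you want to keep. The function names and the min_run threshold are made up for illustration.

```python
import re


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_masked_runs(seq, min_run=50):
    """Delete every run of at least min_run consecutive N characters.

    min_run is a hypothetical safety threshold so that isolated
    ambiguous bases are left alone.
    """
    return re.sub("N{%d,}" % min_run, "", seq)


def format_fasta(records, width=60):
    """Write (header, sequence) tuples back out as wrapped FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def strip_masked_fasta(text, min_run=50, width=60):
    """Remove masked runs from every record in a FASTA string."""
    cleaned = [(h, strip_masked_runs(s, min_run)) for h, s in read_fasta(text)]
    return format_fasta(cleaned, width)
```

To use it, read masked.fasta into a string, pass it to strip_masked_fasta, and write the result to a new file. Keep in mind that the coordinates of everything downstream of a deleted region will shift.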
You could open your FASTA file, locate the start of the 16S sequence (using find), and then delete the range you need with any text editor on *nix/OS X. If you are working on Windows, try ApE or SnapGene Viewer.
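One catch with the text-editor route is that FASTA sequences are usually wrapped over many lines, so a plain find can miss a match that spans a line break, and the gene may also sit on the reverse strand. A rough Python sketch of the same find-and-delete idea is below. It assumes you already have the exact 16S sequence as a string and that the genome sequence has been joined into a single uppercase string first. It only finds exact matches, so for a related but non-identical 16S you would still need an aligner. All function names here are made up for illustration.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_exact(genome, query):
    """Return sorted 0-based, half-open (start, end) spans of every
    exact match of query on either strand of genome."""
    spans = []
    # A set avoids searching twice when query is its own reverse complement.
    for pattern in {query, reverse_complement(query)}:
        pos = genome.find(pattern)
        while pos != -1:
            spans.append((pos, pos + len(pattern)))
            pos = genome.find(pattern, pos + 1)
    return sorted(spans)


def delete_spans(genome, spans):
    """Return genome with the given spans removed (overlaps are merged)."""
    kept = []
    cursor = 0
    for start, end in sorted(spans):
        if start > cursor:
            kept.append(genome[cursor:start])
        cursor = max(cursor, end)
    kept.append(genome[cursor:])
    return "".join(kept)


def remove_sequence(genome, query):
    """Delete every exact occurrence of query (either strand) from genome."""
    return delete_spans(genome, find_exact(genome, query))
```

For example, remove_sequence(genome, sixteen_s) returns the genome string with every exact copy removed, which matters because many bacteria carry several rRNA operons.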