Question

Tool: SEDA (SEquence DAtaset builder): a desktop tool for processing FASTA files containing DNA and protein sequences

4.4 years ago

Hugo ▴ 380

Dear community members,

We present SEDA, an open-source application for processing FASTA files containing DNA and protein sequences. The source code is available on GitHub, and a complete user manual is available online.

Among other operations, SEDA allows users to filter sequences based on different criteria (including text patterns), translate nucleic acid sequences into amino acid sequences, run BLAST analyses, remove duplicated sequences and isoforms, and sort, merge, split or reformat FASTA files. It has been successfully used to support the workflows described in the following publication: Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences.

The operations are grouped in six categories: Alignment-related, BLAST, Filtering, Gene annotation, General and Reformatting. Below is the complete list of operations of SEDA 1.0 in each category:

Alignment-related
- Clustal Omega Alignment
- Concatenate sequences
- Consensus sequence
- Undo alignment
BLAST
- Blast
- Blast: two-way ortholog identification
Filtering
- Base presence filtering
- Filtering
- Pattern filtering
- Remove isoforms
- Remove redundant sequences
Gene annotation
- Augustus (SAPP)
- getorf (EMBOSS)
- ProSplign/ProCompart Pipeline
- Splign/Compart Pipeline
General
- Compare
- Grow sequences
- Merge
- Regular expression split
- Split
- Translate
Reformatting
- Disambiguate sequence names
- NCBI rename
- Reallocate reference sequences
- Reformat file
- Rename header
- Sort
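
To make a few of the listed operations more concrete, here is a minimal Python sketch of what pattern filtering, removing exact duplicate sequences, and translation do to FASTA records. This is not SEDA's code and does not use its API. The function names (`parse_fasta`, `filter_by_pattern`, `remove_duplicates`, `translate`) are hypothetical, and the sketch assumes the standard genetic code, reading frame 1, and exact (case-insensitive) sequence identity as the duplicate criterion. SEDA's own operations expose more options than this.

```python
import re

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_by_pattern(records, pattern):
    """Keep records whose sequence contains a match for the regular expression."""
    regex = re.compile(pattern)
    return [(h, s) for h, s in records if regex.search(s)]


def remove_duplicates(records):
    """Keep only the first record for each exact (case-insensitive) sequence."""
    seen, unique = set(), []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


def translate(dna):
    """Translate DNA in frame 1; unknown codons become 'X', trailing bases are dropped."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )
```

For example, `remove_duplicates(parse_fasta(text))` keeps the first record for each distinct sequence, and `translate` can then be applied to each remaining sequence.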

If you encounter a bug or would like to request new operations or features, please feel free to open an issue in the GitHub repository.

fasta sequences java • 2.3k views

5 months ago by Hugo ▴ 380

score 1 · Answer 1 · 2020-04-16

Dear community members,

We have just released SEDA 1.1.0, a new version of SEDA that includes new operations (reverse complement, trim alignment and remove stop codons), support for gzip-compressed files, and more efficient processing of FASTA files. This version also allows saving the configuration of the operations in order to re-apply them later or share them with colleagues.
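
For anyone who wants to handle gzip-compressed FASTA files in their own scripts alongside SEDA, the idea can be sketched in Python as below. This is only an illustration under stated assumptions, not how SEDA implements it: the helper names `open_fasta` and `count_sequences` are hypothetical, and compression is detected from the gzip magic bytes rather than from the file extension.

```python
import gzip


def open_fasta(path):
    """Open a FASTA file as text, transparently handling gzip compression.

    Compression is detected from the two gzip magic bytes, not the extension.
    """
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def count_sequences(path):
    """Count FASTA records by counting header lines (those starting with '>')."""
    with open_fasta(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

The same `count_sequences(path)` call then works for both plain and compressed files.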

If you encounter a bug or would like to request new operations or features, please feel free to open an issue in the GitHub repository.

score 1 · Answer 2 · 2020-09-07

Dear community members,

We have just released SEDA 1.2.0, a new version of SEDA that includes three new operations and reduces the execution time of the BLAST: two-way ortholog identification operation. The new operations are: (i) NCBI BLAST and UniProt BLAST, to run BLAST queries using the NCBI and UniProt web servers, respectively; and (ii) PfamScan, to search and annotate sequences against the Pfam-A HMM library using the EMBL-EBI web service.

In addition, we have updated the user manual to include step-by-step execution guides for three protocols.

If you encounter a bug or would like to request new operations or features, please feel free to open an issue in the GitHub repository.

score 1 · Answer 3 · 2020-11-27

Dear community members,

We have just released SEDA 1.3.0, a new version of SEDA that fixes and improves some operations. In addition, a paper about SEDA has just been accepted for publication and it is already available online: SEDA: a Desktop Tool Suite for FASTA Files Processing.

If you encounter a bug or would like to request new operations or features, please feel free to open an issue in the GitHub repository.

score 0 · Answer 4 · 2023-11-23

Dear community members,

More than five years after the initial SEDA release, we have just released SEDA 1.6.0, a new version that includes many improvements, most notably a new command-line interface (CLI) for all operations. This means SEDA can now be used in scripts or pipelines!

This new version also:

- Includes new packages for APT (Advanced Package Tool, used in Debian-based distributions such as Ubuntu or Kubuntu), RPM (Red Hat Package Manager, used in Fedora or CentOS, among others), and Snap.
- Fixes a bug when filtering by reference sequence size difference from an external file.

In addition, the "SEDA pipelines with Compi" project provides a framework for easily creating pipelines made up of SEDA commands using Compi.

If you encounter a bug or would like to request new operations or features, please feel free to open an issue in the GitHub repository.