I add a new solution using biopython able to remove the sequence with identifiers in a given file. I consider Biopython more reliable and reproducible than grep/sed/awk when working with a great number of sequences.
As I do not know your experience with python, I will explain the solution in detail:
1) Ensure you have python and biopython installed. Type in your terminal:
python -c "import Bio"
echo $?
If "0" displayed, pass to step 2. In case error message appears, install it by easy_install or pip:
sudo easy_install -f http://biopython.org/DIST/ biopython
or
sudo pip install biopython
In case of not having pip installed, type:
sudo apt-get install pip
2) Create a script ".py" with this script ( filter.py, for example ) using your favourite text editor:
#!/usr/bin/env python
import sys
from sets import Set
from Bio import SeqIO
def list_ids():
    """
    Return a set containing the identifiers presented in a file,
    line by line, starting with ">"
    """
    # read the first file given and generate a set (faster iteration respect lists
    identifiers = Set([])
    with open(sys.argv[1], 'r') as fi:
        for line in fi:
            line = line.strip()
            identifiers.add(str(line).replace(">", ""))
    return identifiers
def filter():
    """
    Writes a file containing only the sequences with identifiers NOT
    present in a set of identifiers
    """
    identifiers = list_ids()
    with open(sys.argv[2]) as original_fasta, open(sys.argv[3], 'w') as corrected_fasta:
        records = SeqIO.parse(original_fasta, 'fasta')
        for record in records:
            print record.id
            if record.id not in identifiers:
                SeqIO.write(record, corrected_fasta, 'fasta')
if __name__ == '__main__':
    filter()
3) Run the program as follows:
chmod +x filter.py
./filter.py your_ids.txt IN.fasta OUT.fasta
where:
your_ids.txt --> A file containing the identifiers including > you want to EXCLUDE, one identifier per line; like:
>hhh 
>ghag 
>MAMND
 
IN.fasta --> your original fasta file
 
OUT.fasta --> your output file 
                    
                
                 
Thank you very much for your help. The script works well.
You're welcome, happy to help!
Thank you very much. It totally worked :)
Happy to hear that, you're welcome!
Will this script still work with python3 ? How can I modify it so it will ?
I had the same problem, I tried several alternatives in the windows subsystem for Linux, until I found your script and it has been very helpful. Thank you samuelmiver
Hi samuelmiver,
When I run
filter.py, it says