Hi everyone,
I have two files: a FASTA file and a text file containing a list of sequence IDs.
I would like to exclude the sequences whose IDs are listed in the text file from the FASTA file. I have tried this command:
seqtk subseq input.fasta list_ids.txt > output.fasta
But it gives me a FASTA file containing only the sequences whose IDs are in the list. I want the opposite: a FASTA output without those sequences. If you could explain any answers in detail, I would be highly grateful.
This question has been asked a gazillion times on biostars.org. What have you found so far?
Pierre, I saw similar questions, but most of them are about a different output. I want an output without the sequences in the ID list. I saw command lines using seqtk, Python, and some others, but none of them did what I want. Do you have another alternative?
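Using seqtk and unix tools, first list the IDs to keep (every header in the FASTA file that is not in your list), then pass that list to seqtk (keep_ids.txt is just a placeholder name):
grep ">" input.fasta | tr -d ">" | grep -v -w -f list_ids.txt > keep_ids.txt
seqtk subseq input.fasta keep_ids.txt > output.fasta
Or in one line: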
seqtk subseq input.fasta <(grep ">" input.fasta | tr -d ">" | grep -v -w -f list_ids.txt) > output.fasta
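If seqtk is not available, the same exclusion can be sketched with awk alone. This is only an illustration, not part of the answer above: it assumes the ID is the first word after ">" on each header line and that list_ids.txt holds one ID per line. The sample records and the scratch directory are made up for the demonstration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Tiny made-up example input (hypothetical IDs and sequences).
printf '>seq1 first record\nACGT\nACGT\n>seq2\nGGGG\n>seq3 third record\nTTTT\n' > input.fasta
printf 'seq2\n' > list_ids.txt

# While reading the first file (the ID list): remember every ID to exclude.
# While reading the FASTA file: on each header line, take the first word
# after ">" as the ID and decide whether to keep the record; the sequence
# lines that follow inherit that decision.
awk 'FILENAME == ARGV[1] { if ($1 != "") drop[$1] = 1; next }
     /^>/                { id = substr($1, 2); keep = !(id in drop) }
     keep' list_ids.txt input.fasta > output.fasta

cat output.fasta
```

With the sample files, output.fasta keeps seq1 and seq3 with their full header lines and drops seq2. Because only the first word of each header is compared, headers with descriptions are matched by their ID alone.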