3.0 years ago
Hello!
I have a de novo assembled contig file (contigs.fasta); the assembly was done with SPAdes. From this file, I need to extract only the node sequences whose GC content is below a given threshold (e.g. all node sequences with GC content < 35%).
Do you have any suggestions for how I can do this? I am currently using seqkit to report the GC content of each node:
seqkit fx2tab --name --only-id --gc contigs.fasta > results.txt
The problem is that this only reports the GC% of each node; it does not extract the actual node sequences I need.
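To make the goal concrete, here is a minimal sketch of the kind of filter I am after, written with plain awk rather than seqkit. The function name `filter_low_gc` is hypothetical (it is not a seqkit command), and I am assuming GC% means (G + C) divided by the full sequence length, with N and other ambiguous bases counted in the length, so the values may differ slightly from what seqkit reports.

```shell
# Hypothetical helper (not part of seqkit): print FASTA records whose GC% is
# strictly below a cutoff. Assumes GC% = 100 * (G + C) / total length.
# Usage: filter_low_gc CUTOFF_PERCENT INPUT.fasta
filter_low_gc() {
    awk -v cutoff="$1" '
        # Print the buffered record if its GC% is below the cutoff.
        # Names after the extra spaces are awk-style local variables.
        function flush_record(   s, len, gc) {
            if (header == "") return
            s = toupper(seq)                 # count soft-masked (lowercase) bases too
            len = length(s)
            gc = gsub(/[GC]/, "", s)         # gsub returns the number of G/C bases
            if (len > 0 && 100 * gc / len < cutoff) {
                print header
                print seq                    # sequence is written on a single line
            }
        }
        /^>/ { flush_record(); header = $0; seq = ""; next }   # new record starts
        { seq = seq $0 }                                        # join wrapped lines
        END { flush_record() }                                  # last record
    ' "$2"
}
```

With this, something like `filter_low_gc 35 contigs.fasta > low_gc.fasta` should write only the nodes below 35% GC. An alternative that stays within seqkit might be to select the IDs below the cutoff from results.txt and pass that list to `seqkit grep` as a pattern file, but I have not tried this yet.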
Thank you very much in advance!