My aim was to retrieve genes containing a consensus motif from a file of genes in FASTA format. I solved that with the SeqKit tool.
For example, if my motif looks like the one below, I can write:
seqkit grep -srip 'G[TA][ATC]AGCA[TAC]' input.fasta > output1.fasta
Some of my motifs contain several ambiguous bases, and I am not sure which region of each sequence they actually match. So my questions are:
- How can I get the target sequence (the matched region) for a consensus motif?
- What is the function of -srip in the above command? I ask because when I use
grep -o "G[TA][ATC]AGCA[TAC]" input.fasta > output2.fasta
not every sequence in output1.fasta has a corresponding motif match in output2.fasta.
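
To make clear what I mean by the matching region, here is a minimal sketch of the output I am after. It uses only awk and grep rather than SeqKit, and `motif_hits` is a hypothetical helper name of my own. It assumes two things that the plain `grep -o` above does not handle: sequences may be wrapped over several lines, and bases may be in lower case. It only scans the strand as written in the file and reports non-overlapping hits.

```shell
# Hypothetical helper (not part of SeqKit): print "<header><TAB><matched region>"
# for every motif hit in a FASTA file.
# Usage: motif_hits PATTERN FASTA
motif_hits() {
    pattern=$1
    fasta=$2
    # Step 1: join wrapped sequence lines so each record becomes "header<TAB>sequence".
    awk '/^>/ { if (seq != "") print id "\t" seq; id = substr($0, 2); seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print id "\t" seq }' "$fasta" |
    # Step 2: for each record, list every case-insensitive match of the pattern.
    while IFS="$(printf '\t')" read -r id seq; do
        printf '%s\n' "$seq" | grep -oiE "$pattern" | while read -r hit; do
            printf '%s\t%s\n' "$id" "$hit"
        done
    done
}
```

It would be called as `motif_hits 'G[TA][ATC]AGCA[TAC]' input.fasta > hits.tsv`, giving one line per hit with the record header and the exact bases that matched.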