Search a large FASTA file for multiple motifs (listed in a .txt file), report each motif's frequency, then print the matching FASTA sequences
3.0 years ago

Dear all,

I have a MOTIF.txt file that contains multiple motifs (one motif per line, > 5000 motifs).

for example

$ cat MOTIF.TXT
TCGFHAHH
GHHFDSJH

AND I HAVE A SEQUNCES.FASTA FILE (> 10000 SEQUENCES) IN WHICH THE MOTIFS (MARKED IN BOLD AND ITALICS) MAY BE PRESENT ANYWHERE IN A SEQUENCE

$ cat SEQUNCES.FASTA

>1
CCC***TCGFHAHH***
>2
CC***TCGFHAHH***GG
>3
TTT***GHHFDSJH***CC

NOW I WANT TO WRITE THE FREQUENCY OF MOTIF 1 (TCGFHAHH), FOLLOWED BY ALL THE FASTA SEQUENCES THAT CONTAIN IT (INCLUDING HEADERS), TO A NEW FILE, THEN THE SAME FOR MOTIF 2 IN THE SAME FILE, AND SO ON.

PLEASE HELP AND SUGGEST HOW TO DO IT.

motif fasta frequency • 1.2k views

No need to SHOUT.


Use seqkit grep and/or seqkit locate (LINK).
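
If seqkit is not available, the same report can be sketched in plain Python. This is only a minimal sketch under some assumptions that are not stated in the question: "frequency" is taken to mean the number of sequences that contain the motif, matching is an exact, case-sensitive substring search, and the output file name `motif_report.txt` and the function names are hypothetical. The input file names are the ones used in the question.

```python
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Drop the ">" and surrounding spaces, so "> 2" becomes "2".
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def group_by_motif(motifs, records):
    """Map each motif to the list of (header, sequence) records containing it."""
    hits = {motif: [] for motif in motifs}
    for header, seq in records:
        for motif in hits:
            # Assumption: exact, case-sensitive substring match.
            if motif in seq:
                hits[motif].append((header, seq))
    return hits


def write_report(hits, out):
    """Write each motif with its sequence count, then the matching records."""
    for motif, matched in hits.items():
        out.write(f"{motif}\t{len(matched)}\n")
        for header, seq in matched:
            out.write(f">{header}\n{seq}\n")


if __name__ == "__main__":
    # File names from the question; the output name is hypothetical.
    motif_path = sys.argv[1] if len(sys.argv) > 1 else "MOTIF.txt"
    fasta_path = sys.argv[2] if len(sys.argv) > 2 else "SEQUNCES.FASTA"
    out_path = sys.argv[3] if len(sys.argv) > 3 else "motif_report.txt"

    with open(motif_path) as fh:
        motifs = [line.strip() for line in fh if line.strip()]
    with open(fasta_path) as fh:
        records = list(read_fasta(fh))
    with open(out_path, "w") as out:
        write_report(group_by_motif(motifs, records), out)
```

With more than 5000 motifs and more than 10000 sequences this does tens of millions of substring checks, which should still be workable but will be slower than a dedicated tool. Motifs that match no sequence are listed with a count of 0.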


Thank you.
