Question: Is there code to find a consensus motif?
23 months ago by
vinayjrao170
JNCASR, India
vinayjrao170 wrote:

Hello,

I've been trying to write code to find a consensus motif in a given sequence, but so far I have only managed to find a fixed substring in a string. I want to be able to allow multiple nucleotides/amino acids at each position, and also to enter N/X to represent any nucleotide/amino acid. I would very much appreciate any help.

Thanks.

P.S. The post tags represent the languages I'm comfortable understanding.

Edit: Example of the consensus motif - A/T A A G C A A/T/G N N A

Sequence - CGATCGTG TAAGCAGCTA GTCATG

The middle segment, TAAGCAGCTA, is the consensus.

Tags: awk, shell, C, python
23 months ago by
Carlo Yague5.0k
Canada
Carlo Yague5.0k wrote:

In shell using grep and regular expressions:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep -o '[AT]AAGCA[ATG]..A'
TAAGCAGCTA

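'N' is written as '.', which matches any single character. Multiple nucleotides allowed at one position are listed inside square brackets. Because '.' also matches non-nucleotide characters (such as a space), [ACGT] is a stricter way to write N.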
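
Since Python is among the post tags, the same idea can be sketched there: translate the slash-separated motif notation from the question into a regular expression and search with it. This is only a minimal sketch. The helper name motif_to_regex and its wildcard parameter are made up for illustration, and it assumes positions are separated by spaces and alternatives by '/'.

```python
import re

def motif_to_regex(motif, wildcard="N"):
    """Turn a motif such as 'A/T A A G C A A/T/G N N A' into a regex string.

    Hypothetical helper: assumes space-separated positions and '/'-separated
    alternatives. Pass wildcard="X" for protein motifs.
    """
    parts = []
    for position in motif.split():
        options = position.split("/")
        if options == [wildcard]:
            parts.append(".")                      # any single character
        elif len(options) == 1:
            parts.append(re.escape(options[0]))    # one fixed residue
        else:
            parts.append("[" + "".join(options) + "]")  # set of alternatives
    return "".join(parts)

pattern = motif_to_regex("A/T A A G C A A/T/G N N A")
print(pattern)  # [AT]AAGCA[ATG]..A

sequence = "CGATCGTG TAAGCAGCTA GTCATG"
for match in re.finditer(pattern, sequence):
    print(match.start(), match.group())  # 9 TAAGCAGCTA
```

re.finditer reports every non-overlapping hit with its 0-based start position, which is useful when the motif occurs more than once in a longer sequence.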


Thanks a lot. It's perfect.

(written 23 months ago by vinayjrao170)

Along the same lines as Carlo Yague's answer:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep -Po '([AT])A{2}GCA[\1G].{2}A'
TAAGCAGCTA
(written 23 months ago by cpad011213k)

Thanks. This works too. I could use .{2} when I have longer runs of any nucleotide/amino acid. However, I would like to know why it is [\1G] and not [ATG].

(written 23 months ago by vinayjrao170)

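The first [AT] is captured as group 1, and a group can be referenced later in the pattern by its number (\1 here). Note, though, that a backreference matches the exact text the group captured (T in this example), not the [AT] pattern itself, and inside square brackets \1 is not treated as a backreference at all. The command above matches only because position 7 happens to be G, so [ATG] is the right way to allow A, T or G there.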

(written 23 months ago by cpad011213k)
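
To see how a numbered group behaves compared with a bracketed set, here is a small Python sketch (Python's re module is used purely for illustration; the variable names are made up). It assumes the backreference is written outside square brackets, which is where \1 is interpreted as a backreference.

```python
import re

sequence = "TAAGCAGCTA"

# [ATG] allows A, T or G at position 7, independently of position 1.
independent = re.search(r"[AT]A{2}GCA[ATG].{2}A", sequence)
print(independent.group())  # TAAGCAGCTA

# \1 must repeat the exact letter captured by group 1 (T here).
# Position 7 is G, so this pattern finds nothing in the same sequence.
backref_miss = re.search(r"([AT])A{2}GCA\1.{2}A", sequence)
print(backref_miss)  # None

# With T at position 7 the backreference is satisfied.
backref_hit = re.search(r"([AT])A{2}GCA\1.{2}A", "TAAGCATCTA")
print(backref_hit.group())  # TAAGCATCTA
```

So a backreference is handy when a later position must equal an earlier one, while a bracketed set is what expresses "any of these letters" at a single position.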

That's an extremely handy option. Thanks again :)

(written 23 months ago by vinayjrao170)