Question: Is there code to find a consensus motif?
vinayjrao (JNCASR, India) wrote 14 months ago:

Hello,

I've been trying to write code to find a consensus motif in a given sequence, but so far I have only managed to find a fixed substring in a string. I want to be able to allow multiple nucleotides/amino acids at each position, and also to enter N/X to represent any nucleotide/amino acid. I would very much appreciate any help.

Thanks.

P.S. The post tags represent the languages I'm comfortable understanding.

Edit: Example of the consensus motif - A/T A A G C A A/T/G N N A

Sequence - CGATCGTG TAAGCAGCTA GTCATG

The middle segment, TAAGCAGCTA, is the match to the consensus.

Tags: awk, shell, C, python

Carlo Yague (Belgium) wrote 14 months ago:

In shell using grep and regular expressions:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep -o "[AT]AAGCA[ATG]..A"
TAAGCAGCTA

'N' is expressed as '.', which matches any single character (use [ACGT] instead to accept only nucleotides). The nucleotides allowed at one position are listed inside square brackets.
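
The same idea can be sketched in Python, one of the languages tagged on the question. This is only an illustration under stated assumptions: the helper name motif_to_regex is hypothetical, the consensus is assumed to be written with spaces between positions and '/' between alternatives (as in the question's example), and the sequence is assumed to be a plain DNA string with the spaces removed.

```python
import re

def motif_to_regex(motif, alphabet="ACGT"):
    """Translate a space-separated consensus such as 'A/T A A G' into a regex.

    Hypothetical helper: 'N' or 'X' means any letter of `alphabet`,
    and 'A/T' means A or T at that position.
    """
    parts = []
    for position in motif.split():
        if position in ("N", "X"):
            # Wildcard position: any letter of the alphabet
            parts.append("[" + alphabet + "]")
        else:
            options = position.split("/")
            # A single letter stays literal; several letters become a class
            parts.append(options[0] if len(options) == 1 else "[" + "".join(options) + "]")
    return "".join(parts)

pattern = motif_to_regex("A/T A A G C A A/T/G N N A")
sequence = "CGATCGTGTAAGCAGCTAGTCATG"

# Collect (0-based start, matched text) for every non-overlapping hit
hits = [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]

print(pattern)  # [AT]AAGCA[ATG][ACGT][ACGT]A
print(hits)     # [(8, 'TAAGCAGCTA')]
```

For amino acids, a protein alphabet string can be passed as the alphabet argument. Note that re.finditer reports non-overlapping matches only.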


Thanks a lot. It's perfect.

written 14 months ago by vinayjrao

Along the same lines as Carlo Yague's answer:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep -Po \([AT]\)A{2}GCA[\1G].{2}A
TAAGCAGCTA
written 14 months ago by cpad0112

Thanks. This works too. I could use .{2} when I have longer runs of any nucleotide/amino acid. However, I would like to know why it is [\1G] and not [ATG].

written 14 months ago by vinayjrao

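The first [AT] is wrapped in parentheses, which makes it a capture group that can be referred to later by its number (1 here). Note that a backreference matches the same text the group captured, not the [AT] class again, and inside square brackets \1 is not treated as a backreference at all, so [ATG] is the safer way to write that position.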
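
A small sketch of how numbered groups behave, using Python's re module as a stand-in for illustration (the thread itself uses grep -P, whose handling may differ in detail). The short test strings are made up for the demonstration.

```python
import re

# \1 repeats whatever text group 1 captured; it does not re-apply [AT].
backref = re.compile(r"([AT])C\1")
print(bool(backref.fullmatch("ACA")))  # True: group 1 captured 'A', \1 needs 'A'
print(bool(backref.fullmatch("ACT")))  # False: 'T' fits [AT] but is not the captured 'A'

# Inside a character class, \1 is read as a numeric character escape, not a backreference.
in_class = re.compile(r"([AT])[\1G]")
print(bool(in_class.fullmatch("AG")))  # True: 'G' is in the class
print(bool(in_class.fullmatch("AA")))  # False: the captured 'A' is not in the class

# Writing the third class out explicitly expresses the A/T/G position directly.
motif = re.compile(r"[AT]A{2}GCA[ATG].{2}A")
print(motif.search("CGATCGTG TAAGCAGCTA GTCATG").group())  # TAAGCAGCTA
```

The example sequence still matches with [\1G] only because its seventh motif position happens to be G.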

written 14 months ago by cpad0112

That's an extremely handy option. Thanks again :)

written 14 months ago by vinayjrao