Question: Is there code to find a consensus motif?
vinayjrao (JNCASR, India) wrote 9 months ago:

Hello,

I've been trying to write code to find a consensus motif in a given sequence, but so far I have only managed to find a fixed substring in a string. I want to be able to allow multiple nucleotides/amino acids at each position, and also to enter N/X to represent any nucleotide/amino acid. I would very much appreciate any help.

Thanks.

P.S. The post tags represent the languages I'm comfortable understanding.

Edit: Example of the consensus motif: A/T A A G C A A/T/G N N A

Sequence: CGATCGTG TAAGCAGCTA GTCATG

The middle block (TAAGCAGCTA, bold in the original post) is the match to the consensus.

Tags: awk, shell, C, python
Carlo Yague (Belgium) wrote 9 months ago:

In shell using grep and regular expressions:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep  -o "[AT]AAGCA[ATG]..A"
TAAGCAGCTA

'N' is expressed as '.', which matches any single character. The alternative nucleotides allowed at one position are listed inside square brackets.

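For anyone who prefers Python, the same idea can be sketched by translating the slash-separated motif notation from the question into a regular expression. The helper name `motif_to_regex` and the choice to treat both N and X as wildcards are assumptions made for this illustration, and the sequence is written without the spaces that marked the bolded block in the question.

```python
import re

def motif_to_regex(motif, wildcards=("N", "X")):
    """Translate a space-separated motif such as
    'A/T A A G C A A/T/G N N A' into a regular expression string.
    (Hypothetical helper, not part of any library.)"""
    parts = []
    for position in motif.split():
        if position in wildcards:
            # N/X: any single character, as '.' in the grep answer
            parts.append(".")
        elif "/" in position:
            # A/T/G -> [ATG]: any one of the listed letters
            parts.append("[" + "".join(position.split("/")) + "]")
        else:
            # a single fixed letter
            parts.append(re.escape(position))
    return "".join(parts)

pattern = motif_to_regex("A/T A A G C A A/T/G N N A")
sequence = "CGATCGTGTAAGCAGCTAGTCATG"

# Report the 0-based start position and text of every non-overlapping match
for match in re.finditer(pattern, sequence):
    print(match.start(), match.group())
```

With the example from the question this builds the pattern `[AT]AAGCA[ATG]..A` and prints `8 TAAGCAGCTA`.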

Thanks a lot. It's perfect.

written 9 months ago by vinayjrao

Along the same lines as Carlo Yague's answer:

echo 'CGATCGTG TAAGCAGCTA GTCATG' | grep -Po '([AT])A{2}GCA[\1G].{2}A'
TAAGCAGCTA
written 9 months ago by cpad0112

Thanks. This works too. I could use .{2} when I have longer runs of any nucleotide/amino acid. However, I would like to know why it is [\1G] and not [ATG]?

written 9 months ago by vinayjrao

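The first [AT] is made a capturing group, and you can refer back to it anywhere later in the pattern by its number (\1 here). Note that a backreference matches the exact text the group captured rather than the whole set, and inside square brackets \1 is not treated as a backreference at all, so [\1G] is not equivalent to [ATG]. It matches here only because the sequence has a G at that position; [ATG] is the safer choice.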

written 9 months ago by cpad0112
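A small check of how the numbered group behaves, sketched with Python's re module rather than grep -P (the three test sequences and the helper name `check` are made up for illustration). It compares the plain character class [ATG], the [\1G] form from the command above, and a genuine backreference written as (?:\1|G). Python's documentation states that numeric escapes inside square brackets are treated as characters, so [\1G] is not expected to act as a backreference here.

```python
import re

# Plain character class: A, T or G allowed at the seventh position
class_pattern = re.compile(r"[AT]AAGCA[ATG]..A")
# \1 inside square brackets: read as the character with code 1, not group 1
in_class_pattern = re.compile(r"([AT])A{2}GCA[\1G].{2}A")
# Real backreference: seventh position must repeat the first letter, or be G
backref_pattern = re.compile(r"([AT])A{2}GCA(?:\1|G).{2}A")

def check(seq):
    """Return whether each of the three patterns matches seq."""
    return (bool(class_pattern.search(seq)),
            bool(in_class_pattern.search(seq)),
            bool(backref_pattern.search(seq)))

for seq in ("TAAGCAGCTA", "TAAGCATCTA", "TAAGCAACTA"):
    print(seq, check(seq))
```

Only the first sequence, which has G at the seventh position, matches all three patterns. The second (T there) fails the [\1G] form, and the third (A there, while the first letter is T) fails both the [\1G] form and the real backreference.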

That's an extremely handy option. Thanks again :)

written 9 months ago by vinayjrao