Question

Weighted sequence logos and motifs

0

5.1 years ago

Sergio Martínez Cuesta ▴ 230

Dear all,

Most libraries and software for drawing DNA sequence logos (e.g. ggseqlogo) or discovering sequence motifs (e.g. the MEME tools) take as input a FASTA file containing a list of sequences:

>seq1
AGATCATCATCTCAT
>seq2
GTCTAGCTACGTACT
>seq3
TGCATGCATGCATCC

(in the case of motif finding, a list of negative sequences is often used as well)

However, my list of sequences contains an individual score for each input sequence:

>seq1 53.4
AGATCATCATCTCAT
>seq2 21.5
GTCTAGCTACGTACT
>seq3 11.8
TGCATGCATGCATCC

I was wondering if anyone is aware of any tools that take the sequence scores (53.4, 21.5, 11.8) into account to guide the creation of sequence logos or the discovery of motifs.

Any hints would be very useful.

logo motif • 1.6k views

ADD COMMENT • link 5.1 years ago by Sergio Martínez Cuesta ▴ 230

2

Maybe duplicate each sequence in the input in proportion to its weight?
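A minimal sketch of that duplication idea, assuming the scores can be treated as relative weights. The function name `replicate_by_weight` and the `scale` parameter are hypothetical, introduced here only for illustration; `scale` shrinks the scores before rounding so the number of copies stays manageable.

```python
def replicate_by_weight(records, scale=1.0):
    """Repeat each sequence roughly score * scale times.

    records: list of (name, score, sequence) tuples.
    scale:   hypothetical knob to shrink (or grow) the scores before rounding.
    Every sequence is kept at least once, even if its scaled score rounds to 0.
    """
    replicated = []
    for name, score, seq in records:
        copies = max(1, round(score * scale))
        for i in range(copies):
            replicated.append((f"{name}_copy{i + 1}", seq))
    return replicated


# The three scored sequences from the question.
records = [
    ("seq1", 53.4, "AGATCATCATCTCAT"),
    ("seq2", 21.5, "GTCTAGCTACGTACT"),
    ("seq3", 11.8, "TGCATGCATGCATCC"),
]

replicated = replicate_by_weight(records, scale=0.1)

# Write the result as plain FASTA text that a logo or motif tool could read.
fasta = "\n".join(f">{name}\n{seq}" for name, seq in replicated)
```

The rounding loses some of the resolution in the scores, so the result only approximates the original weighting.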

ADD REPLY • link 5.1 years ago by Sishuo Wang ▴ 230

0

That could work! But I would have to round the decimal scores to integers before duplicating the sequences, and that could produce a huge number of sequences, although this may not be a problem here.
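One way to sidestep the rounding, at least for logos, might be to let each sequence contribute its score directly to a per-position frequency matrix. The sketch below assumes equal-length sequences over A, C, G and T, and the helper name `weighted_frequency_matrix` is hypothetical. Whether a given logo tool accepts such a matrix as input is something to check in its documentation.

```python
def weighted_frequency_matrix(records, alphabet="ACGT"):
    """Build a position frequency matrix where each sequence counts by its score.

    records: list of (name, score, sequence) tuples, all sequences equal length.
    Returns one dict per position mapping base -> weighted frequency (sums to 1).
    """
    length = len(records[0][2])
    if any(len(seq) != length for _, _, seq in records):
        raise ValueError("all sequences must have the same length")

    total = sum(score for _, score, _ in records)
    matrix = [{base: 0.0 for base in alphabet} for _ in range(length)]
    for _, score, seq in records:
        for pos, base in enumerate(seq):
            # Each sequence adds its share of the total score at every position.
            matrix[pos][base] += score / total
    return matrix


# The three scored sequences from the question.
records = [
    ("seq1", 53.4, "AGATCATCATCTCAT"),
    ("seq2", 21.5, "GTCTAGCTACGTACT"),
    ("seq3", 11.8, "TGCATGCATGCATCC"),
]

matrix = weighted_frequency_matrix(records)
```

At the first position, for example, A gets a weight of 53.4 / 86.7 because only seq1 starts with A. This covers logos only; it does not help with de novo motif discovery.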

ADD REPLY • link 5.1 years ago by Sergio Martínez Cuesta ▴ 230

0

Have you tried this? http://fraenkel-nsf.csbi.mit.edu/webmotifs-tryit.html https://academic.oup.com/nar/article/35/suppl_2/W217/2923614

ADD REPLY • link 5.1 years ago by pltbiotech_tkarthi ▴ 180

1

"tools that would take into account the sequence scores"

Neither of the linked tools does, so I moved this to a comment. It is appreciated that you aim to help, but only linking content that matches the topic of the top-level question, rather than answering what the OP asked for, does not help. Please stop doing that.

ADD REPLY • link 5.1 years ago by ATpoint 81k

0

Thank you, I read through the docs. Even though you can input what they call seeds, I could not find a way to incorporate sequence scores into the motif discovery.

ADD REPLY • link 5.1 years ago by Sergio Martínez Cuesta ▴ 230