Question: Display Nucleotides As Color
5
John (United States) wrote 7.6 years ago:

Hi,

Is there an easy way to display a sequence like "ATCC" as the colors "red blue green green" on a figure, where red = A, blue = T, and green = C? I am thinking of something like a heatmap in R, if I can assign colors to discrete values. Thanks.

nucleotide • 3.8k views
modified 3.4 years ago by Ibrahim Tanyalcin • written 7.6 years ago by John
2

What do you mean by "on a figure"?

written 7.6 years ago by Pierre Lindenbaum

What I meant was to display ATCC as colored squares in a row. Sort of like this figure. http://realtamortgage.com/gfx/colors.gif

written 7.5 years ago by John
8
Giovanni M Dall'Olio (London, UK) wrote 7.5 years ago:

Ugly & quick HTML hack:

  • paste your sequence into a file, e.g. seq.dna
  • transform it to HTML, e.g. through sed:

    sed 's/[ACTG]/<span class="&">&<\/span>/gi' seq.dna > seq.html
    
  • Attach a stylesheet to it, e.g.:

    <head>
    <style TYPE="text/css"> 
      .A {
         color: red;
         background: red;
         font-family: monospace;
         font-size: 40px;
      }
      .C {
         color: green;
         background: green;
         font-family: monospace;
         font-size: 40px;
      }
      .G {
         color: orange;
         background: orange;
         font-family: monospace;
         font-size: 40px;
      }
      .T { 
         color: blue;
         background: blue;
         font-family: monospace;
         font-size: 40px;
      }
    </style>
    </head>
    
  • The result file should look like:

    <head>
    <style TYPE="text/css"> 
      .A {
         color: red;
         background: red;
         font-family: monospace;
         font-size: 40px;
      }
      .C {
         color: green;
         background: green;
         font-family: monospace;
         font-size: 40px;
      }
      .G {
         color: orange;
         background: orange;
         font-family: monospace;
         font-size: 40px;
      }
      .T { 
         color: blue;
         background: blue;
         font-family: monospace;
         font-size: 40px;
      }
    </style>
    </head>
    
    <span class="A">A</span><span class="G">G</span><span class="G">G</span><span class="C">C</span><span class="T">T</span><span class="T">T</span><span class="T">T</span><span class="A">A</span><span class="G">G</span><span class="t">t</span><span class="g">g</span><span class="c">c</span><span class="a">a</span>
    
  • Open in a web browser

  • example
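
If you would rather do the sed step and the stylesheet in one go, the same idea can be sketched in Python. This is only a rough sketch, assuming Python 3 and nothing beyond the standard library; the function name `seq_to_html`, the `size_px` parameter and the fallback class `N` for non-ACGT symbols are assumptions made for illustration, not part of the hack above. Unlike the sed version, it upper-cases the class name, so lower-case bases pick up the same rule.

```python
import html

# Same colours as the stylesheet above.
COLOURS = {"A": "red", "C": "green", "G": "orange", "T": "blue"}


def seq_to_html(seq, size_px=40):
    """Return a standalone HTML page with one coloured <span> per base."""
    css = [
        f"span {{ font-family: monospace; font-size: {size_px}px; }}",
        ".N { color: black; background: black; }",  # assumed fallback for non-ACGT symbols
    ]
    for base, colour in COLOURS.items():
        # Text and background share a colour, so each letter shows as a solid block.
        css.append(f".{base} {{ color: {colour}; background: {colour}; }}")

    spans = []
    for ch in seq:
        if ch.isspace():
            continue  # ignore newlines and spaces in the input
        cls = ch.upper() if ch.upper() in COLOURS else "N"
        spans.append(f'<span class="{cls}">{html.escape(ch)}</span>')

    return (
        "<!DOCTYPE html>\n<html><head><style>\n"
        + "\n".join(css)
        + "\n</style></head>\n<body>"
        + "".join(spans)
        + "</body></html>\n"
    )
```

Writing the returned string to a file, e.g. `open("seq.html", "w").write(seq_to_html("AGGCTTTAGtgca"))`, gives a page that can be opened in a browser like the one above.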
modified 6 months ago by RamRS • written 7.5 years ago by Giovanni M Dall'Olio

Thanks. Sorry I didn't make it clearer. This is what I meant it to look like: http://realtamortgage.com/gfx/colors.gif

written 7.5 years ago by John

@John: ah, ok! well, you can simply add a background of the same color. I'll update the examples.

written 7.5 years ago by Giovanni M Dall'Olio

Thanks Giovanni!

written 7.5 years ago by John

Sorry, I forgot that you should also use a Monospace font.

written 7.5 years ago by Giovanni M Dall'Olio
3
Madelaine Gogol (Kansas City) wrote 7.5 years ago:

Well... I'm not sure what you mean by "on a figure", but in the past I've done this with HTML, as Giovanni said, or (slightly dumber) with a script that puts <font color="#FF0000"></font> around all the A's, or whatever.

In R, you can use the text() command to put text on a plot, or just use pch='A' and col='red' to make the points on a plot red A's, for example.

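seq<-"ATCGTACG"
seqlist<-strsplit(seq,"")
# fix the level order so each base always gets the same colour: A=red, C=green, G=purple, T=blue
bases<-factor(seqlist[[1]],levels=c('A','C','G','T'))
cols<-c('red','green','purple','blue')
plot(1:length(bases),rep(1,times=length(bases)),pch=as.character(bases),col=cols[bases])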

This would take some fiddling for a longer sequence, but you get the idea.

EDIT: after reading your comment above, it's actually easier.

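seq<-"ATCGTACG"
seqlist<-strsplit(seq,"")
# fixed level order: A=red, C=green, G=purple, T=blue; zlim keeps the colours stable even if a base is absent
bases<-factor(seqlist[[1]],levels=c('A','C','G','T'))
cols<-c('red','green','purple','blue')
image(matrix(as.numeric(bases)),col=cols,zlim=c(1,4))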
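
For a longer sequence, one way around the fiddling is to wrap the squares onto several rows. The sketch below does that outside R by writing an SVG file, assuming Python 3 with only the standard library; `seq_to_svg`, `per_row` and `cell` are hypothetical names, and the colours follow the question (A = red, T = blue, C = green), with purple assumed for G and black for anything else.

```python
# Colours from the question; G = purple is an assumption.
COLOURS = {"A": "red", "T": "blue", "C": "green", "G": "purple"}


def seq_to_svg(seq, per_row=60, cell=10):
    """Return an SVG document with one filled square per base, wrapped into rows."""
    bases = [b.upper() for b in seq if not b.isspace()]
    rows = max(1, -(-len(bases) // per_row))  # ceiling division, at least one row
    width = cell * min(len(bases), per_row)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{rows * cell}">'
    ]
    for i, base in enumerate(bases):
        x = (i % per_row) * cell   # column within the current row
        y = (i // per_row) * cell  # row index, counted from the top
        fill = COLOURS.get(base, "black")  # unknown symbols such as N are drawn black
        parts.append(
            f'<rect x="{x}" y="{y}" width="{cell}" height="{cell}" fill="{fill}"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Saving the result, e.g. `open("seq.svg", "w").write(seq_to_svg("ATCGTACG"))`, gives a file that a web browser can display directly.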
modified 6 months ago by RamRS • written 7.5 years ago by Madelaine Gogol

Thanks. I will try it.

written 7.5 years ago by John
3
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 7.5 years ago:

A quick hack (not fully tested, but you'll get the idea): the following C program will generate a PostScript file with the colored rectangles:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include <ctype.h>

int main(int argc,char** argv)
    {
    int i,j,k,n;
    double SIZE=500.0;
    double side=0;
    int c;
    int len=0;
    char* s=malloc(sizeof(char));

    if(s==NULL) return EXIT_FAILURE;
    while((c=fgetc(stdin))!=EOF)
        {
        if(isspace(c)) continue;
        s=realloc(s,sizeof(char)*(len+2));
        if(s==NULL)
            {
            fprintf(stderr,"Out of memory\n");
            return EXIT_FAILURE;
            }
        s[len++]=c;
        }
    s[len]=0;
    if(len==0) return EXIT_FAILURE;
    n=ceil(sqrt(len));
    side=SIZE/n;
    k=0;
    printf("%%!PS\n");
    /* use the computed box size so the whole grid fits within SIZE points */
    printf("/dside %f def\n",side);
    printf("/box { 2 dict begin /y exch def /x exch def "
        "newpath " 
        "y dside mul x dside mul moveto "
        "dside 0 rlineto "
        "0 dside rlineto "
        "dside -1 mul 0 rlineto "
        "0 dside -1 mul rlineto "
        "closepath "
        "fill "
        "  end} bind def\n");
    printf("/red   {  1 0 0 setrgbcolor  box } bind def\n");
    printf("/green {  0 1 0 setrgbcolor  box } bind def\n");
    printf("/blue  {  0 0 1 setrgbcolor  box } bind def\n");
    printf("/yellow  {  1 1 0 setrgbcolor  box } bind def\n");
    printf("/black {  0 0 0 setrgbcolor  box } bind def\n");
    for(i=0;i< n && k<len;i++)
        {
        for(j=0;j<n && k<len;++j)
            {
            printf("%d %d",i,j);
            switch(toupper(s[k++]))
                {
                case 'A': fputs(" red\n",stdout); break;
                case 'T': fputs(" green\n",stdout); break;
                case 'C': fputs(" yellow\n",stdout); break;
                case 'G': fputs(" blue\n",stdout); break;
                default: fputs(" black\n",stdout); break;
                }
            }
        }
    printf("showpage\n");
    return 0;
    }

Compilation:

gcc -o biostar12763 -Wall source.c -lm

Execution:

echo "ATAGCTAGCATCAGTCTAGCTTAGCTAGCGCNNACTAGCT" | ./biostar12763   > file.ps
ghostview file.ps ## or evince file.ps or... etc...
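
If a raster image suits the pipeline better than PostScript, the same near-square layout (n = ceil(sqrt(len)) columns) can be sketched in Python as a binary PPM file. This is an illustration rather than a port of the program above: it assumes Python 3.8+ (for `math.isqrt`), `seq_to_ppm` and `cell` are made-up names, and the colours follow the switch statement (A red, T green, C yellow, G blue, anything else black). It fills the grid row by row from the top-left corner and leaves unused cells white.

```python
import math

# RGB triples for the colour names used in the switch statement.
RGB = {"A": (255, 0, 0), "T": (0, 255, 0), "C": (255, 255, 0), "G": (0, 0, 255)}
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def seq_to_ppm(seq, cell=20):
    """Return binary PPM (P6) bytes with one cell x cell square per base."""
    bases = [b.upper() for b in seq if not b.isspace()]
    if not bases:
        raise ValueError("empty sequence")
    n = math.isqrt(len(bases) - 1) + 1  # smallest n with n * n >= len(bases)
    rows = -(-len(bases) // n)          # ceiling division: rows actually needed
    pixels = bytearray()
    for r in range(rows):
        line = bytearray()
        for c in range(n):
            k = r * n + c
            rgb = RGB.get(bases[k], BLACK) if k < len(bases) else WHITE
            line += bytes(rgb) * cell   # one square is `cell` pixels wide
        pixels += line * cell           # and `cell` pixel rows tall
    header = f"P6\n{n * cell} {rows * cell}\n255\n".encode("ascii")
    return header + bytes(pixels)
```

The bytes can be written with `open("file.ppm", "wb").write(seq_to_ppm(seq))`; tools that read the Netpbm formats should be able to view or convert the result.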
modified 6 months ago by RamRS • written 7.5 years ago by Pierre Lindenbaum
2
Alastair Kerr (The University of Edinburgh, UK) wrote 7.5 years ago:

JalView is excellent for creating figures of proteins and nucleotides. Even if you do not have an alignment, you can still enter a single sequence. Lots of export options as well including wrapped text and export to a pdf.

written 7.5 years ago by Alastair Kerr

Thanks. A little different than what I want.

written 7.5 years ago by John
1
Yumtaoist wrote 7.6 years ago:

I don't know how to do this in R, but I think you can open the sequences in MEGA or ClustalX, where the nucleotides are colored, and then take a screenshot.

written 7.6 years ago by Yumtaoist

Thanks for the suggestion. I am trying to make this into a pipeline. Fewer separate programs would be better for me.

written 7.5 years ago by John
0
David W (New Zealand) wrote 7.5 years ago:

This is pretty hacktastic, but I don't know a better way

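library(ggplot2)
dna <- "ATAGCATCGACTAG"
bases <- unlist(strsplit(dna, ""))
col_scheme <- c("red", "green", "yellow", "blue")
names(col_scheme) <- c("A", "T" ,"C", "G")
# qplot() is deprecated in recent ggplot2 releases, so build the plot with ggplot() instead
df <- data.frame(pos = seq_along(bases), fill = unname(col_scheme[bases]), stringsAsFactors = FALSE)
ggplot(df, aes(x = pos, y = 1, fill = fill)) + geom_tile() + scale_fill_identity()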

You should really think about whether you want to use (simple) red and green in the same plot: roughly 8% of males can't tell them apart. Is there a good reason for colouring these bases but ignoring those people?
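
One option, offered only as a suggestion, is the Okabe-Ito palette, which was designed to stay distinguishable under the common forms of colour blindness. A minimal Python sketch follows; which base gets which colour is an arbitrary assumption here, and `base_colours` is a hypothetical helper.

```python
# Four colours from the Okabe-Ito palette; the base-to-colour assignment is arbitrary.
SAFE_COLOURS = {
    "A": "#D55E00",  # vermillion
    "C": "#0072B2",  # blue
    "G": "#F0E442",  # yellow
    "T": "#009E73",  # bluish green
}


def base_colours(seq, palette=SAFE_COLOURS, other="#000000"):
    """Return one hex colour per base; symbols not in the palette get `other`."""
    return [palette.get(b.upper(), other) for b in seq if not b.isspace()]
```

The hex strings can be passed unchanged to `col=` in base R or used with `scale_fill_identity()` in ggplot2.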

modified 6 months ago by RamRS • written 7.5 years ago by David W

whoops, hadn't noticed mmarchin's answer which is more or less the same as this, but with base graphics and is less hack-tastic :)

written 7.5 years ago by David W
0
Ibrahim Tanyalcin (Belgium) wrote 3.4 years ago:

Dear,

If you are working on proteins, you can use I-PV just as shown in the link here.

You will need to make your sequence file in a txt editor, ms excel etc. Here is an example.

You can visit the main website for more information.

I hope this helps,

Good luck with your research,

modified 6 months ago by RamRS • written 3.4 years ago by Ibrahim Tanyalcin
Powered by Biostar version 2.3.0