Question: Colouring a string
davide.chiarugi wrote 10 weeks ago:

Imagine I have a DNA sequence (e.g., a short dummy one, ATTATGCGGGGAATTT) and I would like to colour the nucleotides with different colours, given vectors indicating the positions of the nucleotides to be coloured:

(1,3,6,7,8,10) <- to be coloured in red

(2,4,5,12) <- to be coloured in green

How would you do that (if you don't want to do it manually)?


I had this idea once for schoolboys, and the only way I found was to go through HTML code:

<head>
<style TYPE="text/css"> 
    .A {
        color: red;
        font-family: monospace;
        font-size: 88px;
    }
    .C {
        color: green;
        font-family: monospace;
        font-size: 88px;
    }
    .G {
        color: orange;
        font-family: monospace;
        font-size: 88px;
    }
    .T { 
        color: blue;
        font-family: monospace;
        font-size: 88px;
    }
</style>

</head>
<span class="G">G</span><span class="C">C</span><span class="A">A</span><span class="T">T</span><span class="G">G</span><span class="C">C</span><span class="T">T</span><span class="A">A</span><span class="G">G</span><span class="C">C</span><span class="A">A</span><span class="G">G</span><span class="C">C</span><span class="T">T</span><span class="G">G</span><span class="T">T</span><span class="C">C</span><span class="A">A</span><span class="C">C</span><span class="G">G</span>

The colour is managed by the CSS class.

This is hard-coded, but you can write a function that takes a sequence and generates the appropriate <span class="X">X</span> for each base.
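
A minimal Python sketch of such a function, assuming the class names match the .A/.C/.G/.T rules in the stylesheet above. The name `sequence_to_spans` is hypothetical, not part of any library:

```python
from html import escape


def sequence_to_spans(sequence: str) -> str:
    """Wrap each base in a <span> whose CSS class is the base letter itself."""
    spans = []
    for base in sequence.upper():
        safe = escape(base)  # guard against characters that are special in HTML
        spans.append(f'<span class="{safe}">{safe}</span>')
    return "".join(spans)


# Paste the result after the </head> block to get the coloured sequence.
print(sequence_to_spans("GCATGCTAGCAGCTGTCACG"))
```

Any character without a matching CSS rule (an N, for instance) would simply be shown in the browser's default colour.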

modified 9 weeks ago • written 10 weeks ago by Bastien Hervé

Do you want to use multiple different colour arrays?

written 10 weeks ago by jrj.healey

Russ (Ontario Veterinary College, University of Guelph, Guelph, Ontario, Canada) wrote 10 weeks ago:

Here's a solution using R:

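library(crayon)

# Split the sequence into one character per row
string <- "ATTATGCGGGGAATTT"
sp <- strsplit(string, split = "")[[1]]
df <- data.frame(nucleotide = sp, stringsAsFactors = FALSE)

# 1-based positions to colour
redVector <- c(1, 3, 6, 7, 8, 10)
greenVector <- c(2, 4, 5, 12)

# Copy the bases, then wrap the selected positions in colour codes
df$ntColored <- df$nucleotide
df[redVector, "ntColored"] <- red(df[redVector, "ntColored"])
df[greenVector, "ntColored"] <- green(df[greenVector, "ntColored"])

# sep = "" prints the sequence without spaces between the bases
cat(df$ntColored, sep = "")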

edit: just for fun, it's also easy to colour by letter:

# Nested ifelse() picks the coloured letter for each row;
# anything that is not A, C or G is treated as T
df$byLetter <- ifelse(df$nucleotide == "A", blue("A"),
               ifelse(df$nucleotide == "C", red("C"),
               ifelse(df$nucleotide == "G", green("G"),
                      yellow("T"))))

cat(df$byLetter, sep = "")

Thank you so much: I am actually doing the rest of the analysis in R, and this turns out to be the best solution for me!

written 10 weeks ago by davide.chiarugi

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they all work.

modified 10 weeks ago • written 10 weeks ago by genomax

And now, the next step: how would you save the output to an (image) file? With the options I have tried, I can only export a black string and not the coloured one...

written 9 weeks ago by davide.chiarugi

How do you intend to use the output? The answers here depend on so-called ‘escape’ sequences: invisible characters that colour-capable terminals interpret as formatting instructions.

These characters are not supported in every possible application though. I’m not personally aware of any image editors that support them natively.

The best solution I can think of is a screenshot?

You can view the escape characters by piping the output of the tools (maybe not the R one, I’m not 100% sure how that one works) to cat -v

E.g.

$ Colorise_script -arg ATGC | cat -v
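
To make the point concrete, here is a small Python sketch (an illustration only; `strip_ansi` and the sample string are made up for this example). It shows that the colour lives in extra characters around each base, which is why a plain-text export keeps the letters but not the colours:

```python
import re

# A red "A" and a green "T" followed by two plain bases, written with explicit escapes.
coloured = "\x1b[31mA\x1b[0m\x1b[32mT\x1b[0mGC"

# repr() makes the otherwise invisible escape characters visible, much like `cat -v`.
print(repr(coloured))

# Colour sequences have the shape ESC [ <digits and semicolons> m
ANSI_COLOUR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove colour escape sequences, leaving only the visible characters."""
    return ANSI_COLOUR.sub("", text)


print(strip_ansi(coloured))
```

The pattern only covers colour/style sequences ending in "m"; other terminal control sequences would need a broader expression.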
modified 9 weeks ago • written 9 weeks ago by jrj.healey

Your original question is badly formulated. First you ask for colour, then you say you want R, and now you say you want to save the image...

written 9 weeks ago by Pierre Lindenbaum

Apologies for the bad formulation. I asked for colours in general and different solutions have been proposed. Among these, I have followed what looked the most suitable for me. What I am aiming to do is:

  1. create the coloured string, as reported in the original question
  2. save the string in a file (possibly an image)

Thanks

modified 9 weeks ago by RamRS • written 9 weeks ago by davide.chiarugi

Yes, but what do you actually want to do with it downstream?

Is it just to put in presentations or something?

written 9 weeks ago by jrj.healey

Yes, in the end I would like to obtain a figure to put in a paper/presentation.

written 9 weeks ago by davide.chiarugi

From what I can find, there is no better option than screenshotting the output.

It is theoretically possible to pipe STDOUT from the terminal, as this post explains. The only option to support colours, however, is enscript, which would mean you could only generate PostScript files. enscript's colouration escape sequences are also not the same as an xterm's, so an intermediate script to transliterate everything would be needed.

In short, it's f*cking difficult.

The alternative would be to start from scratch in a language which has some support for creation of images, but this then becomes less about text manipulation and more of a rendering problem, and none of the solutions here are in that vein.
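
As a rough illustration of that "rendering" route, here is a Python sketch that writes the sequence as an SVG image using only the standard library. Everything in it is an assumption made for the example: the function name `sequence_to_svg`, the output file name, the font size, and the guess that a monospace glyph is about 0.6 of the font size wide. It also assumes the sequence contains only plain letters.

```python
def sequence_to_svg(sequence, positions_by_colour, font_size=24, default="black"):
    """Return an SVG document with one <tspan> per base.

    positions_by_colour maps a colour name to a list of 1-based positions.
    """
    # Invert the mapping so each position can be looked up directly.
    colour_at = {
        pos: colour
        for colour, positions in positions_by_colour.items()
        for pos in positions
    }
    tspans = "".join(
        f'<tspan fill="{colour_at.get(i, default)}">{base}</tspan>'
        for i, base in enumerate(sequence, start=1)
    )
    # Rough canvas size; 0.6 is an approximate monospace glyph width.
    width = int(len(sequence) * font_size * 0.6) + 20
    height = font_size * 2
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<text x="10" y="{int(font_size * 1.4)}" font-family="monospace" '
        f'font-size="{font_size}">{tspans}</text></svg>'
    )


svg = sequence_to_svg(
    "ATTATGCGGGGAATTT",
    {"red": [1, 3, 6, 7, 8, 10], "green": [2, 4, 5, 12]},
)
with open("coloured_sequence.svg", "w", encoding="utf-8") as handle:
    handle.write(svg)
```

The resulting file opens in a web browser, and a vector editor such as Inkscape can export it to PNG or PDF for a figure.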

modified 9 weeks ago • written 9 weeks ago by jrj.healey

This is convoluted, but you can use the textGrob function from the grid package in R. You can create a textGrob, which is a ggplot-like object that just contains text. You'll need to figure out the colouring, but once you have created the textGrob, you can ggsave it to get your image.

Good luck!

written 9 weeks ago by RamRS

OP, you should have specified that "save to file" part at the outset. Colors depend on the renderer, not the file itself (of course, image and pdf files are the way to achieve portability). By leaving that out, people have spent their time helping you without the actual goal available to them. This kind of formulation frustrates people and makes them less inclined to help you out subsequently/follow up on questions you might have.

written 9 weeks ago by RamRS

I apologise again for the bad formulation of my question. Initially, I thought that the main problem was to generate the string and not to save the output and, thus, I preferred not to bother people about the second issue. My bad.

written 9 weeks ago by davide.chiarugi

Yeah, I agree with jrj.healey, I'd probably just take a screenshot...

written 9 weeks ago by Russ

Thank you to everyone for your help: in the end, I think I'll opt for the screenshot solution. I'll also try RamRS's suggestion about textGrob and, if I obtain some interesting results, I'll let you know.

Again, thank you !!!

written 9 weeks ago by davide.chiarugi

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 10 weeks ago:

in C using ANSI escape codes
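
For illustration only, the same ANSI-escape idea can be sketched in Python (this is not the C code referred to above; `colour_by_position` and the colour table are names invented for the example). It assumes 1-based positions, as in the question:

```python
# Standard ANSI foreground colour codes and the reset code.
ANSI = {
    "red": "\x1b[31m",
    "green": "\x1b[32m",
    "yellow": "\x1b[33m",
    "blue": "\x1b[34m",
}
RESET = "\x1b[0m"


def colour_by_position(sequence, positions_by_colour):
    """Wrap the bases at the given 1-based positions in ANSI colour codes."""
    code_at = {}
    for colour, positions in positions_by_colour.items():
        for pos in positions:
            code_at[pos] = ANSI[colour]
    pieces = []
    for i, base in enumerate(sequence, start=1):
        code = code_at.get(i)
        # Bases without an assigned colour are left untouched.
        pieces.append(f"{code}{base}{RESET}" if code else base)
    return "".join(pieces)


coloured = colour_by_position(
    "ATTATGCGGGGAATTT",
    {"red": [1, 3, 6, 7, 8, 10], "green": [2, 4, 5, 12]},
)
print(coloured)
```

If a position appears under two colours, the one listed later in the dictionary wins.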

jrj.healey (United Kingdom) wrote 10 weeks ago:

A pure bash option (because I apparently have nothing better to do).

Note that this script will not be particularly forgiving for different specifications on the command line...

# Usage:
#  $  bash col_seq.sh <Sequence> <red> <green> <yellow> <blue>
#
# Indexes must be provided as a comma separated quoted string, e.g:
#  $  bash col_seq.sh ATGTACGATCG "1,2" "3,4" "5,6" "7,8"
#
# You can omit a colour, but you will need to specify empty quotes: ""
#  $  bash col_seq.sh ATGTACGATCG "1,2" "3,4" "" "7,8"

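# Return 0 if index $1 is in the space-separated list $2, else 1.
in_array() {
  local e
  for e in $2 ; do
    if [[ "$e" == "$1" ]] ; then
      return 0
    fi
  done
  return 1
}

# Each colour function wraps its argument in an ANSI colour code and a reset.
# The text is passed through %s so a "%" in the input is not read as a format.
red(){
  printf '\e[31m%s\e[0m' "$1"
}
green(){
  printf '\e[32m%s\e[0m' "$1"
}
yellow(){
  printf '\e[33m%s\e[0m' "$1"
}
blue(){
  printf '\e[34m%s\e[0m' "$1"
}

# Upper-case the sequence and split each index list on commas.
# Argument order matches the usage line: red, green, yellow, blue.
string=$(echo "$1" | tr '[:lower:]' '[:upper:]')
IFS=',' read -r -a Rarray <<< "$2"
IFS=',' read -r -a Garray <<< "$3"
IFS=',' read -r -a Yarray <<< "$4"
IFS=',' read -r -a Barray <<< "$5"

# Walk the sequence by 1-based position and print each base in its colour.
for i in $(seq 1 "${#string}") ; do
  if in_array "$i" "${Rarray[*]}" ; then
    red "${string:i-1:1}"
  elif in_array "$i" "${Garray[*]}" ; then
    green "${string:i-1:1}"
  elif in_array "$i" "${Yarray[*]}" ; then
    yellow "${string:i-1:1}"
  elif in_array "$i" "${Barray[*]}" ; then
    blue "${string:i-1:1}"
  else
    printf '%s' "${string:i-1:1}"
  fi
done
printf "\n"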

Now you can do assorted bash magic:

[Screenshot of the terminal output]

Edit:

I got carried away...
