How to arrange column one below another from side by side in a matrix?
Entering edit mode
3.5 years ago
sankadinesh ▴ 20

I have a distance matrix file (Excel format) like this. I want to arrange columns to one below another from side by side. I have attached example file. I will apply your suggestions to larger dataset.

Please give your inputs and suggestions. Thanks in advance

Regards, Dinesh S L

sequence gene sequencing R • 940 views
Entering edit mode

Use awk, not R. Save the file as a TSV and work on that. For each column, save that column's header in a variable and for each non blank row in that column, print the header-variable followed by the "cell" content.

Try the above yourself and let us know if you have specific questions.

Entering edit mode
3.5 years ago
Tm ★ 1.1k

Assuming you have tsv file.

This should work for file with no header in all columns:

awk  'BEGIN {ORS=""} { for (i=1; i<=NF ; i++) {my_dict[i]=my_dict[i]$i"\n"}} END { for (k in my_dict) {print my_dict[k]}}' matrix.txt

However, if there is header in all columns and you want to skip it then you can use it with small modification:

awk 'BEGIN {ORS=""} NR>1 { for (i=1; i<=NF ; i++) {my_dict[i]=my_dict[i]$i "\n"}} END { for (k in my_dict) print my_dict[k] }' matrix.txt

It basically loop over the array's keys and extract the corresponding values.

Entering edit mode

Your solution worked like charm. However the code starts to read from column 23 instead of 1. I have to cut and paste to get it in right order. Thanks a lot Dinesh

Entering edit mode

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.


Entering edit mode
3.5 years ago

Using awk is definitely the way to go if this is all you need to do, but I figured I would post a little R solution since you asked, and in case you or anyone wanted it.

Example data.

df <- structure(list(X1 = c(34L, 2L, 88L, 90L, 52L), X2 = c(NA, 74L, 
80L, 67L, 44L), X3 = c(NA, NA, 65L, 46L, 73L), X4 = c(NA, NA, 
NA, 73L, 71L), X5 = c(NA, NA, NA, NA, 50L)), class = "data.frame", row.names = c(NA, 

> df
  X1 X2 X3 X4 X5
1 34 NA NA NA NA
2  2 74 NA NA NA
3 88 80 65 NA NA
4 90 67 46 73 NA
5 52 44 73 71 50

tidyverse solution


# To read in your excel file.
# df <- read_excel("path/to/file")

df <- df %>%
  pivot_longer(everything(), names_to="column", values_to="value") %>%
  drop_na %>%

You should then have a data.frame from wide to long format.

> df
# A tibble: 15 x 2
   column   val
   <chr>  <int>
 1 X1        34
 2 X1         2
 3 X1        88
 4 X1        90
 5 X1        52
 6 X2        74
 7 X2        80
 8 X2        67
 9 X2        44
10 X3        65
11 X3        46
12 X3        73
13 X4        73
14 X4        71
15 X5        50

Login before adding your answer.

Traffic: 1696 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6