Question: combining rows in sequence file
0
Tohamy80 wrote:

Dear All, I do not have much experience with R and I need your help.

I have a data frame like this:
1_1  A  B  C  D
1_2  a  b  c  d
2_1  E  F  G  H
2_2  e  f  g  h
3_1  I  J  K  L
3_2  i  j  k  l

I want to combine each pair of rows like this:

1  A a B b C c D d
2  E e F f G g H h
3  I i J j K k L l

How can I do this?

sequencing R • 1.4k views
modified 6.0 years ago by Michael Lawrence90 • written 6.0 years ago by Tohamy80

Given the initial 6x4 matrix, do you want a 3x4 or a 3x8 matrix output? That's rather ambiguous from your presentation. I assume you want the latter, but perhaps not.

2
Michael Lawrence90 wrote:

Let's assume that your data.frame is homogeneous in type and can be coerced to a matrix:

`m <- as.matrix(df)`

Then, the most efficient route is array manipulation:

```
a <- aperm(array(m, c(2L, nrow(m)/2L, ncol(m))), c(1L, 3L, 2L))
m2 <- matrix(a, dim(a)[3L], byrow=TRUE)
```
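To make the reshaping easier to follow, here is a small sketch that runs the same steps on the toy data from the question. The intermediate name `a0` and the hand-built matrix are assumptions added for illustration; the logic is the two lines above, split into steps.

```r
# Toy data from the question: two rows per individual, four columns.
m <- matrix(c("A", "B", "C", "D",
              "a", "b", "c", "d",
              "E", "F", "G", "H",
              "e", "f", "g", "h",
              "I", "J", "K", "L",
              "i", "j", "k", "l"),
            nrow = 6, byrow = TRUE,
            dimnames = list(c("1_1", "1_2", "2_1", "2_2", "3_1", "3_2"), NULL))

# Step 1: view the 6 x 4 matrix as a 2 x 3 x 4 array,
# indexed by (row within pair, individual, column).
a0 <- array(m, c(2L, nrow(m) / 2L, ncol(m)))

# Step 2: reorder the dimensions to (row within pair, column, individual).
a <- aperm(a0, c(1L, 3L, 2L))

# Step 3: flatten each individual's 2 x 4 slice into one row of 8 values.
m2 <- matrix(a, dim(a)[3L], byrow = TRUE)
m2
```

Because R stores arrays column by column, each individual's slice is read off as first row, second row, first row, second row, which gives the interleaved `A a B b C c D d` order.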

Hey Mr Lawrence,
Your script works perfectly with the toy data set, but when I try to use it with my real data it does not work so well. My data consists of 174 individuals. Each individual has two lines, and each line contains 120693 SNPs:

> m<-as.matrix(h)
> a <- aperm(array(m, c(2L, nrow(m)/2L, ncol(m))), c(1L, 3L, 2L))
> m2 <- matrix(a, dim(a)[3L], byrow=TRUE)
> View(m2)

It gives me one line for each individual, but the two rows are kept separate, like this:

Ind1_1  T C G C T   Ind1_2  T C G C T

Ind2_1  T C G C T   Ind2_2  T C G C T

Ind3_1  T C G C T   Ind3_2  T C G C T

and not interleaved as in the toy example:

> m3 <- matrix(a, dim(a)[3L], byrow=TRUE)
> m3
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] "A"  "a"  "B"  "b"  "C"  "c"  "D"  "d"
[2,] "E"  "e"  "F"  "f"  "G"  "g"  "H"  "h"
[3,] "J"  "j"  "K"  "k"  "L"  "l"  "M"  "m"

Note: It runs without any error or warning messages on my real data, but in the end it does not give me the required format.

Thanks, I really appreciate your help.

Best

It's generally more useful if you say how it doesn't match the format you need (and then show the output and what it should be like).
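Since the real input is not shown, the cause cannot be pinned down from this thread. As a sketch only, the hypothetical helper below (`combine_pairs` is not from the answer above) assumes that `h` is a data.frame whose first column holds IDs such as `Ind1_1` and `Ind1_2` and whose remaining columns hold one SNP each. It keeps the ID column out of the matrix and checks that the two rows of each individual are adjacent before reshaping.

```r
# Hypothetical helper; assumes the first column of `h` holds IDs like
# "Ind1_1", "Ind1_2" and every other column holds one SNP.
combine_pairs <- function(h) {
  ids <- as.character(h[[1L]])
  ind <- sub("_[^_]*$", "", ids)            # drop the "_1" / "_2" suffix
  m <- as.matrix(h[, -1L, drop = FALSE])    # keep only the SNP columns
  stopifnot(nrow(m) %% 2L == 0L)            # two rows per individual
  stopifnot(all(ind[c(TRUE, FALSE)] == ind[c(FALSE, TRUE)]))  # pairs are adjacent
  a <- aperm(array(m, c(2L, nrow(m) / 2L, ncol(m))), c(1L, 3L, 2L))
  m2 <- matrix(a, dim(a)[3L], byrow = TRUE)
  rownames(m2) <- ind[c(TRUE, FALSE)]       # one label per individual
  m2
}

# Made-up example input, only to show the expected shape.
h <- data.frame(id   = c("Ind1_1", "Ind1_2", "Ind2_1", "Ind2_2"),
                snp1 = c("T", "t", "G", "g"),
                snp2 = c("C", "c", "A", "a"),
                stringsAsFactors = FALSE)
res <- combine_pairs(h)
res
```

If either check fails on the real file, the input layout differs from what the reshaping expects, which would be worth reporting along with the output of `dim(h)` and `str(h)`.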

1
Sukhdeep Singh10k wrote:

This is not a bioinformatics question, but you can use `apply`.

The way the question is formatted makes it a bit hard to understand what exactly you want to do.

Assuming your data is as regular as the few lines you showed:

```
apply( dat[ , colnames(dat) ] , 1 , paste , collapse = " " )
```
where `dat` is your matrix.

The code might change with the full matrix, but I leave it to you to implement.

First of all, thanks for your help, but

`apply( dat[ , colnames(dat) ] , 1 , paste , collapse = " " )`

does not work well in my case, or maybe I did not explain my question very well.

I need to combine the two rows for each individual into one row. For example, I need 3 rows instead of 6, not a single vector like this:

      1_1       1_2       2_1       2_2       3_1       3_2
"A B C D" "a b c d" "E F G H" "e f g h" "J K L M" "j k l m"

Also, it does not merge the columns the way I need. I need the output to look like A a B b C c D d, and so on for the other individuals.

Thank you so much, and sorry for disturbing you, but I need to do this for a large file that contains 174 individuals and 120000 SNPs.
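One way to get one interleaved string per individual with `paste` is sketched below. It assumes that the row names follow the `<individual>_<copy>` pattern used in this thread and that the two rows of each individual appear in the order `_1`, `_2`; the names `ind`, `rows` and `merged` are made up for the example.

```r
# The small `dat` matrix used in this thread, rebuilt by hand.
dat <- matrix(c("A", "B", "C", "D",
                "a", "b", "c", "d",
                "E", "F", "G", "H",
                "e", "f", "g", "h",
                "J", "K", "L", "M",
                "j", "k", "l", "m"),
              nrow = 6, byrow = TRUE,
              dimnames = list(c("1_1", "1_2", "2_1", "2_2", "3_1", "3_2"),
                              c("V2", "V3", "V4", "V5")))

# Individual label = row name without the "_1" / "_2" suffix.
ind <- sub("_[^_]*$", "", rownames(dat))

# Row indices grouped by individual, in order of first appearance.
rows <- split(seq_len(nrow(dat)), factor(ind, levels = unique(ind)))

# A 2 x n block is read column by column, so pasting it interleaves the rows.
merged <- sapply(rows, function(i) paste(dat[i, , drop = FALSE], collapse = " "))
merged
```

This returns a named character vector with one element per individual. For downstream work with 174 individuals and about 120000 SNPs, keeping the result as a matrix (as in the `aperm` answer) is probably more convenient than one long string per individual.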

> dat
    V2 V3 V4 V5
1_1  A  B  C  D
1_2  a  b  c  d
2_1  E  F  G  H
2_2  e  f  g  h
3_1  J  K  L  M
3_2  j  k  l  m
> l<- apply( dat[ , colnames(dat) ] , 1 , paste , collapse = " " )
> l
      1_1       1_2       2_1       2_2       3_1       3_2
"A B C D" "a b c d" "E F G H" "e f g h" "J K L M" "j k l m"