Build a matrix from a list using R or Linux
3.9 years ago
hosin • 0

Hello there, I have a large file, list.txt, that contains 2000 samples and 18000 coordinates in the format shown below.

Coordinates                 Sample    Values
chr1:110238914-110324454    SampleB   1
chr1:110238914-110324454    SampleC   3
chr1:110238914-110324454    SampleD   1
chr5:65562670-65627908      SampleD   1
chr5:65562670-65627908      SampleA   1
chr5:65562670-65627908      SampleB   4
chr5:65562670-65627908      SampleC   1
chr2:158248715-158335919    SampleB   1
chr2:158248715-158335919    SampleA   0
chr2:158248715-158335919    SampleC   1

I want to build a matrix from the file above, with the coordinates as row names and the samples as column names. If a coordinate has a value for a sample, that value goes in the matrix; if a coordinate has no value for a sample, the cell should be 2. The result should look like this:

Coordinates                 SampleA   SampleB   SampleC   SampleD
chr1:110238914-110324454    2         1         3         1
chr5:65562670-65627908      1         4         1         1
chr2:158248715-158335919    0         1         1         2

I would really appreciate any script, for Linux/bash (preferably) or R, that produces this result.


R • 2.5k views

Relevant post from SO - "reshape long to wide":

3.9 years ago
Ram 44k

You can use tidyr::pivot_wider to get to what you need. Your problem here is the simplest case, so figuring out the exact usage from the manual should be easy enough.


In case the OP runs into trouble, here's the exact code for their data:

df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values")
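For anyone who wants to try the call without the full file, here is a minimal sketch on a few rows copied from the question. The object names `long` and `wide` are my own, chosen so that nothing depends on an object called `df` already existing. Note that coordinate/sample pairs that are absent from the input come out as NA at this stage.

```r
library(tidyr)

# A few rows copied from the question; a real run would read list.txt instead.
long <- read.table(text = "
Coordinates Sample Values
chr1:110238914-110324454 SampleB 1
chr1:110238914-110324454 SampleC 3
chr1:110238914-110324454 SampleD 1
chr2:158248715-158335919 SampleB 1
chr2:158248715-158335919 SampleA 0
chr2:158248715-158335919 SampleC 1
", header = TRUE, stringsAsFactors = FALSE)

# One row per coordinate, one column per sample.
wide <- pivot_wider(long, names_from = "Sample", values_from = "Values")

# chr1 has no SampleA row and chr2 has no SampleD row, so those cells are NA.
print(wide)
```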


Thanks for this information. How do I put 2 for samples that do not have the coordinate?


One way is to apply tidyr::complete before reshaping, so that every coordinate/sample pair has a row.
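A rough sketch of the complete-then-reshape route, assuming the column names from the question and a tiny made-up subset of rows. `tidyr::complete` with its `fill` argument adds the missing coordinate/sample pairs with a value of 2, and the reshape then has nothing left to fill. The names `long`, `filled` and `wide` are illustrative only.

```r
library(tidyr)

# Tiny subset in the question's long format; chr2 has no SampleC row.
long <- data.frame(
  Coordinates = c("chr1:110238914-110324454",
                  "chr1:110238914-110324454",
                  "chr2:158248715-158335919"),
  Sample = c("SampleB", "SampleC", "SampleB"),
  Values = c(1, 3, 1),
  stringsAsFactors = FALSE
)

# Add a row for every Coordinates x Sample pair that is absent, with Values = 2.
filled <- complete(long, Coordinates, Sample, fill = list(Values = 2))

# Reshape as before; no cell is missing any more.
wide <- pivot_wider(filled, names_from = "Sample", values_from = "Values")
print(wide)
```

This gives the same cells as filling during the reshape itself; it is mainly useful when the completed long table is wanted for something else as well.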


If I understand you correctly, you can add the argument values_fill = 2.


df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values", values_fill = 2)

Alright, I found it in the manual. Thanks anyway, all.


Yes, maybe edit the post, and add your complete solution.
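Pulling the pieces of this thread together, a complete run might look roughly like the sketch below. It uses the ten rows from the question inline; the last step, turning the wide table into a matrix with coordinates as row names and the samples in alphabetical order, is my own addition based on what the question asks for. The names `long`, `wide` and `mat` are illustrative.

```r
library(tidyr)

# The ten rows from the question; a real run would read list.txt instead.
long <- read.table(text = "
Coordinates Sample Values
chr1:110238914-110324454 SampleB 1
chr1:110238914-110324454 SampleC 3
chr1:110238914-110324454 SampleD 1
chr5:65562670-65627908 SampleD 1
chr5:65562670-65627908 SampleA 1
chr5:65562670-65627908 SampleB 4
chr5:65562670-65627908 SampleC 1
chr2:158248715-158335919 SampleB 1
chr2:158248715-158335919 SampleA 0
chr2:158248715-158335919 SampleC 1
", header = TRUE, stringsAsFactors = FALSE)

# Reshape long to wide, writing 2 where a coordinate has no value for a sample.
wide <- pivot_wider(long, names_from = "Sample", values_from = "Values",
                    values_fill = 2)

# Convert to a plain matrix: coordinates as row names, samples as columns.
wide <- as.data.frame(wide)
mat <- as.matrix(wide[, -1])
rownames(mat) <- wide$Coordinates
mat <- mat[, sort(colnames(mat))]  # SampleA, SampleB, SampleC, SampleD
print(mat)
```

On these rows the cells agree with the expected table in the question, including the two filled-in 2s (chr1/SampleA and chr2/SampleD).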


Sorry for the repeated messages. I keep getting this error:

Error in UseMethod("tbl_vars") : 
  no applicable method for 'tbl_vars' applied to an object of class "function"

Can you post your current code here?

df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values", values_fill = 2)

What code are you using to define df?


I just followed the manual (https://tidyr.tidyverse.org/reference/pivot_wider.html) as mentioned above; that is the exact code. I did not define a data frame. This is our input file:

Coordinates                 Sample    Values
chr1:110238914-110324454    SampleB   1
chr1:110238914-110324454    SampleC   3
chr1:110238914-110324454    SampleD   1
chr5:65562670-65627908      SampleD   1
chr5:65562670-65627908      SampleA   1
chr5:65562670-65627908      SampleB   4
chr5:65562670-65627908      SampleC   1
chr2:158248715-158335919    SampleB   1
chr2:158248715-158335919    SampleA   0
chr2:158248715-158335919    SampleC   1

How are you defining the data frame on which you're running the pivot_wider? Please show us as much of your code as you can, or we cannot really help you.

> mydat=read.table(file.choose())
> df <- read.table("list.txt", sep="\t", header=TRUE, stringsAsFactors=FALSE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'list.txt': No such file or directory

You need to read the file into a data.frame named 'df' first:

df <- read.table("file.txt", sep="\t", header=TRUE, stringsAsFactors=FALSE)

Change the file name and delimiter as appropriate.
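This most likely also explains the 'tbl_vars' error about an object of class "function". In a session where nothing has been assigned to `df`, the name resolves to the base function `stats::df` (the density of the F distribution), so the reshape call receives a function rather than a table. A small sketch of that, where `fresh_df` and `check_is_table` are hypothetical names of my own:

```r
# What `df` means before any assignment: the F-distribution density from stats.
fresh_df <- get("df", envir = asNamespace("stats"))
print(is.function(fresh_df))

# Hypothetical guard to run before reshaping: fail early with a clearer message.
check_is_table <- function(x) {
  if (!is.data.frame(x)) {
    stop("expected a data frame, got an object of class ", class(x)[1])
  }
  invisible(x)
}

# Once a data frame is assigned, `df` masks the function and the guard passes.
df <- data.frame(Coordinates = "chr1:110238914-110324454",
                 Sample = "SampleB", Values = 1,
                 stringsAsFactors = FALSE)
check_is_table(df)
print(is.data.frame(df))
```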


Sorry, previously I got this error:

> mydat=read.table(file.choose())
> df <- read.table("list.txt", sep="\t", header=TRUE, stringsAsFactors=FALSE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'list.txt': No such file or directory

That's a problem you can solve yourself using some Google. You're having problems reading the dataset, not processing it.
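For what it's worth, "cannot open file 'list.txt': No such file or directory" means R found no file of that name in its current working directory. Also note that the console output above stores the file picked with file.choose() in `mydat`, not `df`, and reads it without header=TRUE. A hedged sketch of one way around both, using a temporary stand-in file because the real location of list.txt is not known here:

```r
# Stand-in for list.txt so the sketch runs anywhere (tab-separated, with header).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Coordinates\tSample\tValues",
  "chr1:110238914-110324454\tSampleB\t1",
  "chr1:110238914-110324454\tSampleC\t3"
), path)

# In an interactive session, `path <- file.choose()` would return the full
# path of the file you pick, which avoids working-directory problems.
if (!file.exists(path)) {
  stop("Cannot find ", path, "; the working directory is ", getwd())
}

# Read into `df`, the name the reshape call in this thread expects.
df <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
str(df)
```

Alternatively, setwd() to the folder that holds list.txt, or pass the full path to read.table directly.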
