Design a matrix from a list using R or Linux
5 months ago
hosin • 0

Hello there, I have a large file, list.txt, that contains 2000 samples and 18000 coordinates, in the same format as the example below.

Coordinates                 Sample    Values
chr1:110238914-110324454    SampleB   1
chr1:110238914-110324454    SampleC   3
chr1:110238914-110324454    SampleD   1
chr5:65562670-65627908      SampleD   1
chr5:65562670-65627908      SampleA   1
chr5:65562670-65627908      SampleB   4
chr5:65562670-65627908      SampleC   1
chr2:158248715-158335919    SampleB   1
chr2:158248715-158335919    SampleA   0
chr2:158248715-158335919    SampleC   1


I want to build a matrix from the file above, with the coordinates as row names and the samples as column names. If a coordinate has a value for a sample, that value goes in the matrix; if a coordinate has no value for a sample, put 2 in the matrix. The result should look like this:

Coordinates                 SampleA   SampleB   SampleC   SampleD
chr1:110238914-110324454    2         1         3         1
chr5:65562670-65627908      1         4         1         1
chr2:158248715-158335919    0         1         1         2


I would really appreciate any script for Linux/bash (preferably) or R that produces this result.


R • 401 views

Relevant post from SO - "reshape long to wide":

5 months ago
Ram 33k

You can use tidyr::pivot_wider to get to what you need. Your problem here is the simplest case, so figuring out the exact usage from the manual should be easy enough.


In case the OP runs into trouble, here's the exact code for their data:

df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values")


Thanks for this information. How do I put 2 for samples that do not have a value for a coordinate?


We need to apply complete() before reshaping.


If I understand you correctly, you can add the argument values_fill = 2.


df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values", values_fill = 2)

All right, I found it in the manual. Thanks, all.
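For reference, the whole reshape discussed above can be sketched end to end. This is a minimal sketch, not a definitive solution: it assumes the three columns are whitespace-separated, and the inline text is a stand-in for list.txt. The names_sort = TRUE argument is not mentioned in the thread; it is added here only so the sample columns come out in alphabetical order, as in the desired result, and it assumes a reasonably recent tidyr. The final conversion to a matrix with row names is likewise an illustrative extra.

```r
library(tidyr)

# Inline copy of the example rows from the question; in practice the data
# would come from the real file, e.g. read.table("list.txt", header = TRUE).
txt <- "Coordinates Sample Values
chr1:110238914-110324454 SampleB 1
chr1:110238914-110324454 SampleC 3
chr1:110238914-110324454 SampleD 1
chr5:65562670-65627908 SampleD 1
chr5:65562670-65627908 SampleA 1
chr5:65562670-65627908 SampleB 4
chr5:65562670-65627908 SampleC 1
chr2:158248715-158335919 SampleB 1
chr2:158248715-158335919 SampleA 0
chr2:158248715-158335919 SampleC 1
"
df <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Long to wide: one row per coordinate, one column per sample.
wide <- pivot_wider(
  df,
  names_from  = "Sample",
  values_from = "Values",
  values_fill = 2,     # used where a coordinate has no entry for a sample
  names_sort  = TRUE   # assumption: sort the sample columns alphabetically
)

# Optional: convert to a plain matrix with coordinates as row names.
mat <- as.matrix(wide[, -1])
rownames(mat) <- wide$Coordinates
print(mat)
```

Without values_fill, the missing coordinate/sample combinations would be NA rather than 2.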


Sorry for the repeated messages. I keep getting this error:

Error in UseMethod("tbl_vars") :
no applicable method for 'tbl_vars' applied to an object of class "function"


Can you post your current code here?

df <- tidyr::pivot_wider(df, names_from="Sample", values_from="Values", values_fill = 2)


What code are you using to define df?


I just followed the manual mentioned above (https://tidyr.tidyverse.org/reference/pivot_wider.html); that is the exact code. I did not define a data frame. This is our input file:

Coordinates                 Sample    Values
chr1:110238914-110324454    SampleB   1
chr1:110238914-110324454    SampleC   3
chr1:110238914-110324454    SampleD   1
chr5:65562670-65627908      SampleD   1
chr5:65562670-65627908      SampleA   1
chr5:65562670-65627908      SampleB   4
chr5:65562670-65627908      SampleC   1
chr2:158248715-158335919    SampleB   1
chr2:158248715-158335919    SampleA   0
chr2:158248715-158335919    SampleC   1


How are you defining the data frame on which you're running the pivot_wider? Please show us as much of your code as you can, or we cannot really help you.

> mydat=read.table(file.choose())
Error in file(file, "rt") : cannot open the connection
In file(file, "rt") :
cannot open file 'list.txt': No such file or directory


You need to read the file into a data.frame named 'df' first:

df <- read.table("file.txt", sep="\t", header=TRUE, stringsAsFactors=FALSE)

Change the file name and delimiter as appropriate.
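The tbl_vars error reported earlier is consistent with df never having been assigned. In a fresh R session the name df resolves to stats::df, the built-in F-distribution density function, so pivot_wider would receive a function rather than a data frame. A minimal sketch of the check and the fix follows, assuming a tab-delimited file with a header row; the temporary file is a hypothetical stand-in for list.txt.

```r
# Without an assignment, the name `df` refers to stats::df (the F density),
# which is a function, not a data frame.
is.function(stats::df)

# Hypothetical stand-in for list.txt: a few tab-delimited rows in a temp file.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Coordinates\tSample\tValues",
  "chr1:110238914-110324454\tSampleB\t1",
  "chr5:65562670-65627908\tSampleA\t1"
), path)

# Read the file into an object named df; this now masks stats::df.
df <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Confirm that df is a data frame with the expected columns before reshaping.
is.data.frame(df)
str(df)
```

Running is.data.frame(df) or str(df) before calling pivot_wider is a cheap way to confirm that the input is what you think it is.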


Sorry, I previously got this error:

> mydat=read.table(file.choose())
Error in file(file, "rt") : cannot open the connection
In file(file, "rt") :
cannot open file 'list.txt': No such file or directory


That's a problem you can solve yourself using some Google. You're having problems reading the dataset, not processing it.
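As a starting point for that search: a "No such file or directory" message generally means the path R was given does not exist, and relative paths are resolved against the current working directory. Below is a hedged sketch of the usual checks. The helper name read_if_present is hypothetical, the file name list.txt is taken from the error message, and the sketch assumes a whitespace-delimited file with a header row.

```r
# Hypothetical helper: read a whitespace-delimited table only if the file exists,
# otherwise report the path that was tried and the current working directory.
read_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Not found: ", path, " (working directory: ", getwd(), ")")
    return(NULL)
  }
  read.table(path, header = TRUE, stringsAsFactors = FALSE)
}

# File name taken from the error message; adjust it to the real location,
# or give a full path instead of a bare file name.
df <- read_if_present("list.txt")
```

If this returns NULL, either pass the full path to the file or change the working directory with setwd() to the folder that contains it.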