Filtering columns based on column headers from a file
3.5 years ago
Bioinfonext ▴ 460

Hi,

I want to select multiple columns from a large count data file, based on their column header names.

Count data file (count.txt):

                                  Soil.15.S8.L001  Soil.16.S9.L001   Soil.2.S1.L001
d2ec9f3b77975c0f457e4b7413b217ff               84               63              106
3147790f0d5a78316fb9dd64f53b9473               69               49               95
97aecc1f35cc1f50db507ad71dd22367                0               15               14
bfad6370d28182cc6304844e9bec7fb6              271               75               30

...

Column name file (column.txt):

Soil.15.S8.L001,Soil.16.S9.L001,Soil.2.S1.L001...

Could you please suggest how I can extract the columns listed in column.txt from the count data file?

Many thanks

bash linux R • 1.2k views

I'm having trouble understanding what you want to do. Can you explain your question in more detail, and possibly provide a small example of the inputs and expected outputs?


Sorry about that. I have now updated the question with more details.


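Try:

df <- iris
vec <- names(iris)[1:2]   # character vector of the wanted column names
df[, vec]

# with dplyr; all_of() makes it explicit that vec holds column names

dplyr::select(df, dplyr::all_of(vec))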
3.5 years ago
Sam ★ 4.7k

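Assuming you are trying to extract the columns by column name, you can do the following with data.table:

library(data.table)
# column.txt is a single comma-separated line, so read it without a header
col <- unlist(fread("column.txt", header = FALSE))
count <- fread("count.txt")
# with = FALSE makes data.table treat col as a vector of column names
extracted <- count[, col, with = FALSE]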

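If you don't have data.table installed, you can do this in base R:

# unlist() turns the one-row data frame into a plain character vector
col <- unlist(read.table("column.txt", sep = ",", stringsAsFactors = FALSE))
# check.names = FALSE keeps headers such as "Soil-27.S33.L001" unchanged
count <- read.table("count.txt", header = TRUE, check.names = FALSE)
extracted <- count[, colnames(count) %in% col, drop = FALSE]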
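For anyone who wants to try the base R route without touching their real data, here is a minimal, self-contained sketch. It writes toy copies of the two files to a temporary directory (tab-separated, with only two wanted columns; both are my assumptions, not the original files) and reads the wanted names with scan() instead of read.table(). Because the header line has one field fewer than the data rows, read.table() uses the first column as row names, so the identifiers are kept.

```r
# Toy copies of the two files from the question, written to a temp directory.
# The tab separator and the two-name column list are assumptions for illustration.
tmp <- tempdir()
count_file <- file.path(tmp, "count.txt")
column_file <- file.path(tmp, "column.txt")

writeLines(c(
  "Soil.15.S8.L001\tSoil.16.S9.L001\tSoil.2.S1.L001",
  "d2ec9f3b77975c0f457e4b7413b217ff\t84\t63\t106",
  "3147790f0d5a78316fb9dd64f53b9473\t69\t49\t95",
  "97aecc1f35cc1f50db507ad71dd22367\t0\t15\t14",
  "bfad6370d28182cc6304844e9bec7fb6\t271\t75\t30"
), count_file)
writeLines("Soil.15.S8.L001,Soil.2.S1.L001", column_file)

# Read the wanted column names as a plain character vector.
wanted <- scan(column_file, what = character(), sep = ",", quiet = TRUE)

# The header has one field fewer than the data rows, so read.table()
# takes the first column (the identifiers) as row names.
count <- read.table(count_file, header = TRUE, sep = "\t", check.names = FALSE)

# Fail early if a requested column is not present in the count table.
missing_cols <- setdiff(wanted, colnames(count))
if (length(missing_cols) > 0) {
  stop("Columns not found: ", paste(missing_cols, collapse = ", "))
}

# Indexing by name returns the columns in the order given in column.txt;
# drop = FALSE keeps a data frame even if only one column is requested.
extracted <- count[, wanted, drop = FALSE]
print(extracted)
```

Indexing with the name vector directly (rather than %in%) returns the columns in the order listed in column.txt, and the setdiff() check reports misspelled names instead of silently skipping them.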

Thanks, the data.table command works well, but the output does not have the identifier as its first column the way the count data file does.

    Soil.9.S42  Soil-27.S33.L001    Soil-45.S54.L001    Soil-6.S22.L001 
1       34        51                      84                    41  
2       28        32                      52                    27  
3       33         0                      7                     12

I am not sure what your identifier column is called, but assuming it is "ID", you can do

col <- c("ID", col)

and then continue as before.
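Putting that together, a minimal sketch with data.table could look like the following. It assumes the count file is tab-separated and that its header really names the identifier column "ID"; the toy file written here is hypothetical and only mirrors the values in the question. If your header has no name for the first column, check names(count) after fread() to see what the column was called and use that name instead of "ID".

```r
library(data.table)

# Hypothetical toy files: the count file here has an explicit "ID" header,
# which is an assumption and not the layout shown in the question.
tmp <- tempdir()
count_file <- file.path(tmp, "count_with_id.txt")
column_file <- file.path(tmp, "column_subset.txt")

writeLines(c(
  "ID\tSoil.15.S8.L001\tSoil.16.S9.L001\tSoil.2.S1.L001",
  "d2ec9f3b77975c0f457e4b7413b217ff\t84\t63\t106",
  "3147790f0d5a78316fb9dd64f53b9473\t69\t49\t95",
  "97aecc1f35cc1f50db507ad71dd22367\t0\t15\t14",
  "bfad6370d28182cc6304844e9bec7fb6\t271\t75\t30"
), count_file)
writeLines("Soil.15.S8.L001,Soil.2.S1.L001", column_file)

# Wanted sample columns, read as a plain character vector.
col <- scan(column_file, what = character(), sep = ",", quiet = TRUE)
count <- fread(count_file)

# Prepend the identifier column so it is kept next to the selected samples.
col <- c("ID", col)
extracted <- count[, col, with = FALSE]
print(extracted)
```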
