Question

issues using Value Matching command in R


4.3 years ago

Adeler001 • 0

Hello, I am trying to use value matching (the %in% operator) in R to find a specific Ensembl exon ID in my exon count table, which was generated by featureCounts from my RNA-seq data. Everything works until I use %in%. Here is the script I used:

tab= read.table("exons_RNA-seq_sorted.csv", header=F, sep="\t")
tab= tab[-c(1), ]
line1 = tab[1,]
line1b = as.matrix(line1)
colnames(tab) = line1b
tab= tab[-c(1), ]


tab = tab[tab$'ENSG00000223972.4'%in%'Geneid',]
printFilter(tab, 'variants in CTNNA1 for D4440')
tab=zzef1

R RNA-Seq • 800 views

4.3 years ago by Adeler001 • 0


Please provide example data: a couple of rows from your file, exons_RNA-seq_sorted.csv.

Why not use read.csv and keep the headers?

Where does 'Geneid' come from, is it a variable, or a string?

Which packages are loaded? Where does printFilter come from?
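To illustrate the point about keeping the headers, here is a minimal sketch of reading the table with the header row intact. It assumes the file is tab-delimited (as the sep="\t" in the question suggests) and may start with a "#" comment line; the temporary file and its contents are a hypothetical stand-in for the real file, not the actual data.

```r
# Hypothetical stand-in for exons_RNA-seq_sorted.csv: a tab-delimited table
# with a leading "#" comment line (an assumption about the file layout).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "# example comment line",
  "Geneid\tChr\tStart\tEnd\tStrand\tLength\tD4001\tD4002",
  "ENSG00000223972.4\tchr1\t11869\t12227\t+\t35\t8\t22"
), tmp)

# header = TRUE keeps the first non-comment line as the column names;
# comment.char = "#" skips lines that start with "#".
tab <- read.delim(tmp, header = TRUE, sep = "\t", comment.char = "#",
                  stringsAsFactors = FALSE)

colnames(tab)    # column names come straight from the header line
str(tab$Geneid)  # Geneid is read as character
```

With the header read this way, there is no need to drop rows and rebuild the column names by hand.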

4.3 years ago by zx8754 11k


Geneid is one of the column headers of my table. I didn't think of using read.csv, and I didn't install any packages. I just tried read.csv and got this error message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names

4.3 years ago by Adeler001 • 0


Here is what my table looks like:

Geneid             Chr   Start  end    Strand  Length  D4001  D4002  D4003  D4004  D4005  D4006
ENSG00000223972.4  chr1  11869  12227  +       35      8      22     33     44     55     66

4.3 years ago by Adeler001 • 0


tab[tab[,1]=="ENSG00000223972.4",]

You have to check that the first column is of character type. I am also posting the link to a similar question on Stack Overflow, where you can get further help.
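As a rough sketch of that check, the toy table below is built by hand rather than read from the real file. The first ID is the one from the question and the second row is made up for illustration. The trimws() step is an extra precaution I am assuming might help if the IDs carry stray spaces; it is not something the original file is known to need.

```r
# Toy table: the first ID comes from the question, the second row is made up.
tab <- data.frame(
  Geneid = factor(c("ENSG00000223972.4", "ENSG00000223972.3")),
  Length = c(35, 40)
)

class(tab[, 1])                      # inspect the type of the first column

# Convert to plain character strings and strip surrounding whitespace.
tab[, 1] <- trimws(as.character(tab[, 1]))

# Subset the rows whose first column equals the ID of interest.
hit <- tab[tab[, 1] == "ENSG00000223972.4", ]
hit
```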

4.3 years ago by Fabio Marroni ★ 3.0k


So it is not a CSV file. I'm not sure what you are trying to do, but if you want to subset rows based on the Geneid value, try this example:

# example data (in your case you would be reading the file.)
df1 <- read.table(text = "
Geneid                                   Chr    Start      end       Strand   Length    D4001   D4002    D4003   D4004    D4005 D4006
ENSG00000223972.1           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.2           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.3           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.4           chr1  11869    12227      +            35            8             22           33         44           55     66
           ", header = TRUE)

df1[ df1$Geneid == "ENSG00000223972.4", ]
#              Geneid  Chr Start   end Strand Length D4001 D4002 D4003 D4004 D4005 D4006
# 4 ENSG00000223972.4 chr1 11869 12227      +     35     8    22    33    44    55    66
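If you want to match several IDs at once, %in% is the natural extension of the == filter. Here is a small sketch; the reduced df1 and the wanted vector are illustrative only. The key point is the order: the column goes on the left of %in% and the values to look for go on the right. The last lines show why the form in the question returns nothing, assuming the table has no column literally named ENSG00000223972.4.

```r
# Reduced version of the example table (illustrative values only).
df1 <- data.frame(
  Geneid = c("ENSG00000223972.1", "ENSG00000223972.2",
             "ENSG00000223972.3", "ENSG00000223972.4"),
  Length = c(35, 35, 35, 35),
  stringsAsFactors = FALSE
)

# Hypothetical set of IDs to keep.
wanted <- c("ENSG00000223972.2", "ENSG00000223972.4")

# Left side: the column to test; right side: the accepted values.
keep <- df1$Geneid %in% wanted
keep
# [1] FALSE  TRUE FALSE  TRUE

matched_rows <- df1[keep, ]
matched_rows

# The form used in the question looks up a column named after the ID,
# which does not exist here, so the test is empty and no rows are returned.
empty <- df1[df1$`ENSG00000223972.4` %in% "Geneid", ]
nrow(empty)
# [1] 0
```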

4.3 years ago by zx8754 11k