Chi-square test on data frame columns in a for loop
6 months ago
ohtang7 ▴ 40

I wrote the following R code to run a chi-square test for each taxon:

# Load the necessary libraries
library(readxl)
library(tidyverse)

# Read the data from the Excel file
data <- read_excel('Chisquare_test.xlsx', sheet = "Sheet1")

# Extract the taxa names
taxa_names <- colnames(data)[-c(1, 20)]

# Create an empty matrix to store p-values
num_taxa <- length(taxa_names)
p_values <- matrix(NA, nrow = num_taxa, ncol = 1)

# Loop through the taxa and conduct a chi-square test for each taxon
for (i in 1:num_taxa) {
  # Create a contingency table for the taxon and the categorical variable
  contingency_table <- table(data[, i + 2], data[, 20])

  # Conduct the chi-square test
  chi_sq_result <- chisq.test(contingency_table)$p.value

  # Store the p-value in the matrix
  p_values[i, 1] <- chi_sq_result
}

# Create row names for the matrix
rownames(p_values) <- taxa_names

# Print the p-values matrix
print(p_values)

In my Excel sheet, the 1st column is the sample ID and the 20th column is a categorical variable. I want to run a chi-square test for each of the 18 taxa (columns 2 to 19) against that variable with a single for loop.

However, I get the following error:

Error in xtfrm.data.frame(x) : 
  (converted from warning) cannot xtfrm data frames

It is raised by the line contingency_table <- table(data[, i + 2], data[, 20]).

I found that both data[, i + 2] and data[, 20] have the same length in the raw file, but R says they're different.

  1. How should I interpret this error (I can't see why it happens), and how should I revise my code?
  2. I found that chisq.test(data$Acanthamoeba, data$Water) works but chisq.test(data[,2], data[,20]) does not. Referring to the columns by name works, but referring to them by position does not. Why is that?
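
A likely explanation for both questions, assuming data is a tibble (which is what read_excel() returns): single-bracket subsetting such as data[, 2] yields a one-column tibble rather than a vector, whereas data$Acanthamoeba and data[[2]] yield the vector itself. table() and chisq.test() expect vectors or factors, so they fail on the tibble. The "(converted from warning)" part of the message also suggests that options(warn = 2) is set in the session. The sketch below illustrates the difference on a small made-up tibble; the object name toy and all values are hypothetical, and only the column names Acanthamoeba and Water come from the post.

```r
library(tibble)

# Hypothetical stand-in for the spreadsheet; all values are made up
toy <- tibble(
  ID           = 1:8,
  Acanthamoeba = c(0, 1, 0, 1, 1, 0, 1, 0),
  Water        = c("tap", "tap", "well", "well", "tap", "well", "tap", "well")
)

# Single brackets keep the tibble class, even when one column is selected
single_bracket <- toy[, 2]
print(class(single_bracket))

# `$` and double brackets return the underlying vector
by_name  <- toy$Acanthamoeba
by_index <- toy[[2]]
print(identical(by_name, by_index))

# table() builds the contingency table once it receives plain vectors
tab <- table(toy[[2]], toy[[3]])
print(tab)
```

Under this assumption, replacing data[, j] with data[[j]] in the loop should be enough to make positional indexing behave like the $ form.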

The raw data are available at the link below:

https://docs.google.com/spreadsheets/d/1EPah3-aoGf7hvqpbS-w_UiguSKEQihpE/edit?usp=sharing&ouid=108716001644013408238&rtpof=true&sd=true

R chi-square • 382 views
ADD COMMENT
0
Entering edit mode

What is the output of:

class(data) #I think you should rename this variable; "data" is a bad name for an object
colnames(data)
colnames(data)[c(2,20)]
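
If class(data) reports a tibble ("tbl_df"), as expected from read_excel(), the loop could be rewritten along the lines below. This is a minimal sketch on simulated data, not the original spreadsheet: the object toy, the taxon names, and all values are made up, and a plain data.frame is used because [[ ]] extracts a vector from a data.frame and a tibble alike. Note also that, with the layout described in the post (taxa in columns 2 to 19, category in column 20), the offset i + 2 would start at column 3 and end at column 20, so the first taxon would be skipped and the category tested against itself. Looping over the column positions directly avoids that offset.

```r
# Simulated data: an ID column, three taxa (presence/absence), one category.
# The real sheet reportedly has 18 taxa in columns 2-19 and the category in column 20.
set.seed(1)
n <- 200
toy <- data.frame(
  ID     = seq_len(n),
  TaxonA = rbinom(n, 1, 0.5),
  TaxonB = rbinom(n, 1, 0.4),
  TaxonC = rbinom(n, 1, 0.6),
  Water  = sample(c("tap", "well"), n, replace = TRUE)
)

group_col <- ncol(toy)                                   # position of the categorical column
taxa_cols <- setdiff(seq_len(ncol(toy)), c(1, group_col)) # everything except ID and category

# Named numeric vector to hold one p-value per taxon
p_values <- setNames(rep(NA_real_, length(taxa_cols)), colnames(toy)[taxa_cols])

for (j in taxa_cols) {
  # [[ ]] extracts plain vectors, which is what table() expects
  tab <- table(toy[[j]], toy[[group_col]])
  p_values[colnames(toy)[j]] <- chisq.test(tab)$p.value
}

print(p_values)
```

With sparse taxa, chisq.test() may warn that the approximation is unreliable; if warnings are converted to errors in the session, that warning would stop the loop as well.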