Modify script in R to change read depth
1
0
Entering edit mode
7.5 years ago
KP • 0

Hi, I want to ask as to how to modify the script in R so that it runs the data at sequencing depth 30M instead of 2.5M. The current data reads at 2.5M. Appreciate it.Thanks.

RNA-Seq • 1.4k views
ADD COMMENT
1
Entering edit mode

"the script". That's fairly unspecific. Please add more information to your question.

ADD REPLY
0
Entering edit mode

Hi, I am very new to R. Here is the script: can I change it to 30M by changing whereevr it says 2.5? Appreciate it.

setwd("C:/Users/KP/Documents/RNA SEQ")
countTable <- read.table("count_matrix_sub_2.5M.txt", header=TRUE, sep="\t", row.names=1)
#countTable2 <- read.table("count_matrix_sub.txt", header=TRUE, sep="\t", row.names=1)

# These substrings of column names indicate the 7 replicates for control sample: 
ControlCistrackID <- c("2012.562","2012.563","2012.564","2012.565","2012.566","2012.568","2012.569")

# These substrings of column names indicate the 7 replicates for sample treated with estrogen:
E2CistrackID <- c("2012.570","2012.571","2012.572","2012.574","2012.575","2012.576","2012.577" )

# These substrings of column names indicate the 3 different depth of reads 
#   for each of the replicates of both untreated and treated:
readDepth <- c("2.5M") # bases sequenced per sample (actually down-sampled from 30M)
numRep <- 7   # 7 repetitions each of untreated, treated

# Produce  dataset for read depth ("2.5M")
# The dataset should include 7 replicates of untreated and 7 replicates oftreated.
makeDataSet<- function ( data, controlIDs, E2IDs, readDepth){
  controlReplicates <- paste( "X", controlIDs, ".subsamp.", sep="")
  controlReplicates_readDepth <- paste(controlReplicates, readDepth, sep="")
  E2Replicates <- paste( "X", E2IDs, ".subsamp.", sep="")
  E2Replicates_readDepth <- paste(E2Replicates, readDepth, sep="")  
  dataSet <- data[, c(controlReplicates_readDepth, E2Replicates_readDepth)]
  return(dataSet)
}


dataset_2.5 <- makeDataSet(countTable, ControlCistrackID, E2CistrackID, "2.5M")
sampleCondition<- c(rep("untreated",numRep), rep("treated",numRep) )
sampleTable<-data.frame(sampleName=colnames(dataset_2.5), condition=sampleCondition)
deseq2dataset <- DESeqDataSetFromMatrix(countData = dataset_2.5,
                                        colData = sampleTable,
                                        design = ~ condition)
#f <- factor(sampleCondition)
ADD REPLY
0
Entering edit mode

I'm not sure what the use of this code is. Do you know what you want to do with it? What is the objective of your analysis?

If I'm not mistaken the 2.5M is only used as a string and is used to slice the dataset, with quite strict expectations about the colnames used in the data object. It looks like you already changed some things, too. I'll guess that the input for this script is a larger datatable with multiple downsampled read counts for the same experiment and you use this code to perform differential expression analysis for each "downsampled dataset". But I don't know why you would want to do that.

In the future, try to be as informative as possible in your first post, tell us what you want to do, what doesn't work, show the code,...

ADD REPLY
0
Entering edit mode
7.5 years ago
novice ★ 1.1k

Easy. Here is how you do it.

You just take the script in R:

runs the data at sequencing depth 2.5M

and change it to:

runs the data at sequencing depth 30M instead

Hope that helps.

ADD COMMENT

Login before adding your answer.

Traffic: 2141 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6