DEXSeq: Normalized Counts Table Has Twice As Many Columns As Samples
8 months ago
Y • 0

I am trying to use DEXSeq, and my supervisor told me I could output normalized counts using the following method:

library("DEXSeq")
# Create the DEXSeqDataSet object
dxd <- DEXSeqDataSetFromHTSeq(
  countsFiles,
  sampleData=sampleTable,
  design= ~ sample + exon + condition:exon,
  flattenedfile=flattenedFile )

#Normalize

normFactors <- matrix(runif(nrow(dxd)*ncol(dxd),0.5,1.5),
                      ncol=ncol(dxd),nrow=nrow(dxd),
                      dimnames=list(1:nrow(dxd),1:ncol(dxd)))

normFactors <- normFactors / exp(rowMeans(log(normFactors)))
normalizationFactors(dxd) <- normFactors

dxd = estimateSizeFactors( dxd )

# Divide each column of the count matrix by its size factor
normalizedCounts <- t( t(counts(dxd)) / sizeFactors(dxd) )

# Write a table
write.table(normalizedCounts, "normalizedDexSeq.txt", sep="\t", row.names=T)

My session info is below if required:

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5

Matrix products: default
BLAS:   /path/to/libRblas.0.dylib 
LAPACK: /path/to/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] C/UTF-8/C/C/C/C

time zone: -
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] GenomicAlignments_1.36.0    Rsamtools_2.16.0           
 [3] Biostrings_2.68.1           XVector_0.40.0             
 [5] DEXSeq_1.46.0               RColorBrewer_1.1-3         
 [7] DESeq2_1.40.2               SummarizedExperiment_1.30.2
 [9] MatrixGenerics_1.12.3       matrixStats_1.0.0          
[11] BiocParallel_1.34.2         GenomicFeatures_1.52.1     
[13] AnnotationDbi_1.62.2        Biobase_2.60.0             
[15] GenomicRanges_1.52.0        GenomeInfoDb_1.36.1        
[17] IRanges_2.34.1              S4Vectors_0.38.1           
[19] BiocGenerics_0.46.0        

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0        IRdisplay_1.1           dplyr_1.1.2            
 [4] blob_1.2.4              filelock_1.0.2          bitops_1.0-7           
 [7] fastmap_1.1.1           RCurl_1.98-1.12         BiocFileCache_2.8.0    
[10] XML_3.99-0.14           digest_0.6.33           lifecycle_1.0.3        
[13] statmod_1.5.0           survival_3.5-7          KEGGREST_1.40.0        
[16] RSQLite_2.3.1           genefilter_1.82.1       magrittr_2.0.3         
[19] compiler_4.3.1          rlang_1.1.1             progress_1.2.2         
[22] tools_4.3.1             utf8_1.2.3              yaml_2.3.7             
[25] rtracklayer_1.60.0      prettyunits_1.1.1       S4Arrays_1.0.5         
[28] bit_4.0.5               curl_5.0.2              DelayedArray_0.26.7    
[31] xml2_1.3.5              repr_1.1.6              abind_1.4-5            
[34] pbdZMQ_0.3-9            hwriter_1.3.2.1         grid_4.3.1             
[37] fansi_1.0.4             xtable_1.8-4            colorspace_2.1-0       
[40] ggplot2_3.4.3           scales_1.2.1            biomaRt_2.56.1         
[43] cli_3.6.1               crayon_1.5.2            generics_0.1.3         
[46] httr_1.4.7              rjson_0.2.21            DBI_1.1.3              
[49] cachem_1.0.8            stringr_1.5.0           splines_4.3.1          
[52] zlibbioc_1.46.0         parallel_4.3.1          restfulr_0.0.15        
[55] base64enc_0.1-3         vctrs_0.6.3             Matrix_1.6-1           
[58] jsonlite_1.8.7          geneplotter_1.78.0      hms_1.1.3              
[61] bit64_4.0.5             locfit_1.5-9.8          annotate_1.78.0        
[64] glue_1.6.2              codetools_0.2-19        gtable_0.3.3           
[67] stringi_1.7.12          BiocIO_1.10.0           munsell_0.5.0          
[70] tibble_3.2.1            pillar_1.9.0            rappdirs_0.3.3         
[73] htmltools_0.5.6         IRkernel_1.3.2          GenomeInfoDbData_1.2.10
[76] R6_2.5.1                dbplyr_2.3.3            evaluate_0.21          
[79] lattice_0.21-8          png_0.1-8               memoise_2.0.1          
[82] Rcpp_1.0.11             uuid_1.1-0              pkgconfig_2.0.3

However, when I open the resulting table in Excel, the number of columns is double the number of samples being processed. I am processing 6 samples: 3 experimental and 3 control. But I have 12 columns, not including the Ensembl ID column. What does each of these 12 columns stand for, given that I only have 6 samples?
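
One possible explanation, which I have not confirmed on my own data: as far as I understand the DEXSeq vignette, a DEXSeqDataSet stores two columns per sample, one with the reads on the exonic region itself ("this") and one with the summed reads on the other exons of the same gene ("others"). That would give 12 columns for 6 samples. The base-R sketch below only mimics that layout with made-up numbers; the names `colInfo`, `mockCounts`, and `exonCounts` are hypothetical and the values are random.

```r
# Mock of the column layout I suspect counts(dxd) has. Nothing here uses
# DEXSeq itself; sample names and counts are invented for illustration.
set.seed(1)
nSamples <- 6

# One row per column of the count matrix: each sample appears twice,
# once for the exon itself ("this") and once for the rest of the gene ("others").
colInfo <- data.frame(
  sample = rep(paste0("S", seq_len(nSamples)), times = 2),
  exon   = rep(c("this", "others"), each = nSamples)
)

# Four made-up exonic regions, 12 columns in total.
mockCounts <- matrix(
  rpois(4 * nrow(colInfo), lambda = 50),
  nrow = 4,
  dimnames = list(paste0("gene:E00", 1:4), NULL)
)
ncol(mockCounts)

# Keep only the per-exon ("this") columns, one per sample.
thisCols <- which(colInfo$exon == "this")
exonCounts <- mockCounts[, thisCols, drop = FALSE]
colnames(exonCounts) <- colInfo$sample[thisCols]
ncol(exonCounts)
```

If that is what is happening, then on the real object I would expect `table(colData(dxd)$exon)` to show 6 "this" and 6 "others" columns, and `featureCounts(dxd, normalized = TRUE)` to return only the 6 per-sample columns. I have not verified either against my data.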

R Jupyter DEXseq • 298 views