Error with dba.count
2.2 years ago

Hello,

I am trying to analyze ENCODE ChIP-seq data with DiffBind to find differential peaks. However, I got the following error in the dba.count step:

Error in SummarizedExperiment(assays = SimpleList(counts = countData), :
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the SummarizedExperiment object (or derivative) to construct

Can anyone help me with this? Thank you very much!

ChIP-seq diffbind ENCODE • 2.4k views

Could you provide some more information?

What versions are you using (output of sessionInfo())?

What are the steps in your script, from creating a new DBA object up to and including calling dba.count()? Are you doing any type of normalization using spike-ins or parallel factors? Are you specifically requesting the data as a SummarizedExperiment object using DBA_DATA_SUMMARIZED_EXPERIMENT?


Thank you very much for your reply! Please find my script below. I am new to ChIP-seq analysis and the DiffBind package. I haven't done any normalization yet; I am just following the DiffBind manual, where the normalization step comes after counting reads.

samples <- read.csv("ESC_CH12_H3K27me3_mm10.csv")

ESC_CH12_H3K27me3.peak <- dba(sampleSheet=samples,bSummarizedExperiment = F)

ES_Bruce ES_Bruce Diff NT Full-Media 1 bed
ES_Bruce ES_Bruce Diff NT Full-Media 2 bed
CH12 CH12 Diff Diff Full-Media 1 bed
CH12 CH12 Diff Diff Full-Media 2 bed

ESC_CH12_H3K27me3.count <- dba.count(ESC_CH12_H3K27me3.peak)

Computing summits...
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_1.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_2.bam125
Sample: H3K27me3_mm10/bam/CH12_Diff_1.bam125
Sample: H3K27me3_mm10/bam/CH12_Diff_2.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_input_1.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_input_2.bam125
Sample: H3K27me3_mm10/bam/CH12_input_1.bam125
Sample: H3K27me3_mm10/bam/CH12_input_2.bam125
Re-centering peaks...
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_Diff_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_Diff_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_input_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_input_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_input_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_input_2.bam125
Reads will be counted as Single-end.
Error in SummarizedExperiment(assays = SimpleList(counts = countData), :
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the SummarizedExperiment object (or derivative) to construct
In addition: Warning messages:
1: In serialize(data, node$con) : 'package:stats' may not be available when loading
2: In serialize(data, node$con) : 'package:stats' may not be available when loading
3: In serialize(data, node$con) : 'package:stats' may not be available when loading
4: In serialize(data, node$con) : 'package:stats' may not be available when loading
5: In serialize(data, node$con) : 'package:stats' may not be available when loading
6: In serialize(data, node$con) : 'package:stats' may not be available when loading
7: In serialize(data, node$con) : 'package:stats' may not be available when loading
8: In serialize(data, node$con) : 'package:stats' may not be available when loading

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] QuantitativeChIPseqWorkshop_0.1.2 csaw_1.28.0 BiocManager_1.30.16
[4] GreyListChIP_1.26.0 DiffBind_3.4.3 SummarizedExperiment_1.24.0
[7] Biobase_2.54.0 MatrixGenerics_1.6.0 matrixStats_0.61.0
[10] GenomicRanges_1.46.1 GenomeInfoDb_1.30.0 IRanges_2.28.0
[13] S4Vectors_0.32.3 BiocGenerics_0.40.0

loaded via a namespace (and not attached):
[1] amap_0.8-18 colorspace_2.0-2 rjson_0.2.21 hwriter_1.3.2
[5] ellipsis_0.3.2 XVector_0.34.0 rstudioapi_0.13 ggrepel_0.9.1
[9] bit64_4.0.5 AnnotationDbi_1.56.2 fansi_0.5.0 mvtnorm_1.1-3
[13] apeglm_1.16.0 splines_4.1.2 cachem_1.0.6 geneplotter_1.72.0
[17] Rsamtools_2.10.0 annotate_1.72.0 ashr_2.2-47 png_0.1-7
[21] compiler_4.1.2 httr_1.4.2 assertthat_0.2.1 Matrix_1.3-4
[25] fastmap_1.1.0 limma_3.50.0 htmltools_0.5.2 tools_4.1.2
[29] coda_0.19-4 gtable_0.3.0 glue_1.6.0 GenomeInfoDbData_1.2.7
[33] systemPipeR_2.0.5 dplyr_1.0.7 ShortRead_1.52.0 Rcpp_1.0.8
[37] bbmle_1.0.24 vctrs_0.3.8 Biostrings_2.62.0 rtracklayer_1.54.0
[41] stringr_1.4.0 lifecycle_1.0.1 irlba_2.3.5 restfulr_0.0.13
[45] gtools_3.9.2 XML_3.99-0.8 edgeR_3.36.0 zlibbioc_1.40.0
[49] MASS_7.3-54 scales_1.1.1 BSgenome_1.62.0 parallel_4.1.2
[53] RColorBrewer_1.1-2 yaml_2.2.1 memoise_2.0.1 ggplot2_3.3.5
[57] emdbook_1.3.12 bdsmatrix_1.3-4 latticeExtra_0.6-29 stringi_1.7.6
[61] RSQLite_2.2.9 SQUAREM_2021.1 genefilter_1.76.0 BiocIO_1.4.0
[65] caTools_1.18.2 BiocParallel_1.28.3 truncnorm_1.0-8 rlang_0.4.12
[69] pkgconfig_2.0.3 bitops_1.0-7 lattice_0.20-45 invgamma_1.1
[73] purrr_0.3.4 GenomicAlignments_1.30.0 htmlwidgets_1.5.4 bit_4.0.4
[77] tidyselect_1.1.1 plyr_1.8.6 magrittr_2.0.1 DESeq2_1.34.0
[81] R6_2.5.1 snow_0.4-4 gplots_3.1.1 generics_0.1.1
[85] metapod_1.2.0 DelayedArray_0.20.0 DBI_1.1.2 pillar_1.6.5
[89] survival_3.2-13 KEGGREST_1.34.0 RCurl_1.98-1.5 mixsqp_0.3-43
[93] tibble_3.1.6 crayon_1.4.2 KernSmooth_2.23-20 utf8_1.2.2
[97] jpeg_0.1-9 locfit_1.5-9.4 grid_4.1.2 blob_1.2.2
[101] digest_0.6.29 xtable_1.8-4 numDeriv_2016.8-1.1 munsell_0.5.0

I also tried setting bSummarizedExperiment = TRUE in the dba step, and got the following error:

ESC_CH12_H3K27me3 <- dba(sampleSheet=samples,bSummarizedExperiment = TRUE)
ES_Bruce ES_Bruce Diff NT Full-Media 1 bed
ES_Bruce ES_Bruce Diff NT Full-Media 2 bed
CH12 CH12 Diff Diff Full-Media 1 bed
CH12 CH12 Diff Diff Full-Media 2 bed
Error in SummarizedExperiment(assays = assays, rowRanges = peaks, colData = meta) :
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the RangedSummarizedExperiment object (or derivative) to construct

Thank you very much for your help!


Just so we're on the same page, could you run BiocManager::install() (with no arguments) to update to the current package versions?

Assuming the problem persists after the update, could you send me a link to your ESC_CH12_H3K27me3.peak object? I can look at it to see if there is anything unusual. Otherwise I may need access to your BAM files to get to the bottom of this.


Thank you very much for your reply and suggestions! I ran the update and then re-ran the analysis, but it is still not working. Here is a link to the "ESC_CH12_H3K27me3.peak" object: https://drive.google.com/file/d/1Fc8nhLqKlCRrDqGxWAmZIz1IVE2S7QHG/view?usp=sharing

The ENCODE BAM file links are below:
ES-Bruce4
https://www.encodeproject.org/files/ENCFF481UTW/@@download/ENCFF481UTW.bam
https://www.encodeproject.org/files/ENCFF513YXS/@@download/ENCFF513YXS.bam
CH12
https://www.encodeproject.org/files/ENCFF672MBQ/@@download/ENCFF672MBQ.bam
https://www.encodeproject.org/files/ENCFF758OWH/@@download/ENCFF758OWH.bam

At first I thought the BED file format might be incorrect, so I tried converting the original ENCODE BED file to BED6, and to a five-column BED file like the one in the DiffBind manual; neither worked. I am not familiar with the BAM format. I found posts about other issues (not with DiffBind) where the problem was caused by a BAM file having no header, but I checked that both the ENCODE BAM files and the BAM files in the DiffBind vignette have headers, so I don't know where the problem is.

I really appreciate you taking time to help me! Thank you very much!

The updated sessionInfo() output follows:

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

Random number generation:
 RNG: Mersenne-Twister
 Normal: Inversion
 Sample: Rounding

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] DiffBind_3.4.7 SummarizedExperiment_1.24.0 Biobase_2.54.0
[4] MatrixGenerics_1.6.0 matrixStats_0.61.0 GenomicRanges_1.46.1
[7] GenomeInfoDb_1.30.1 IRanges_2.28.0 S4Vectors_0.32.3
[10] BiocGenerics_0.40.0

loaded via a namespace (and not attached):
[1] bitops_1.0-7 RColorBrewer_1.1-2 numDeriv_2016.8-1.1 tools_4.1.2
[5] utf8_1.2.2 R6_2.5.1 irlba_2.3.5 KernSmooth_2.23-20
[9] DBI_1.1.2 colorspace_2.0-2 apeglm_1.16.0 tidyselect_1.1.1
[13] compiler_4.1.2 cli_3.1.1 DelayedArray_0.20.0 rtracklayer_1.54.0
[17] caTools_1.18.2 scales_1.1.1 SQUAREM_2021.1 mvtnorm_1.1-3
[21] mixsqp_0.3-43 stringr_1.4.0 digest_0.6.29 Rsamtools_2.10.0
[25] XVector_0.34.0 jpeg_0.1-9 pkgconfig_2.0.3 htmltools_0.5.2
[29] BSgenome_1.62.0 fastmap_1.1.0 invgamma_1.1 bbmle_1.0.24
[33] limma_3.50.0 htmlwidgets_1.5.4 rlang_1.0.1 rstudioapi_0.13
[37] BiocIO_1.4.0 generics_0.1.2 hwriter_1.3.2 BiocParallel_1.28.3
[41] gtools_3.9.2 dplyr_1.0.7 RCurl_1.98-1.5 magrittr_2.0.2
[45] GenomeInfoDbData_1.2.7 Matrix_1.3-4 Rcpp_1.0.8 munsell_0.5.0
[49] fansi_1.0.2 lifecycle_1.0.1 stringi_1.7.6 yaml_2.2.2
[53] MASS_7.3-54 zlibbioc_1.40.0 gplots_3.1.1 plyr_1.8.6
[57] grid_4.1.2 parallel_4.1.2 ggrepel_0.9.1 bdsmatrix_1.3-4
[61] crayon_1.4.2 lattice_0.20-45 Biostrings_2.62.0 locfit_1.5-9.4
[65] pillar_1.7.0 rjson_0.2.21 systemPipeR_2.0.5 XML_3.99-0.8
[69] glue_1.6.1 ShortRead_1.52.0 GreyListChIP_1.26.0 latticeExtra_0.6-29
[73] BiocManager_1.30.16 png_0.1-7 vctrs_0.3.8 gtable_0.3.0
[77] purrr_0.3.4 amap_0.8-18 assertthat_0.2.1 ashr_2.2-47
[81] ggplot2_3.3.5 emdbook_1.3.12 restfulr_0.0.13 coda_0.19-4
[85] truncnorm_1.0-8 tibble_3.1.6 GenomicAlignments_1.30.0 ellipsis_0.3.2

2.2 years ago
Rory Stark ★ 2.0k

It looks like you are using the same SampleID for multiple samples. Two samples share the same ID [ES_Bruce], and the other two share the same ID [CH12]. There is a requirement that each sample have a unique ID.
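One way to confirm this before calling dba() is to tabulate the SampleID column of the sample sheet. The following is a minimal base-R sketch; the toy data frame is an assumption reconstructed from the rows that dba() printed above, not the actual CSV, and only the SampleID and Replicate columns are shown.

```r
# Toy sample sheet. The values are reconstructed from the dba() output
# in this thread (an assumption); the real sheet has more columns.
samples <- data.frame(
  SampleID  = c("ES_Bruce", "ES_Bruce", "CH12", "CH12"),
  Replicate = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Count how often each SampleID occurs; any count above 1 is a duplicate.
id_counts <- table(samples$SampleID)
dup_ids <- names(id_counts)[id_counts > 1]

# Show the offending IDs and the rows that carry them.
print(dup_ids)
print(samples[samples$SampleID %in% dup_ids, ])
```

With the real sheet, replace the data.frame() call with the read.csv() call from the script above; an empty dup_ids means every SampleID is already unique.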

Try making each SampleID unique in your samplesheet. I'm adding an internal check to throw an error if an attempt is made to use duplicate IDs.
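As a rough sketch of that fix, the IDs can be made unique by appending the replicate number, followed by a small guard that stops on duplicates. The toy data frame is an assumption reconstructed from the dba() output in this thread, and check_unique_ids() is a hypothetical helper written for illustration; it is not DiffBind's internal check.

```r
# Toy sample sheet with the duplicated IDs seen in this thread
# (reconstructed from the printed dba() rows; an assumption).
samples <- data.frame(
  SampleID  = c("ES_Bruce", "ES_Bruce", "CH12", "CH12"),
  Replicate = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Make each ID unique by appending its replicate number.
samples$SampleID <- paste(samples$SampleID, samples$Replicate, sep = "_")

# Hypothetical guard: stop with a clear message if any ID repeats.
check_unique_ids <- function(ids) {
  dups <- unique(ids[duplicated(ids)])
  if (length(dups) > 0) {
    stop("Duplicate SampleID values: ", paste(dups, collapse = ", "))
  }
  invisible(TRUE)
}

check_unique_ids(samples$SampleID)
print(samples$SampleID)
```

After this step the corrected data frame can be passed to dba(sampleSheet = samples) as in the original script. Appending the replicate number is only one naming choice; any scheme works as long as no two rows share an ID.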


It works after making each SampleID unique. Thank you very much for your help!


This fix also worked for me!
