Question: [diffbind] error when running dba.count
Kathy0 wrote:

I am running DiffBind and have built the dba object. When I run dba.count(), a few samples fail to process with the following error messages:

[W::bam_hdr_read] bgzf_check_EOF: No error

[W::bam_hdr_read] bgzf_check_EOF: No error

[E::bgzf_read] Read block operation failed with error -1 after zd of zu bytes

Does anyone have any advice? Thank you in advance!

chip-seq R • 152 views
modified 23 days ago • written 29 days ago by Kathy0
jared.andrews074.1k wrote:

This generally indicates that there's an issue with your BAM files, usually that they are truncated and missing a special block at the end of the file that marks, well, the end of the file.

You can check your BAM files with Picard ValidateSamFile, which will help you pinpoint whether this is the issue. I don't actually think this is a DiffBind issue.
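The truncation described above can also be checked directly by looking for the end-of-file block. The following is a minimal sketch, assuming Python 3 is available; `has_bgzf_eof` and the file names in the usage comment are hypothetical and not part of DiffBind, Picard, or htslib. It only tests whether the last 28 bytes of a file match the empty BGZF block that the SAM/BAM specification defines as the EOF marker, so it does not validate anything else about the file.

```python
from pathlib import Path

# The 28-byte empty BGZF block that the SAM/BAM specification defines as the
# end-of-file marker. A BAM file that was cut short will normally lack it.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 "
    "02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    path = Path(path)
    size = path.stat().st_size
    if size < len(BGZF_EOF):
        # Too small to contain the marker at all.
        return False
    with path.open("rb") as handle:
        handle.seek(size - len(BGZF_EOF))
        return handle.read() == BGZF_EOF


# Example usage (hypothetical file names):
# for name in ["sample1.bam", "sample2.bam"]:
#     print(name, has_bgzf_eof(name))
```

A file that fails this check is very likely truncated. A file that passes can still have other problems, so this complements a full validation rather than replacing it.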

written 29 days ago by jared.andrews074.1k

That's what I thought as well, so I re-ran the parallel alignments for all the samples. Still, 4 of 14 samples failed in DiffBind dba.count() with the same error.

I also ran Picard as you suggested. All 14 samples reported similar errors, as shown below (the numbers vary):

Error Type                           Count
ERROR:MATE_NOT_FOUND                 202699
ERROR:MISSING_READ_GROUP             1
WARNING:RECORD_MISSING_READ_GROUP    9050819

However, 10 samples ran fine with dba.count(); only 4 failed.

modified 22 days ago • written 28 days ago by Kathy0

How are you aligning? Is everything single-end?

written 28 days ago by jared.andrews074.1k

They are paired-end; I used bowtie2.

I've processed many other experiments using the same workflow for DiffBind. This is the first time I have gotten this error.

modified 28 days ago • written 28 days ago by Kathy0
Kathy0 wrote:

I realized that those bgzf read errors come from bgzf.c (in htslib). Is there any way to avoid running bgzf.c or htslib inside dba.count()?

Those 4 failing BAM files worked fine with other programs, such as peak calling. I wonder whether the errors are not due to the BAM files themselves.
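One way to test whether the files themselves are readable from start to finish, independently of DiffBind, is a full decompression pass. This is a minimal sketch, assuming Python 3 is available; `bam_stream_is_readable` is a hypothetical helper name, not a function from any tool mentioned in this thread. It relies on the fact that a BAM file is a series of concatenated gzip members (BGZF blocks), which Python's standard gzip module can read through.

```python
import gzip
import zlib


def bam_stream_is_readable(path, chunk_size=1 << 20):
    """Decompress the whole file and report whether every block is intact.

    Returns False if the compressed stream ends early, is corrupt, or is not
    gzip/BGZF data at all. It does not interpret the BAM records themselves.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks until the end of the file.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended before a block was complete (truncation).
        # OSError: includes gzip.BadGzipFile (bad header or checksum).
        # zlib.error: corrupt compressed data inside a block.
        return False
    return True
```

If this returns True for the 4 failing files, the compressed data is at least complete block by block, which would point away from simple truncation. A file that is cut exactly at a block boundary would still pass, so the result is a hint rather than proof.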

modified 22 days ago • written 23 days ago by Kathy0

This might be a question better asked on the Bioconductor support site. The devs have to answer you there (or the Bioc maintainers will get on their case), and they will likely have a better idea of why this is occurring than any of us.

written 23 days ago by jared.andrews074.1k

Thank you. I posted on the Bioconductor support site and the problem was resolved. Here is the link: https://support.bioconductor.org/p/126303/#126472

modified 17 days ago • written 17 days ago by Kathy0