Hey!
I am trying to run velocyto on my BAM file, but I was getting errors that implied the BAM file is truncated or corrupted. To rule out velocyto as the cause, I ran samtools index on the file, which produced a very similar error (below), so the problem is most likely the file itself.
samtools index WT_UT_possorted_genome_bam.bam
[E::bgzf_uncompress] Inflate operation failed: progress temporarily not possible, or in() / out() returned an error
[E::bgzf_read] Read block operation failed with error 1 after 0 of 4 bytes
samtools index: failed to create index for "WT_UT_possorted_genome_bam.bam"
I've tried some suggested approaches for checking the state of my BAM including samtools quickcheck -v (seems fine) and manually checking for the correct BAM footer (matches what it should look like):
tail -c 28 WT_UT_possorted_genome_bam.bam | hexdump -C
00000000 1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 |............BC..|
00000010 1b 00 03 00 00 00 00 00 00 00 00 00 |............|
0000001c
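For anyone who wants to script the footer check above, here is a minimal Python sketch. The helper name `has_bgzf_eof` is made up for illustration; the 28 bytes are the same ones shown in the hexdump. Keep in mind that this check, like quickcheck, only looks at the end of the file, so passing it does not rule out damage somewhere in the middle.

```python
import os

# The 28-byte empty BGZF block that marks the end of a BAM file
# (same bytes as in the hexdump above).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker.

    Hypothetical helper: it only inspects the last 28 bytes and says
    nothing about the blocks that come before them.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF
```

Calling `has_bgzf_eof("WT_UT_possorted_genome_bam.bam")` would be expected to agree with the "good EOF block" line from quickcheck.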
Is the file definitely corrupted? If not, is there any way to check further or fix the file?
Output from the checks suggested by other posters:
samtools quickcheck -qvvv WT_UT_possorted_genome_bam.bam
verbosity set to 3
checking WT_UT_possorted_genome_bam.bam
opened WT_UT_possorted_genome_bam.bam
WT_UT_possorted_genome_bam.bam is sequence data
WT_UT_possorted_genome_bam.bam has 66 targets in header.
WT_UT_possorted_genome_bam.bam has good EOF block.
Just displaying command and error content (several lines of BAM output follow this error):
samtools view WT_UT_possorted_genome_bam.bam | tail
[E::bgzf_uncompress] Inflate operation failed: progress temporarily not possible, or in() / out() returned an error
[E::bgzf_read] Read block operation failed with error 1 after 0 of 4 bytes
[main_samview] truncated file.
I got similar errors when running samtools depth, and quickcheck didn't show any issues either. Maybe my BAM file is too large (139G)?
It means the file is corrupted; this is unrelated to size.
Yeah, I think it really is... I tried one more time and samtools stopped at the same place.
I may need to redo the alignment...
Run samtools quickcheck -qvvv your.bam.
Added to post, looks good.
Hmm, based on what you posted, the file looks OK going by https://github.com/samtools/samtools/issues/785 . You could do
samtools view testWT_UT_possorted_genome_bam.bam | tail
which should then give you the last line of the non-corrupted part of the file. Then inspect the (n+1)th line and see what is going on.
Added to post; view is also having issues, but it's not clear why. Thanks for the input btw!
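Another way to see where the readable part of the file ends is to walk the BGZF blocks directly and test each one. Below is a rough Python sketch, assuming the standard BGZF layout from the SAM/BAM specification (gzip members with a `BC` extra subfield holding the block size minus one). The function name `first_bad_bgzf_block` is hypothetical and not part of samtools. It reads one block at a time, so memory use stays small, but pure Python will be slow on a file of this size.

```python
import struct
import zlib


def first_bad_bgzf_block(path):
    """Walk BGZF blocks in order.

    Returns (byte_offset, reason) for the first block that fails to
    parse, inflate, or verify, or None if every block checks out.
    """
    with open(path, "rb") as fh:
        offset = 0
        while True:
            header = fh.read(12)
            if not header:
                return None  # reached end of file without problems
            if len(header) < 12 or header[:4] != b"\x1f\x8b\x08\x04":
                return offset, "bad or truncated gzip header"
            xlen = struct.unpack("<H", header[10:12])[0]
            extra = fh.read(xlen)
            if len(extra) < xlen:
                return offset, "truncated extra field"
            # Look for the 'BC' subfield, which stores total block size - 1.
            bsize = None
            pos = 0
            while pos + 4 <= len(extra):
                si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
                if (si1, si2, slen) == (66, 67, 2) and pos + 6 <= len(extra):
                    bsize = struct.unpack("<H", extra[pos + 4:pos + 6])[0]
                    break
                pos += 4 + slen
            if bsize is None:
                return offset, "no BC subfield"
            remaining = bsize + 1 - 12 - xlen  # deflate data + CRC32 + ISIZE
            if remaining < 8:
                return offset, "invalid block size"
            rest = fh.read(remaining)
            if len(rest) < remaining:
                return offset, "truncated block"
            crc, isize = struct.unpack("<II", rest[-8:])
            try:
                data = zlib.decompress(rest[:-8], -15)  # raw deflate stream
            except zlib.error as exc:
                return offset, f"inflate failed: {exc}"
            if len(data) != isize or zlib.crc32(data) != crc:
                return offset, "CRC or length mismatch"
            offset += bsize + 1
```

If this returns an offset well before the end of the file, that would be consistent with damage in the middle of the BAM that quickcheck cannot see, since quickcheck only inspects the start and the EOF block.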
Strange. I suggest you recreate that file. That might be a simpler workaround than digging out the cause of the error. I guess that if one of the samtools maintainers in https://github.com/samtools/samtools/issues/785 did not have a good response, we will not find one either.
I normally would, but I hoped not to in this case, as this BAM is linked to an output that has already been through a lot of processing. I'll have a think about a way around it. Thanks again for the input!
Would you happen to have an update on this? I'm running into the same problem and the BAM files themselves seem fine.
If these errors come up how can the file be fine?
They've all been downloaded from TCGA with a lot of post-processing that worked on them. I'm trying to use diffbind for some count matrices, so I'm super confused about why they're showing up as truncated now :/
Probably something went wrong during the "lot of post-processing". Look, if you need help then please open a new question and include all relevant code. Anecdotal descriptions are rarely fruitful.