Files recognized as gzip but not compressed
1
0
5 weeks ago
beaferbl ▴ 10

Hi everyone! I'm new to Linux and am working with Ubuntu 20.04 in a virtual machine (VirtualBox 6.1). I have a shared folder on my Windows desktop so that I can work with it from the virtual machine. I have a problem with BAM files: they are recognized as gzip, even though they do not have a .zip extension or similar. When I try to extract them, I get the message "an error occurred while extracting files". If I add the .gz extension, I can extract them, but the extracted files are larger and give errors in other applications such as IGV. In Windows they work fine, as they do on a computer with Linux as its operating system. So I think it is a problem with the virtual machine. Does anyone know how I can solve this? Thank you.

Linux gzip BAM Ubuntu • 170 views
1

BAM files are compressed versions of the SAM format, but the compression is not plain gzip, and I cannot think of any reason why one would ever decompress them directly. If you do need the contents, use specialized software, most commonly samtools. What do you want to do with them?

1

Instead of (g)unzipping, try converting them to SAM format, which can be read with any text editor. But do not use Notepad or WordPad on Windows to open SAM files; use Notepad++. Is there any reason why you want to (g)unzip the BAM? If you want to subset/extract a region from the BAM, there are several tools that can do that.

0

Thank you all for your answers! I don't really want to decompress the files; I only tried because I was having problems running rMATS. It was strange to me to see BAM files recognized as gzip, so I thought there was a problem with the BAM files. Loading them in IGV was just a way to check whether the files were fine. Now IGV works (I don't know what the problem was before). Now that I know that BAM files are gzip-compressed, the problem with rMATS must have another explanation.

2
5 weeks ago

BAM files are (b-)gzipped and have the extension ".bam" (".cram" is the extension of CRAM, a separate format with its own compression).

There is no reason in the world you would have to gunzip them.

The BGZF format (compatible with gzip) keeps the file compact while still allowing random access.
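To see why a file manager reports a BAM as gzip, here is a minimal Python sketch that checks whether a header is a gzip member carrying the BGZF "BC" extra subfield. The function name `looks_like_bgzf` is made up for this illustration; the byte layout follows the gzip header and the BGZF description in the SAM/BAM specification, and `BGZF_EOF` is the empty block that the specification uses as its end-of-file marker.

```python
import gzip
import struct

# Empty BGZF block used as the end-of-file marker (28 bytes).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600"   # gzip magic, CM=8, FLG=4, MTIME, XFL, OS, XLEN=6
    "42430200" "1b00"                     # extra subfield 'B','C', SLEN=2, BSIZE
    "0300" "00000000" "00000000"          # empty deflate stream, CRC32, ISIZE
)

def looks_like_bgzf(header: bytes) -> bool:
    """Hypothetical helper: True if `header` starts a gzip member with a BGZF 'BC' subfield."""
    if len(header) < 12:
        return False
    # gzip magic bytes, deflate compression, and the FEXTRA flag (bit 2) must be present.
    if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8 or not header[3] & 4:
        return False
    xlen = struct.unpack_from("<H", header, 10)[0]
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    while pos + 4 <= len(extra):
        slen = struct.unpack_from("<H", extra, pos + 2)[0]
        if extra[pos] == ord("B") and extra[pos + 1] == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False

if __name__ == "__main__":
    print(looks_like_bgzf(BGZF_EOF))             # BGZF block
    print(looks_like_bgzf(gzip.compress(b"x")))  # plain gzip, no extra field
```

On a real file you would pass the first few dozen bytes, for example `looks_like_bgzf(open("sample.bam", "rb").read(64))`, where `sample.bam` is a placeholder name. Because every BGZF block is also a valid gzip member, generic tools identify the file as gzip, and gunzipping it yields the raw binary BAM records rather than anything IGV can open.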

IGV wants a bgzipped BAM accompanied by an index (.bai).
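If IGV refuses a BAM, it may be worth checking that the index sits next to it in the shared folder. The sketch below is only an illustration: `find_bam_index` is a hypothetical helper, and it assumes the two common naming conventions (`sample.bam.bai` and `sample.bai`).

```python
from pathlib import Path
from typing import Optional

def find_bam_index(bam_path: str) -> Optional[Path]:
    """Hypothetical helper: return the .bai file next to `bam_path`, or None if absent."""
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # sample.bam.bai (assumed convention)
        bam.with_suffix(".bai"),           # sample.bai (assumed convention)
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None

if __name__ == "__main__":
    # "sample.bam" is a placeholder path, not a file from this thread.
    index = find_bam_index("sample.bam")
    print(index if index else "no index found; `samtools index sample.bam` would create one")
```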