Hi all! I'm trying to download FASTQ files from different experiments stored in the ENA database in an automated way, using only the SRR codes of the samples.
I'm using the ENA File Downloader command-line tool and I can download almost all the FASTQ files I want automatically. However, when I try to uncompress them with gunzip I get the following error:
gunzip: /direction_to_file/file.fastq.gz: invalid compressed data--format violated
I downloaded the same FASTQ file manually from the web page and I can uncompress that copy without any problem. I compared the sizes of the two files, and they are almost, but not exactly, the same:
- manually downloaded: 550306224 bytes
- automatically downloaded: 550314301 bytes
Do you know what could have happened here? I tried different samples and got the same error.
I am using Linux, and the command line I used to download the sample automatically was the following:
java -jar path_to_ena_file_downloader/ena-file-downloader.jar --accessions=SRR6435746 --format=READS_FASTQ --location=path_to_Fastq_downloaded_folder --protocol=FTP --email=NONE
Thank you very much in advance
Which accession number are we looking at here? The download may either be corrupt at the source or have been corrupted locally (this can happen if you are behind a proxy or firewall).
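One way to tell the two cases apart is to checksum the local file and compare it against a reference, for example the manually downloaded copy that does uncompress. Below is a minimal sketch in Python using only the standard library. The function names and the paths in the comments are hypothetical. I believe ENA also publishes an MD5 for each FASTQ file in the run's file report, but treat that as an assumption to verify.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(path: str, reference_md5: str) -> bool:
    """True if the file's MD5 equals the reference digest (case and whitespace ignored)."""
    return md5_of(path) == reference_md5.strip().lower()


# Hypothetical usage: compare the automatic download against the manual one.
# reference = md5_of("manual/file.fastq.gz")
# print(matches_reference("automatic/file.fastq.gz", reference))
```

If the digests differ while the manual copy is fine, the file at the source is probably intact and the corruption happened somewhere along the automated download path.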
I tried it with accession number SRR6435746, but I also tried other accession numbers and had the same problem. If the download is being corrupted locally, what can I do to avoid it and solve the problem?
You can retry the download at least once. If that does not work, try downloading from SRA instead.
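The retry can be automated so that a file is only accepted once it decompresses cleanly. The following is a rough sketch, not a tested recipe for the ENA tool: `gzip_is_intact` and `download_until_intact` are hypothetical helper names, and the download step is passed in as a callable so that any downloader can be plugged in.

```python
import gzip
import time
import zlib
from typing import Callable


def gzip_is_intact(path: str, chunk_size: int = 1 << 20) -> bool:
    """Read the whole gzip stream; return False if it is missing, truncated or corrupt."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def download_until_intact(download: Callable[[], None], path: str,
                          attempts: int = 3, wait_seconds: float = 5.0) -> bool:
    """Call download() up to `attempts` times until the file at `path` passes the gzip check."""
    for attempt in range(1, attempts + 1):
        download()
        if gzip_is_intact(path):
            return True
        if attempt < attempts:
            time.sleep(wait_seconds)  # brief pause before the next try
    return False


# Hypothetical usage, wrapping the command from the question:
# import subprocess
# def fetch():
#     subprocess.run(["java", "-jar", "path_to_ena_file_downloader/ena-file-downloader.jar",
#                     "--accessions=SRR6435746", "--format=READS_FASTQ",
#                     "--location=path_to_Fastq_downloaded_folder",
#                     "--protocol=FTP", "--email=NONE"], check=True)
# ok = download_until_intact(fetch, "/direction_to_file/file.fastq.gz")
```

If every attempt fails the check, that is the point at which switching to another source such as SRA makes sense.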