Question: Chip-seq - fastqc: zip file doesn't open
salamandra wrote, 2.6 years ago:

I am running Ubuntu 16.04 LTS and have installed FastQC from http://www.bioinformatics.babraham.ac.uk/projects/download.html.

When I run FastQC on a .fastq file as part of a pipeline (fastqc 'path_to_file/file.fastq'), a zip archive with the results is produced in the same directory as the fastq file, but opening it throws the error 'An error occurred while loading the archive'. When I try to open the file interactively (by typing fastqc on the command line and then clicking File -> Open), loading reaches 95% and then stops.

Could someone please explain what I am doing wrong? Is it related to the installation?

Note: the Java version is:

openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.04.2-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
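
Since the question is whether the installation is at fault, it may help to record which fastqc and java executables the shell actually resolves. This is a minimal sketch; describe_tool is a hypothetical helper name, not part of FastQC.

```shell
# describe_tool is a hypothetical helper: it prints where a command
# resolves on PATH, or says that it was not found.
describe_tool() {
    tool_name="$1"
    if tool_path="$(command -v "$tool_name")"; then
        printf '%s: %s\n' "$tool_name" "$tool_path"
    else
        printf '%s: not found on PATH\n' "$tool_name"
    fi
}

describe_tool fastqc
describe_tool java
```

If both resolve, fastqc --version and java -version report the versions in use, which is worth quoting alongside any error message.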


Are you trying to open the "results zip" file in FastQC or the original sequence?

The results zip file is intended for offline use, e.g. if you want to plot the nucleotide distribution using R or some other program. To view the FastQC results, you only need to open the sample.html file in a normal web browser.

modified 2.6 years ago • written 2.6 years ago by genomax

I can't find any sample.html file. Where is it supposed to be?

written 2.6 years ago by salamandra

When you run fastqc (e.g. fastqc sample.fq.gz), two result files should be produced in the same folder as the input (unless you use the -o option to put them in a different directory). One file should be sample.zip and the other sample.html. Are you running fastqc on the command line or using the GUI?
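
As a rough illustration of where to look: the result files listed further down this thread are named ERR385893_1_fastqc.html and ERR385893_1_fastqc.zip, so the names appear to carry a _fastqc suffix. The sketch below prints the expected names for one input. fastqc_report_names is a hypothetical helper, and it assumes that pattern and handles only .fastq and .fq inputs, optionally gzipped.

```shell
# fastqc_report_names is a hypothetical helper. It prints the two result
# file names expected for one input file, assuming the <name>_fastqc.html
# and <name>_fastqc.zip pattern seen in this thread.
fastqc_report_names() {
    report_base="$(basename "$1")"
    report_base="${report_base%.gz}"      # drop a trailing .gz, if any
    report_base="${report_base%.fastq}"   # drop .fastq, if present
    report_base="${report_base%.fq}"      # drop .fq, if present
    printf '%s_fastqc.html\n%s_fastqc.zip\n' "$report_base" "$report_base"
}

# Usage sketch: write results to a separate directory with -o.
# The directory is created first because FastQC expects it to exist.
#   mkdir -p qc_results
#   fastqc -o qc_results sample.fq.gz
#   ls qc_results
fastqc_report_names sample.fq.gz
```

If neither name shows up in the input folder (or in the -o directory), the run most likely failed before the report was written.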

modified 2.6 years ago • written 2.6 years ago by genomax

I tried both. With the command line, I can't get the sample.html file. With the GUI, the input file loads only to 95%; it doesn't open.

written 2.6 years ago by salamandra

It sounds like you are running out of memory. How big is the sequence file and how much memory do you have?
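
Both numbers can be collected in one go. The following is only a sketch: show_size_and_memory is a hypothetical helper, and it relies on the Linux free utility being installed (it skips the memory part otherwise).

```shell
# show_size_and_memory is a hypothetical helper: it prints the size of one
# file and, where the Linux `free` utility is available, a memory summary.
show_size_and_memory() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    echo "file size:"
    du -h "$1"
    if command -v free >/dev/null 2>&1; then
        echo "memory (KB):"
        free
    fi
}

# Usage sketch (the path is an example):
#   show_size_and_memory path_to_file/file.fastq
```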

written 2.6 years ago by genomax

The sequence file is 185.2 MB; free memory is 130944 KB.

written 2.6 years ago by salamandra

Both of those numbers sound odd. Have you managed to get FastQC to produce a report before, or is this the first time you are using it?

written 2.6 years ago by genomax

It's the first time I'm using it.

written 2.5 years ago by salamandra

Can you try a small test file (this is test data from EBI-ENA) so we can make sure FastQC is working correctly? Don't uncompress the sequence file after you download it (it is ~27MB).

Run the analysis on the command line: fastqc ERR385893_1.fastq.gz

This should produce:

Started analysis of ERR385893_1.fastq.gz
Approx 5% complete for ERR385893_1.fastq.gz
Approx 10% complete for ERR385893_1.fastq.gz
--some lines removed--
Approx 95% complete for ERR385893_1.fastq.gz
Analysis complete for ERR385893_1.fastq.gz

This should not take longer than a few minutes. You should then get two result files. Open the .html file using a browser to see the results.

352K ERR385893_1_fastqc.html
432K ERR385893_1_fastqc.zip
 27M ERR385893_1.fastq.gz
modified 2.5 years ago • written 2.5 years ago by genomax

It didn't work.

After reaching 95% complete, it printed:

Approx 95% complete for ERR385893_1.fastq.gz
Analysis complete for ERR385893_1.fastq.gz
Failed to process file ERR385893_1.fastq.gz
java.lang.IllegalArgumentException: No key called gc_sequence:ignore in the config data
	at uk.ac.babraham.FastQC.Modules.ModuleConfig.getParam(ModuleConfig.java:148)
	at uk.ac.babraham.FastQC.Modules.PerSequenceGCContent.ignoreInReport(PerSequenceGCContent.java:57)
	at uk.ac.babraham.FastQC.Report.HTMLReportArchive.startDocument(HTMLReportArchive.java:331)
	at uk.ac.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:84)
	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.analysisComplete(OfflineRunner.java:155)
	at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:110)
	at java.lang.Thread.run(Thread.java:745)
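
Note that "Analysis complete" is printed even though the report was never written, so that line alone is not a reliable success signal. One possible check, sketched below, is to save the output to a log and look for the "Failed to process file" line. fastqc_log_ok is a hypothetical helper, and the log file name is an example.

```shell
# fastqc_log_ok is a hypothetical helper. It succeeds only when the saved
# log is readable and contains no "Failed to process file" line (the
# wording is taken from the output quoted in this thread).
fastqc_log_ok() {
    [ -r "$1" ] || return 2
    ! grep -q "Failed to process file" "$1"
}

# Usage sketch (file names are examples):
#   fastqc ERR385893_1.fastq.gz > fastqc.log 2>&1
#   fastqc_log_ok fastqc.log && ls -lh ERR385893_1_fastqc.html
```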

modified 2.5 years ago • written 2.5 years ago by salamandra

It looks like you either have an incomplete or corrupt install of FastQC.

Did you download the FastQC v0.11.5 (Win/Linux zip file)? You may want to do this again.
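
The exception complains about a missing gc_sequence:ignore key in FastQC's configuration data. One way to probe the "incomplete install" idea is to check which fastqc the shell runs and whether the limits file that ships with that copy has a gc_sequence ... ignore line. This is a sketch under assumptions: has_ignore_key is a hypothetical helper, and both the Configuration/limits.txt location inside the unpacked FastQC folder and the whitespace-separated "key ignore value" layout should be verified against your own download.

```shell
# has_ignore_key is a hypothetical helper.
# $1 = path to a limits file, $2 = module key such as gc_sequence.
# It succeeds when a line starting with "<key> ignore" is present
# (assumed layout: key, whitespace, the word "ignore", whitespace, value).
has_ignore_key() {
    [ -r "$1" ] || return 2
    grep -Eq "^$2[[:space:]]+ignore[[:space:]]" "$1"
}

# Usage sketch (paths are examples, not known locations on your system):
#   command -v fastqc
#   has_ignore_key /path/to/FastQC/Configuration/limits.txt gc_sequence \
#       && echo "key present" || echo "key missing"
```

If the key is present in the freshly downloaded copy but the error persists, a different, older fastqc earlier on the PATH would be one thing to rule out.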

written 2.5 years ago by genomax

After re-installing FastQC, running the test sample you gave didn't show an error. But I can't find the file ERR385893_1_fastqc.html anywhere. Maybe it didn't produce the file, as it says: "Failed to process file ERR385893_1.fastq.gz"

It printed this:

Approx 95% complete for ERR385893_1.fastq.gz
Analysis complete for ERR385893_1.fastq.gz
Failed to process file ERR385893_1.fastq.gz
java.lang.IllegalArgumentException: No key called gc_sequence:ignore in the config data
	at uk.ac.babraham.FastQC.Modules.ModuleConfig.getParam(ModuleConfig.java:148)
	at uk.ac.babraham.FastQC.Modules.PerSequenceGCContent.ignoreInReport(PerSequenceGCContent.java:57)
	at uk.ac.babraham.FastQC.Report.HTMLReportArchive.startDocument(HTMLReportArchive.java:331)
	at uk.ac.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:84)
	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.analysisComplete(OfflineRunner.java:155)
	at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:110)
	at java.lang.Thread.run(Thread.java:745)

written 2.2 years ago by salamandra