Run FastQC on a Linux terminal as a Java script
6.6 years ago

Hello, I am trying to use FASTQC to assess the quality of RNA-seq data. I have installed FASTQC using:

sudo apt-get update
sudo apt-get install fastqc

Now when I type fastqc -version I get: FastQC v0.11.4. I am using Ubuntu Linux 16.04, so Java is already installed, and when I type java -version I get:

openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

I have a fastq file (SRR390728_1.fastq). The file is downloaded from the public domain and looks genuine, with the first lines being:

> @SRR390728.1 1 length=36 CATTCTTCACGTAGTTCTCGAGCCTTGGTTTTCAGC
> +SRR390728.1 1 length=36 ;;;;;;;;;;;;;;;;;;;;;;;;;;;9;;665142 @SRR390728.2 2 length=36 AAGTAGGTCTCGTCTGTGTTTTCTACGAGCTTGTGT
> +SRR390728.2 2 length=36 ;;;;;;;;;;;;;;;;;4;;;;3;393.1+4&&5&& @SRR390728.3 3 length=36 CCAGCCTGGCCAACAGAGTGTTACCCCGTTTTTACT
> +SRR390728.3 3 length=36
> -;;;8;;;;;;;,*;;';-4,44;,:&,1,4'./&1 @SRR390728.4 4 length=36 ATAAAATCAGGGGTGTTGGAGATGGGATGCCTATTT
> +SRR390728.4 4 length=36 1;;;;;;,;;4;3;38;8%&,,;)*;1;;,)/%4+,

When I run:

fastqc --extract  SRR390728_1.fastq

I obtain:

java.io.FileNotFoundException: /etc/fastqc/Configuration/adapter_list.txt (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at uk.ac.babraham.FastQC.Modules.AdapterContent.<init>(AdapterContent.java:75)
    at uk.ac.babraham.FastQC.Modules.ModuleFactory.getStandardModuleList(ModuleFactory.java:37)
    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:134)
    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:102)
    at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
Started analysis of SRR390728_1.fastq
java.io.FileNotFoundException: /etc/fastqc/Configuration/limits.txt (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at uk.ac.babraham.FastQC.Modules.ModuleConfig.readParams(ModuleConfig.java:87)
    at uk.ac.babraham.FastQC.Modules.ModuleConfig.<clinit>(ModuleConfig.java:35)
    at uk.ac.babraham.FastQC.Modules.PerTileQualityScores.processSequence(PerTileQualityScores.java:174)
    at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:88)
    at java.lang.Thread.run(Thread.java:748)
Approx 5% complete for SRR390728_1.fastq
Approx 10% complete for SRR390728_1.fastq
Approx 15% complete for SRR390728_1.fastq
Approx 20% complete for SRR390728_1.fastq
Approx 25% complete for SRR390728_1.fastq
Approx 30% complete for SRR390728_1.fastq
Approx 35% complete for SRR390728_1.fastq
Approx 40% complete for SRR390728_1.fastq
Approx 45% complete for SRR390728_1.fastq
Approx 50% complete for SRR390728_1.fastq
Approx 55% complete for SRR390728_1.fastq
Approx 60% complete for SRR390728_1.fastq
Approx 65% complete for SRR390728_1.fastq
Approx 70% complete for SRR390728_1.fastq
Approx 75% complete for SRR390728_1.fastq
Approx 80% complete for SRR390728_1.fastq
Approx 85% complete for SRR390728_1.fastq
Approx 90% complete for SRR390728_1.fastq
Approx 95% complete for SRR390728_1.fastq
Analysis complete for SRR390728_1.fastq
Failed to process file SRR390728_1.fastq
java.lang.IllegalArgumentException: No key called gc_sequence:ignore in the config data
    at uk.ac.babraham.FastQC.Modules.ModuleConfig.getParam(ModuleConfig.java:148)
    at uk.ac.babraham.FastQC.Modules.PerSequenceGCContent.ignoreInReport(PerSequenceGCContent.java:57)
    at uk.ac.babraham.FastQC.Report.HTMLReportArchive.startDocument(HTMLReportArchive.java:331)
    at uk.ac.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:84)
    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.analysisComplete(OfflineRunner.java:155)
    at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:110)
    at java.lang.Thread.run(Thread.java:748)

I obtain a zip file (6.5 kb) that cannot be opened (archive manager gives the error: 'An error occurred while loading the archive'). Same thing if I don't use the '--extract' option; if I use the interactive method, the program gets stuck at 95% processing.

What went wrong? It looks like FastQC does not like my Java distro.

Please note that 'fastqc' is a perl script in /usr/bin/ and in the same folder, 'java' is a link to '/etc/alternatives/java'.

Thank you

RNA-Seq • 20k views

That doesn't look like a valid fastq file to me, or your example is malformed after you posted it here.


the '>' comes from the formatting within this page, the structure is more like:

@SRR390728.1 1 length=36
CATTCTTCACGTAGTTCTCGAGCCTTGGTTTTCAGC
+SRR390728.1 1 length=36
;;;;;;;;;;;;;;;;;;;;;;;;;;;9;;665142
@SRR390728.2 2 length=36
AAGTAGGTCTCGTCTGTGTTTTCTACGAGCTTGTGT
+SRR390728.2 2 length=36
;;;;;;;;;;;;;;;;;4;;;;3;393.1+4&&5&&
@SRR390728.3 3 length=36
CCAGCCTGGCCAACAGAGTGTTACCCCGTTTTTACT

etc. I downloaded it with the command: prefetch SRR390728 and extracted with fastq-dump --split-files ./SRR390728.sra; in this example, I am using only one of the two files generated.
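If you want to rule out a malformed file before blaming FastQC, a quick layout check can help. The sketch below is not part of FastQC or the SRA toolkit; `check_fastq_layout` is a hypothetical helper name, and it assumes the plain four-line record layout shown above (header, sequence, separator, quality), so it would wrongly reject a FASTQ file whose sequences wrap over several lines.

```shell
# Check that a FASTQ file follows the four-line record layout:
# @header, sequence, +separator, quality (same length as the sequence).
# Prints the first problem found and returns non-zero if there is one.
check_fastq_layout() {
    awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: expected a header starting with @\n", NR
            bad = 1; exit
        }
        NR % 4 == 2 { seq_len = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: expected a separator starting with +\n", NR
            bad = 1; exit
        }
        NR % 4 == 0 && length($0) != seq_len {
            printf "line %d: quality length differs from sequence length\n", NR
            bad = 1; exit
        }
        END {
            # A line count that is not a multiple of four means a cut-off record.
            if (!bad && NR % 4 != 0) {
                printf "truncated record: file has %d lines\n", NR
                bad = 1
            }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

Usage would be something like `check_fastq_layout SRR390728_1.fastq && echo "layout looks fine"`. Positions are checked by line number rather than by first character alone, because a quality line may legitimately start with `@`.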


I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.


Relevant lines from OP error:

java.io.FileNotFoundException: /etc/fastqc/Configuration/adapter_list.txt (No such file or directory)
java.io.FileNotFoundException: /etc/fastqc/Configuration/limits.txt (No such file or directory)

Download the files from: https://github.com/ENCODE-DCC/file-validation-pipeline/tree/master/dnanexus/fastqc-exp/resources/usr/bin/FastQC/Configuration or from here: http://web.cbio.uct.ac.za/~emile/AGe/AGe_NGS/soft/FastQC/Configuration/ or from any other source that has the missing files.

Put them in /etc/fastqc/Configuration/ and re-run fastqc. If you would rather not replace these files with copies from the internet, purge the package and either download the binary from the FastQC website or set up brew on your Linux machine and run brew install fastqc. You should have checked whether the files exist in those folders, or listed the installed files either from the CLI (sudo dpkg -L fastqc) or via Synaptic.
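As a rough sketch of that check, the function below reports which of the two files named in the error trace are present. `check_fastqc_config` is a hypothetical helper name, not something shipped with the package, and the default directory is simply the path from the FileNotFoundException lines.

```shell
# Report which FastQC configuration files exist in a directory.
# The two file names come from the FileNotFoundException lines in the
# error output; the default directory is the path shown in that output.
# Returns non-zero if at least one file is missing.
check_fastqc_config() {
    config_dir="${1:-/etc/fastqc/Configuration}"
    missing=0
    for name in adapter_list.txt limits.txt; do
        if [ -f "$config_dir/$name" ]; then
            echo "found:   $config_dir/$name"
        else
            echo "missing: $config_dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_fastqc_config` with no argument looks in /etc/fastqc/Configuration; comparing its output with `dpkg -L fastqc` should show whether the package ever installed those files.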

6.6 years ago

The method you used for installing FastQC is not the one recommended by the authors: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/INSTALL.txt


That was it! I downloaded the distro from here, changed the mode of the perl file with chmod 755 fastqc and it does work perfectly. Thank you! (this tip is instead obviously wrong)
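For anyone repeating this, the chmod-and-run step can be sketched as below. It assumes the download has already been unpacked and that the `fastqc` wrapper script sits at the top of that folder; `run_unpacked_fastqc` and the folder path in the usage line are hypothetical names, not part of FastQC.

```shell
# Make the launcher of an unpacked FastQC download executable, then run it.
# First argument: the folder containing the unpacked files (an assumption:
# the fastqc wrapper script is expected directly inside it).
# Remaining arguments are passed to fastqc unchanged.
run_unpacked_fastqc() {
    fastqc_dir="$1"
    shift
    chmod 755 "$fastqc_dir/fastqc" || return 1
    "$fastqc_dir/fastqc" "$@"
}
```

For example, `run_unpacked_fastqc "$HOME/FastQC" --extract SRR390728_1.fastq` would run the same command as in the question, but through the unpacked copy instead of the one installed by apt-get.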

6.6 years ago
h.mon 35k

This is a known and reported bug with Ubuntu packaging of FastQC - and this question keeps arising again and again. As Wouter and you noted, FastQC comes in a self-contained download which is really easy to install.


Good afternoon, and thanks for the information on that (I use Ubuntu 14.04). I rarely install these tools through Ubuntu's Aptitude because it also makes versions difficult to control.
