FastQC HTML Report to PDF (with a Script)
7
14
10.3 years ago
Caddymob ▴ 1000

Hey there,

Does anybody have a solution for converting FastQC HTML output to PDF? If you've got 50 or 500 FASTQs to check and, more importantly, share via email, the HTML output is a little clunky to deal with.

Pierre Lindenbaum suggested Apache FOP on Twitter. I tried it, but it seems to need an XSLT stylesheet to work.

I also tried wkhtmltopdf on my Linux machine and wkpdf on my MacBook. Both produced blank PDFs.

Opening each HTML file one by one and printing to PDF on my Mac works, but it is a really slow option.

The point is that I want to script this, ideally on Linux, and end up with a bunch of PDFs.

Posted below are two links with test data: the first shows what the output looks like in a browser, and the second is the full output from FastQC.

Thanks!

fastQC example page

zip download of output

fastqc script • 9.9k views
1

Why non-interactive FastQC output comes in HTML in the first place is something I think I'll never understand.

7
10.3 years ago
Neilfws 49k

Surprised no-one has mentioned HTMLDOC. For Ubuntu and similar simply:

sudo apt-get install htmldoc

then:

htmldoc --webpage -f output.pdf index.html

or just "htmldoc" for the GUI.

1

Yes, this works nicely too! I played with the options to make things fit better and ended up running this:

htmldoc --webpage --browserwidth 800 --fontsize 7 -f output.pdf fastqc_report.html
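
To script this over many reports, the htmldoc command can be wrapped in a loop. This is a minimal sketch, not something posted in the thread: it assumes each report sits in its own <sample>_fastqc directory as fastqc_report.html, and the function name batch_htmldoc and the <sample>_fastqc.pdf output naming are made up for illustration.

```shell
#!/bin/bash
# Sketch: convert every <sample>_fastqc/fastqc_report.html under a directory
# to <sample>_fastqc.pdf with htmldoc, using the options from the reply above.
# batch_htmldoc and the output naming are assumptions, not part of htmldoc.
batch_htmldoc() {
    root=${1:-.}
    for report in "$root"/*_fastqc/fastqc_report.html; do
        [ -f "$report" ] || continue                 # glob matched nothing
        out="${report%/fastqc_report.html}.pdf"      # x_fastqc/... -> x_fastqc.pdf
        echo "converting $report -> $out"
        htmldoc --webpage --browserwidth 800 --fontsize 7 -f "$out" "$report" ||
            echo "htmldoc failed on $report" >&2     # keep going on errors
    done
}
```

After sourcing the file, run it as: batch_htmldoc /path/to/fastqc_output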

0

I installed this on macOS Mojave (10.14.13) and it works, but the output is all black and white and has no plots :/

6
10.3 years ago

I just installed wkhtmltopdf and used it on your html file with this command:

wkhtmltopdf 20A.R2.QC.fq_fastqc/fastqc_report.html test.pdf

And I got a test.pdf file back with the correct contents. Here is the pdf file uploaded to imgur: http://imgur.com/f4fCz

Imgur converted the .pdf to .png so the quality is not great.

1

I get a "QPixmap: Cannot create a QPixmap when no GUI is being used" error. This seems to be a bug. I'm running on a 64-bit CentOS machine. Curious what you ran it on?

0

That's crazy. I do that and I get a header and nothing else: http://public.tgen.org/jcorneveaux/FASTQC/test.pdf

0

I ran it on Ubuntu 11.04 64 bit.
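
The QPixmap error reported above is commonly seen when wkhtmltopdf runs on a machine with no X display, for example a server reached over SSH. A workaround often suggested for that situation, and not tested by anyone in this thread, is to run it under a virtual display with xvfb-run. A minimal sketch, assuming xvfb-run is installed; the function name html2pdf_headless is hypothetical:

```shell
#!/bin/bash
# Sketch: run wkhtmltopdf directly when a display exists, otherwise wrap it
# in xvfb-run (virtual X server). html2pdf_headless is a made-up helper name.
html2pdf_headless() {
    in=$1
    out=$2
    if [ -n "${DISPLAY:-}" ] || ! command -v xvfb-run >/dev/null 2>&1; then
        # A display is available, or there is no xvfb-run to fall back on.
        wkhtmltopdf "$in" "$out"
    else
        # -a picks a free X server number automatically.
        xvfb-run -a wkhtmltopdf "$in" "$out"
    fi
}
```

For example: html2pdf_headless 20A.R2.QC.fq_fastqc/fastqc_report.html test.pdf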

6
10.3 years ago
Rm 8.1k

This is the script (qcimg2pdf.sh) I use as part of my FASTQ workflow. It uses some of the images from FastQC. Run it in the parent directory that holds the FastQC output directories for the different lanes.

#!/bin/bash
## qcimg2pdf.sh: tile four FastQC plots per sample onto one page, then merge all pages.
## Run it in the directory that contains the *fastqc output directories.

use ghostscript-9.02  ## site-specific command to load Ghostscript; comment this line out if gs is already in your PATH

usage() {
    echo "Usage: $0 -o output_prefix"
    echo "Run it where you have *fastqc directories"
    exit 1
}

outprefix=""
while getopts o: option
do
    case $option in
        o) outprefix=$OPTARG ;;
        *) usage ;;
    esac
done

if [[ -z $outprefix ]]; then
    echo "use correct arguments with only -o"
    usage
fi
echo "$outprefix"

for j in *fastqc/
do
    j=${j%/}                  # drop the trailing slash from the glob
    [[ -d $j ]] || continue   # nothing matched
    echo "$j"

    # Top row: per-base quality and GC content; bottom row: per-sequence quality and GC content.
    # The "Filename" line of fastqc_data.txt and the five lines after it are drawn as a text label.
    convert \( -scale 500x500 "$j/Images/per_base_quality.png" "$j/Images/per_base_gc_content.png" +append \) \( -scale 500x500 "$j/Images/per_sequence_quality.png" "$j/Images/per_sequence_gc_content.png" +append \) -append -font Helvetica -pointsize 12 -gravity northeast -draw "translate +5,+5 text 80,80 '$(grep -A5 Filename "$j/fastqc_data.txt")'" "QC.$j.pdf"
done

# Merge the per-sample pages into a single PDF.
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="qc-lanes.$outprefix.pdf" QC.*.pdf

0

Brilliant solution, and just the simple kind of approach I needed, thx RM!

0

I'm not sure I understand how this works. What arguments do you need to give the script?

0

In my case I ran it on Linux and it produced a PDF with three plots; it didn't properly convert the whole report into PDF.

3
10.3 years ago

I would be interested in extracting the raw FastQC data as generic tables to render plots in R.

If anyone would like to participate I could create a github repository for this "project".

2

I think the fastqc_data.txt file has much of it..
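
To illustrate the point about fastqc_data.txt: in the FastQC versions I'm aware of, the file is a series of modules, each opened by a line like ">>Per base sequence quality<TAB>pass", followed by "#"-prefixed header lines and tab-separated rows, and closed by ">>END_MODULE". A minimal sketch that pulls one module out as a plain table; the function name fastqc_module is made up, and modules with more than one "#" line will yield more than one header-like row:

```shell
#!/bin/bash
# Sketch: print one module of fastqc_data.txt as tab-separated text.
# fastqc_module is a hypothetical helper; the module name must match the
# text after ">>" exactly (e.g. "Per base sequence quality").
fastqc_module() {
    module=$1
    file=$2
    awk -v m="$module" -F'\t' '
        $0 == ">>END_MODULE" { inside = 0; next }                   # module closed
        /^>>/                { inside = ($1 == (">>" m)); next }    # module opened
        inside               { sub(/^#/, ""); print }               # strip "#" from header lines
    ' "$file"
}
```

For example, fastqc_module "Per base sequence quality" x_fastqc/fastqc_data.txt > per_base_quality.tsv gives a file that R's read.delim should be able to load.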

2

I did this a while ago, but never maintained it - you're free to cannibalise as much as you want/need! https://github.com/clark-lab-robot/Repitools-git/blob/master/pkg/Repitools/R/FastQC-class.R

1

There is also a Bioconductor package called qrqc that will get you some of the FastQC stats as well. The nice thing about the package is that it processes the reads "online" in C code (you don't have to load the entire file), as FastQC does.

0

Yes, been testing qrqc too..

0

Thanks Aaron, looks like some good stuff :)

3
10.3 years ago

Here is the XSLT stylesheet for FO:

The HTML document is not a valid XML document, so I used xsltproc with --html to parse it before passing the result to FOP. Here is the Makefile:

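all: fastqc.pdf

# Recipe lines must be indented with a tab, not spaces.
# xsltproc --html parses the input as HTML, so the non-XML report can be transformed.
fastqc.fo: fastqc2fo.xsl fastqc_report.html
	xsltproc --html fastqc2fo.xsl fastqc_report.html > $@

fastqc.pdf: fastqc.fo
	fop $< $@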

The result was posted on slideshare: http://www.slideshare.net/lindenb/biostar17037

Edit: my output is missing one or two tables but you get the idea.

2
10.3 years ago
Tyler Moore ▴ 20

My company, Expected Behavior, has a service called DocRaptor that converts HTML to PDF or Excel format. Unlike wkhtmltopdf, DocRaptor generates fully functional PDF files, not just a PNG.

Here's a link to the home page:

http://docraptor.com/

And a link to the code example page. You make an HTTP POST request to DocRaptor's server, and we send your file back to you.

http://docraptor.com/examples

1
10.3 years ago

I just tried HTML to LaTeX and after a few minutes I decided that you are right: there must be an easier way.

@DK's approach seems the best.

http://htmltolatex.sourceforge.net/

0

yea... I just wish it actually worked for me..
