FAIRE-seq - actively maintained peak-callers
4.2 years ago
boczniak767 ▴ 740

Hi all,

Is there any actively maintained software suitable for FAIRE-seq (or DNase-seq, which I believe poses similar problems) peak calling?

I think I did my 'homework', and it seems that MACS2 along with IDR is used quite frequently, but not everyone agrees that it handles broad peaks properly. I've used it and got quite reasonable peak numbers (usually tens or hundreds of thousands).

I've also used HOMER with the -style factor option; the number of peaks is also reasonable (it depends, of course, on the p/q-value used).

Installing/running ZINBA or F-seq failed at some point and it seems that both are abandoned.

What are your suggestions? Are you aware of any 'fresh' package for FAIRE/DNase-seq, ideally one that supports biological replicates?

Also, what is your view on summarizing biological replicates? Is IDR appropriate? And when comparing samples, do you look for unique peaks, or do you take a quantitative point of view and compare read counts in shared peaks?

Best Regards, Maciej
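
To make the "unique vs. shared peaks" question concrete, here is a minimal sketch of how peaks from one replicate (or sample) could be split into those that overlap a peak in another and those that do not. It is not IDR and not what any of the tools above do internally; the function name `split_shared_unique` and the toy coordinates are made up for illustration, and peaks are assumed to be BED-style (0-based, half-open) intervals.

```python
from collections import defaultdict

def split_shared_unique(peaks_a, peaks_b):
    """Split peaks_a into peaks that overlap any peak in peaks_b and peaks that do not.

    Peaks are (chrom, start, end) tuples, assumed 0-based and half-open (BED style).
    This is a naive all-against-all comparison per chromosome, fine for a sketch.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    shared, unique = [], []
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        hit = any(start < b_end and b_start < end for b_start, b_end in by_chrom[chrom])
        (shared if hit else unique).append((chrom, start, end))
    return shared, unique

# Toy data (hypothetical coordinates)
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
rep2 = [("chr1", 150, 250), ("chr2", 80, 120)]
shared, unique = split_shared_unique(rep1, rep2)
print("shared:", shared)
print("unique:", unique)
```

For the quantitative view, the shared set would then be the regions in which reads are counted per sample; on real data an interval tool such as bedtools intersect is likely a better choice than this quadratic loop.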

ChIP-Seq FAIRE-seq peaks • 1.5k views

My 2p about ZINBA and F-Seq. ZINBA: I also had trouble and not enough patience to make it work. I was happy with F-Seq, which worked fine out of the box and gave reasonable results.


I tried F-seq without success: it installed correctly (according to the command line) but didn't run. I'll try it on a different computer.

This is an old package (it seems the last significant commit was ten years ago). The problem with packages that are no longer updated is that they may require old versions of Java/Python/Perl/etc., so installing a bunch of different programs could ruin the computer.


The nice thing about F-Seq is that it's Java with a shell script as interface. So it should run on virtually any *nix machine without additional dependencies. What problems do you see?

(But yes, the statement "it has no dependencies" fails more often than it should...!)


It is a problem of the kind I mentioned: I have OpenJDK (IcedTea) and F-seq requires another Java version (I suppose the one from Oracle).

BUILD FAILED
/home/mj/bin/F-seq/build.xml:59: Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/usr/lib64/jvm/java-1.8.0-openjdk-1.8.0/jre"


It seems to me that you are trying to compile F-Seq from source but you don't have a Java compiler (javac) for that.

However, you don't need to compile F-Seq yourself, just get the compiled version from http://fureylab.web.unc.edu/software/fseq/ (see download link for v 1.84). The downloaded zip file contains the compiled version of fseq, the shell script to run it, and a README file with some instructions.

If the command java -version doesn't throw any error and shows a version greater than 1.5 (probably yes, 1.5 is quite old), then you should be fine.
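
If you do want to build from source, the missing-compiler diagnosis can be checked quickly. The sketch below is only an illustration under assumptions: the helper name `find_javac` is made up, and the "look one directory above a trailing jre folder" rule is a guess based on the JAVA_HOME path in the error message, not something Ant or F-Seq documents.

```python
import os
import shutil

def find_javac(java_home=None):
    """Return a path to javac, or None if no compiler can be found.

    Looks in JAVA_HOME/bin first, then (assumption) in the parent directory
    when JAVA_HOME ends in 'jre', and finally falls back to PATH.
    """
    java_home = java_home or os.environ.get("JAVA_HOME", "")
    candidates = []
    if java_home:
        home = os.path.normpath(java_home)
        candidates.append(os.path.join(home, "bin", "javac"))
        if os.path.basename(home) == "jre":
            # A JDK layout often has the compiler one level above its bundled jre/.
            candidates.append(os.path.join(os.path.dirname(home), "bin", "javac"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("javac")  # None when javac is not on PATH either

print(find_javac() or "no javac found: a JDK (not just a JRE) is needed to compile")
```

If this reports no javac, only a JRE is installed, which is enough to run the precompiled F-Seq but not to build it with ant.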


The installation from the git repo using ant was successful (with some warnings that don't look serious):

[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.5

[javac] warning: [options] source value 1.5 is obsolete and will be removed in a future release

[javac] warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.

But trying to run it gives: Error: Could not find or load main class edu.duke.igsp.gkde.Main


v 1.84 works! But it gives an incredibly high number of peaks; maybe using a background directory would help. But how do I convert FASTA files with chromosome sequences to wig (to be converted further to bff by bffBuilder)?


A high number of peaks, on the order of hundreds of thousands, is expected for FAIRE-Seq. Have a look at the profiles to see if they look convincing. (By the way, peak calling is pretty much black magic.)


But there are 50k-100k peaks for each chromosome (!). My organism is maize, with a big genome (close to human in size), but these numbers are certainly too high. Of course, I know the peak numbers depend on the p/q-value, but each program models the background in some way, so I'm surprised that the results from F-seq are so different from those of MACS2 or HOMER (these two give comparable numbers of peaks in most cases). So regarding F-seq with a background, I'd definitely try it, but constructing the BFF file is not straightforward. I see that the lack of well-established standards makes peak calling black magic. This is particularly true for chromatin-level studies, which are analysed with software tailored mainly for ChIP-seq. Thank you @dariober for your time.
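
One way to put the per-chromosome numbers from different callers side by side is a small summary like the sketch below. It assumes each caller's peaks have been exported as BED-style lines (chrom, start, end in the first three columns); the function name `summarize_peaks` and the example lines are made up, and the covered-bases total counts overlapping peaks twice, so it is only a rough figure.

```python
from collections import Counter

def summarize_peaks(bed_lines):
    """Return (peak count per chromosome, total bp inside peaks) from BED-style lines."""
    counts, covered = Counter(), 0
    for line in bed_lines:
        # Skip blank lines and common BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        counts[chrom] += 1
        covered += int(end) - int(start)  # overlapping peaks are counted twice
    return counts, covered

# Toy input (hypothetical peaks); a real file could be passed as an open file handle.
example = ["chr1\t100\t250", "chr1\t900\t1000", "chr2\t10\t60"]
counts, covered = summarize_peaks(example)
print(dict(counts), covered)
```

Comparing the covered bases as well as the raw counts can show whether one caller is reporting many narrow peaks where another reports fewer, wider ones.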

4.2 years ago
EagleEye 6.8k

I was a fan of MACS2 and HOMER until last month. I came across a tool called PeakRanger; the CCAT algorithm from the PeakRanger package works great for broad peaks.

Information from Ensembl about CCAT algorithm:

"The algorithm described by Xu et al, is specifically designed for peak calling of broad features, such as H3K36me3. The parameters were set empirically to (fragmentSize 200; slidingWinSize 150; movingStep 20; isStrandSensitiveMode 0; minCount 10; outputNum 100000; randomSeed 123456; minScore 4.0; bootstrapPass 50)."


Thanks, but I don't have input control.


HOMER has been updated


@EagleEye, could you please post the options you used to call FAIRE peaks with HOMER?


Hi, I personally do not have experience analyzing FAIRE-seq (the tools I mentioned above are general tools I tried for ChIP-seq). You can check the following links and see if they help.

Encode pipelines

A Comparison of Peak Callers Used for DNase-Seq Data

This article will be useful if you decide to go for ZINBA.