I successfully ran the first three HINT-ATAC commands shown below, following this tutorial: https://github.com/sufyazi/sufyazi.github.io/wiki/TF-Footprinting-Tutorial-using-HINT-ATAC-module-from-RGT-toolbox
rgt-hint footprinting --atac-seq --paired-end --organism=mm10 --output-location=/XXX --output-prefix=Afootprints A.mRp.clN.sorted.bam A.mRp.clN_peaks.narrowPeak
rgt-hint footprinting --atac-seq --paired-end --organism=mm10 --output-location=/XXX --output-prefix=Bfootprints B.mRp.clN.sorted.bam B.mRp.clN_peaks.narrowPeak
rgt-motifanalysis matching --organism=mm10 --input-files Afootprints.bed Bfootprints.bed
But when I ran the command below for the differential analysis, it ran for 17 hours without finishing, which seemed odd to me. Has anyone had any experience with this?
rgt-hint differential --organism=mm10 --bc --nc 30 --mpbs-files=./match/Afootprints_mpbs.bed,./match/Bfootprints_mpbs.bed --reads-files=A.mRp.clN.sorted.bam,B.mRp.clN.sorted.bam --conditions=A,B --output-prefix=DIFF
I ended up killing the submitted job, and I was wondering whether my value for --nc was the issue.
The --nc flag allows for parallel execution of the job, but how do I know what number to set it to?
Also, could the job submission script be the issue, in terms of the memory, nodes, and processors that I ask for?
#!/bin/bash
#SBATCH -A OOOOO
#SBATCH -p OOOOO
#SBATCH -t 24:00:00
#SBATCH --mem=40GB
#SBATCH --chdir=OOOOO
#SBATCH -o "%x.o%j.log"
#SBATCH --nodes=1
#SBATCH -n 10
#SBATCH --job-name=footprinting.ATAC
#SBATCH --mail-user=OOOO
#SBATCH --mail-type=BEGIN,END,FAIL
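One mismatch I can see between the two: the script requests 10 tasks with -n 10, while the command asks for 30 cores with --nc 30. Below is a minimal sketch of how the two could be kept in sync. It rests on some assumptions: that rgt-hint differential runs as a single multi-core process on one node, so the cores are requested with --cpus-per-task rather than -n, and that --nc should not exceed what SLURM grants. The NC and CMD variable names are just placeholders for illustration, and the remaining #SBATCH lines from my script are left out for brevity.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10   # assumption: one process using 10 cores
#SBATCH --mem=40GB

# Use the core count SLURM actually granted; fall back to 1 outside SLURM.
NC="${SLURM_CPUS_PER_TASK:-1}"

# Build the command as a string so it can be checked before the real run.
CMD="rgt-hint differential --organism=mm10 --bc --nc ${NC} \
--mpbs-files=./match/Afootprints_mpbs.bed,./match/Bfootprints_mpbs.bed \
--reads-files=A.mRp.clN.sorted.bam,B.mRp.clN.sorted.bam \
--conditions=A,B --output-prefix=DIFF"

# Dry run: print the command. Replace echo with eval to actually run it.
echo "$CMD"
```

With this layout, changing --cpus-per-task is the only edit needed to change the degree of parallelism, since --nc follows it automatically.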
I apologize because I know these questions may be very simple. I'm still trying to get the hang of bioinformatics :)
Thanks for the help!