Hi all,
I'm trying to identify differentially regulated peaks in my ChIP-seq data using the package THOR ( http://www.regulatory-genomics.org/thor-2/basic-intrstruction/ ). As I mentioned in the title, I'm having trouble configuring the RGT data folder for Arabidopsis, which is one of the requirements for running THOR.
So far I've found the genome fasta file, the chromosome size file, and the gene annotation gtf file, all of which are TAIR10 based. However, I cannot find the gene_regions file or the gene_alias file. The required files are listed at the link below. http://www.regulatory-genomics.org/rgt/rgt-data-folder/
This may be a very basic question, but I would be very grateful for your help. Thank you in advance.
Best, Tatsuya
You should be able to generate the gene_regions file from the gtf (e.g. using bedops as suggested in https://www.biostars.org/p/56280/). For the gene_alias file you could try "cheating" and just give a tab-separated file containing one row for each gene with all three columns having the same Ensembl gene identifier (which seems to be identical to the TAIR AGIs). Or, you could play with this https://www.arabidopsis.org/download_files/Genes/gene_aliases_20130831.txt (possibly needs to be modified).
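If you would rather not install bedops, the two steps above can be sketched in plain Python. This is only a rough sketch under some assumptions: that your gtf contains lines with feature type "gene" and a gene_id attribute, that RGT accepts a six-column BED (chrom, start, end, gene ID, score, strand) as the gene_regions file, and that the alias file is three tab-separated columns. The function names and file names here are made up for illustration, so check the result against the files RGT ships for human or mouse before relying on it.

```python
import re

# Matches the gene_id attribute in column 9 of a GTF line.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gtf_to_gene_bed(gtf_lines):
    """Yield one BED6 tuple per 'gene' feature found in the GTF lines."""
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only gene-level records
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue  # no gene_id attribute, nothing to name the region
        # GTF is 1-based inclusive; BED is 0-based half-open.
        start = int(fields[3]) - 1
        end = int(fields[4])
        yield (fields[0], start, end, match.group(1), 0, fields[6])


def alias_rows(gene_ids):
    """The 'cheating' alias table: the same identifier in all three columns."""
    for gene_id in gene_ids:
        yield (gene_id, gene_id, gene_id)


def write_tsv(rows, path):
    """Write tuples to a tab-separated file, one row per line."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(str(value) for value in row) + "\n")


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own paths.
    with open("genes_tair10.gtf") as gtf:
        bed_rows = list(gtf_to_gene_bed(gtf))
    write_tsv(bed_rows, "genes_tair10.bed")
    write_tsv(alias_rows(row[3] for row in bed_rows), "alias_tair10.txt")
```

If your gtf only has exon or transcript lines and no "gene" lines, this will produce empty files, and you would need to collapse the exons per gene_id instead.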