Question

Does anyone know how to condense genes in HOMER? (-condenseGenes is not working properly for me)

6.3 years ago

baunruh ▴ 10

Hello,

I am using the mm9 UCSC gene annotation with HOMER to quantify gene-level read counts using the analyzeRepeats.pl command, with the -count cds, -tpm, and -d options. All of these options work fine and I get output, but I want to condense my genes so that all isoforms of a gene are reported on a single annotation line. HOMER provides the command-line option -condenseGenes, which should condense them by gene_id. My GTF file didn't appear to have a gene_id attribute, so I had to add it afterwards.

When I run with the -condenseGenes option, the error output says that the mm9.fa FASTA file I am using can no longer be found, whereas HOMER has no trouble finding mm9.fa when I run without -condenseGenes.

Does anyone have a suggestion for how I can condense these genes?

RNA-Seq rna-seq software error alignment • 1.9k views

ADD COMMENT • link 6.3 years ago by baunruh ▴ 10

Can you assist us by pasting some of the commands that you are using? Also, some entries from your GTF?

Note that the GTFs from GENCODE have gene_id: http://www.gencodegenes.org/mouse_releases/
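
To check quickly whether a GTF actually carries gene_id attributes, something like the following sketch may help. It assumes the standard 9-column, tab-separated GTF layout with attributes written as `key "value";`. The function name `gene_id_report` and the two example records are made up for illustration and are not taken from the poster's file.

```python
import re

# Matches a GTF-style attribute such as: gene_id "GeneA";
GENE_ID_RE = re.compile(r'gene_id "([^"]*)"')


def gene_id_report(gtf_lines):
    """Count feature lines with and without a non-empty gene_id attribute."""
    with_id = 0
    without_id = 0
    for line in gtf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 9:
            without_id += 1  # malformed line: no attribute column at all
            continue
        match = GENE_ID_RE.search(fields[8])
        if match and match.group(1):
            with_id += 1
        else:
            without_id += 1
    return with_id, without_id


# Hypothetical records: the first has a gene_id, the second does not.
example = [
    'chr1\tucsc\texon\t100\t200\t.\t+\t.\tgene_id "GeneA"; transcript_id "tx1";',
    'chr1\tucsc\texon\t300\t400\t.\t+\t.\ttranscript_id "tx2";',
]
print(gene_id_report(example))
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `gene_id_report(open(path))`, where `path` is your GTF.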

ADD REPLY • link 6.3 years ago by Kevin Blighe 88k

Thank you for your response!

I am using the following command:

analyzeRepeats.pl /home/mm9.gtf /home/mm9 -count cds -tpm -d /home/RP_ZT0 > RP_ZT0_Homer.out

This command works perfectly fine, but when I add -condenseGenes as follows:

analyzeRepeats.pl /home/mm9.gtf /home/mm9 -condenseGenes -count cds -tpm -d /home/RP_ZT0 > RP_ZT0_Homer.out

I get an error that says "cant find mm9 genome, assuming mm9 is the name of the organism", which then causes the run to fail further down the line.

ADD REPLY • link 6.3 years ago by baunruh ▴ 10

That's strange - it looks like a bug in the program.

Did you try to move the -condenseGenes parameter to different positions in the command line?

ADD REPLY • link 6.3 years ago by Kevin Blighe 88k

I tried different positions in the command line, and that had no effect. However, I am running this on a computer cluster, and that may be causing some problems. My university has a few different clusters, and when I ran it on another one I got a different error: it was unable to recognize the option at all and said it was unable to find command -c, as if the option had been cut off. I think I will just have to find a way around this and use the raw reads to normalize and condense the genes myself.

ADD REPLY • link 6.3 years ago by baunruh ▴ 10

Okay, are you pasting the commands from a Windows / Mac text file into a terminal window connected to the cluster? Formatting issues are common, such as hidden line endings, tabs, etc. Also, sometimes a hyphen is not quite a hyphen...
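
One way to look for such hidden characters is to scan the pasted command for anything outside plain ASCII, plus carriage returns and tabs. The sketch below is only an illustration: the helper name `suspicious_chars` is made up, and the `pasted` string is a constructed example with an en dash and a trailing carriage return, not the poster's actual input.

```python
import unicodedata


def suspicious_chars(command):
    """Return (position, character, Unicode name) for characters that
    commonly break commands pasted from word processors or Windows files."""
    found = []
    for i, ch in enumerate(command):
        if ord(ch) > 127 or ch in "\r\t":
            # Control characters have no Unicode name, so supply a fallback.
            name = unicodedata.name(ch, "CONTROL CHARACTER")
            found.append((i, ch, name))
    return found


# Constructed example: an en dash (U+2013) in place of the hyphen before
# condenseGenes, and a Windows-style carriage return at the end.
pasted = "analyzeRepeats.pl /home/mm9.gtf /home/mm9 \u2013condenseGenes -count cds\r"
for pos, ch, name in suspicious_chars(pasted):
    print(pos, repr(ch), name)
```

A command typed directly in the terminal should produce no output from this check.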

You should literally just check to see if the reference genome FASTA exists where you are running the command. You may not have root access, but you should be able to read file listings.

ADD REPLY • link 6.3 years ago by Kevin Blighe 88k

I'm just going to get the raw read counts so I can add the genes up myself. This seems to be the best option at this point. I can calculate TPM from there.
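
A minimal sketch of that workaround, under stated assumptions: the raw counts are already parsed into (gene_id, transcript_id, count) tuples, and one length per gene has been chosen. The function names, the example values, and the choice of gene length are all hypothetical. Note also that summing isoform counts can count a read more than once if it overlaps several isoforms of the same gene, so treat this as an approximation rather than a replacement for -condenseGenes.

```python
from collections import defaultdict


def condense_counts(rows):
    """Sum raw counts over all isoforms that share a gene ID.

    rows: iterable of (gene_id, transcript_id, raw_count) tuples.
    """
    totals = defaultdict(float)
    for gene_id, _transcript_id, count in rows:
        totals[gene_id] += count
    return dict(totals)


def tpm(counts, lengths_bp):
    """Convert per-gene counts to TPM, given one length in bp per gene."""
    # Reads per kilobase for each gene.
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    scale = sum(rates.values())
    if scale == 0:
        return {g: 0.0 for g in counts}
    # Rescale so that the values sum to one million.
    return {g: rate / scale * 1e6 for g, rate in rates.items()}


# Made-up example: GeneA has two isoforms, GeneB has one.
rows = [("GeneA", "tx1", 30), ("GeneA", "tx2", 10), ("GeneB", "tx3", 60)]
lengths = {"GeneA": 2000, "GeneB": 1000}  # assumed gene lengths in bp

gene_counts = condense_counts(rows)
print(gene_counts)
print(tpm(gene_counts, lengths))
```

Which length to use per gene (longest isoform, merged CDS, and so on) is a judgment call and will change the TPM values.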

ADD REPLY • link 6.3 years ago by baunruh ▴ 10

I'm starting to think that may be the problem.

ADD REPLY • link 6.3 years ago by baunruh ▴ 10

You should report this to the developers as a potential bug. Just be sure that -condenseGenes is indeed compatible with the command that you're running.

ADD REPLY • link 6.3 years ago by Kevin Blighe 88k

I figured it out: I had to download some dependencies that I did not have. Basically, I had to walk through the configuration file, and there were optional downloads that did not come with the base download.