Is anyone familiar with how TopHat runs Bowtie? I am assuming it uses the default parameters, but can this be modified for improved alignment? Or do people just tweak some of the many TopHat parameters? Thanks. holly
One option in TopHat is --bowtie-n, which makes TopHat run Bowtie in "-n mode" as described here. The other thing you can change is the -g option:
-g/--max-multihits <int> [ default: 20 ]
This value is then used for both the -k and -m options in Bowtie:
-k <int> report up to <int> good alignments per read (default: 1)
-m <int> suppress all alignments if > <int> exist (def: no limit)
These two TopHat options are also passed to different Bowtie steps as the number of mismatches allowed by -v or -n:
--initial-read-mismatches <int> [ default: 2 ]
--segment-mismatches <int> [ default: 2 ]
To edit/add options in Bowtie calls in the TopHat python code, look for the function called bowtie() and the following lines in particular:
bowtie_cmd += [params.bowtie_alignment_option, str(num_mismatches),
               "-p", str(params.system_params.num_cpus),
               "-k", str(params.max_hits),
               "-m", str(params.max_hits),
               bwt_idx_prefix]
Then add further lines modeled on the "-k" line, placed before bwt_idx_prefix so that the new option comes ahead of the index name.
E.g., add a line that looks like this:
"--seed", "123",
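To make the edit above concrete, here is a minimal standalone sketch of how such an argument list could be assembled with extra options slotted in before the index prefix. The function name build_bowtie_cmd and the extra_options parameter are hypothetical and are not part of tophat.py; only the flags shown in the excerpt above are taken from it.

```python
def build_bowtie_cmd(alignment_option, num_mismatches, num_cpus, max_hits,
                     bwt_idx_prefix, extra_options=()):
    """Assemble a Bowtie argument list shaped like the tophat.py excerpt.

    Hypothetical helper for illustration only; it is not TopHat's bowtie().
    """
    cmd = ["bowtie",
           alignment_option, str(num_mismatches),
           "-p", str(num_cpus),
           "-k", str(max_hits),   # TopHat's -g feeds both -k ...
           "-m", str(max_hits)]   # ... and -m
    # Extra Bowtie options go before the index prefix, like the "-k" line.
    cmd += list(extra_options)
    cmd.append(bwt_idx_prefix)
    return cmd


cmd = build_bowtie_cmd("-n", 2, 4, 10, "hg19", extra_options=["--seed", "123"])
print(" ".join(cmd))
# prints: bowtie -n 2 -p 4 -k 10 -m 10 --seed 123 hg19
```

Keeping the extra options in a list rather than a single string mirrors how the excerpt builds the command one token at a time.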
Chris
OK, so I've got TopHat to print the Bowtie command it uses now:
So this is a very basic TopHat command:
python ~/src/tophat-1.1.4/src/tophat.py -p4 /scratch2/easih/refs/hg19/bowtie/hg19 test.fsa
And this is the Bowtie command that is run by TopHat:
bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -n 2 -p 4 -k 40 -m 40 /scratch2/easih/refs/hg19/bowtie/hg19 ./tophat_out/tmp/left_kept_reads.fq
Now setting both -m and -g in TopHat leads to these changes:
TopHat:
python ~/src/tophat-1.1.4/src/tophat.py -m1 -g10 -p4 /scratch2/easih/refs/hg19/bowtie/hg19 test.fsa
Bowtie:
bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -n 2 -p 4 -k 10 -m 10 /scratch2/easih/refs/hg19/bowtie/hg19 ./tophat_out/tmp/left_kept_reads.fq
So changing -g to 10 in TopHat changes the values of both -k and -m in Bowtie to 10. But setting -m to 1 in TopHat is not passed to Bowtie in this case (TopHat's -m is probably only passed on to Bowtie when there are spliced alignments).
You can always look in the file tophat_out/logs/run.log to see which Bowtie command was run.
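If you check the log often, a small script can pull out the relevant flags for you. This is only a sketch: it assumes that each Bowtie invocation appears in run.log as a line whose first word ends in "bowtie", which may not hold for every TopHat version. The helper names bowtie_options and bowtie_commands are made up for this example.

```python
import os
import shlex


def bowtie_options(command_line, flags=("-n", "-v", "-k", "-m", "-p")):
    """Return {flag: value} for the selected flags in a Bowtie command line."""
    tokens = shlex.split(command_line)
    found = {}
    for i, tok in enumerate(tokens[:-1]):
        if tok in flags:
            found[tok] = tokens[i + 1]
    return found


def bowtie_commands(log_lines):
    """Yield lines that look like Bowtie invocations (assumed log format)."""
    for line in log_lines:
        stripped = line.strip()
        if stripped.split(" ", 1)[0].endswith("bowtie"):
            yield stripped


# The second Bowtie command quoted earlier in this thread.
example = ("bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq "
           "--max /dev/null -n 2 -p 4 -k 10 -m 10 "
           "/scratch2/easih/refs/hg19/bowtie/hg19 "
           "./tophat_out/tmp/left_kept_reads.fq")
print(bowtie_options(example))

# Only scan the real log if it exists in the current directory.
log_path = "tophat_out/logs/run.log"
if os.path.exists(log_path):
    with open(log_path) as fh:
        for cmd in bowtie_commands(fh):
            print(bowtie_options(cmd))
```

For the quoted command, the flags picked up are -n 2, -p 4, -k 10 and -m 10.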
I think if you want to use Bowtie options like -a, --best, and --strata, you'll have to edit the tophat.py code. One issue is that because -g sets both -k and -m, you lose control over the Bowtie command line that TopHat builds. So if you just want one alignment for a read that aligns multiple times, you might think to set -g to 1 in TopHat so that Bowtie gets -k 1, but any read with two or more alignments will be rejected, because Bowtie also gets -m 1 (suppress reads that align more than once).
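The interaction between -k and -m can be illustrated with a toy model based only on the two option descriptions quoted above (-k reports up to k alignments, -m suppresses everything if more than m exist). The function reported_alignments is hypothetical and does not call Bowtie.

```python
def reported_alignments(n_valid, k=1, m=None):
    """Alignments reported for a read that has n_valid valid alignments.

    Toy model of Bowtie's -k/-m descriptions; m=None means "no limit".
    """
    if m is not None and n_valid > m:
        return 0              # -m: suppress all alignments for this read
    return min(n_valid, k)    # -k: report up to k alignments


for n in (1, 2, 5):
    with_g1 = reported_alignments(n, k=1, m=1)   # what -g 1 produces
    k_only = reported_alignments(n, k=1)         # -k 1 without -m
    print(n, with_g1, k_only)
# prints:
# 1 1 1
# 2 0 1
# 5 0 1
```

In this model a multi-mapping read gets no alignment at all under -g 1, whereas -k 1 on its own would still report one.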
Chris
hi chris, I still need to process your reply, but can you tell me how common it is in your own experience for reads to align more than once? Would this mean that Bowtie would throw away any sequence from duplicated regions (STRs, CNVs, gene duplications)? This seems unlikely to me, but I thought I'd better ask. thanks again, holly
thanks so much for this. I am not getting TopHat to recognize the -k option. I just get "tophat: option -k not recognized". Are you able to get this to work? thanks again. holly
also, do you know if Bowtie's -m to suppress alignments would be confused with TopHat's -m for the max # of splice mismatches in the anchor region? – hmortens 1 min ago
TopHat options and Bowtie options should be separate, so TopHat's -m should not be confused with Bowtie's -m.