Changing Bowtie Settings In Tophat?
12.6 years ago
Holly ▴ 30

Is anyone familiar with how TopHat runs Bowtie? I am assuming it uses the default parameters, but can these be modified for improved alignment? Or do people just tweak some of the many TopHat parameters? Thanks. Holly

tophat alignment bowtie • 5.8k views

Thanks so much for this. I am not getting TopHat to recognize the -k option; I just get "tophat: option -k not recognized". Are you able to get this to work? Thanks again.

Also, do you know whether Bowtie's -m (suppress alignments) would be confused with TopHat's -m (max # splice mismatches in anchor region)? – hmortens 1 min ago

TopHat options and Bowtie options are separate: a flag given to TopHat is interpreted by TopHat, so TopHat's -m is not the same as Bowtie's -m.

12.6 years ago
Chris Penkett ▴ 490

One option in TopHat is --bowtie-n, which makes TopHat run Bowtie in "-n mode" as described here. The other thing you can change is the -g option:

-g/--max-multihits             <int>       [ default: 20               ]

This value is then used for both the -k and -m options in Bowtie:

-k <int>           report up to <int> good alignments per read (default: 1)
-m <int>           suppress all alignments if > <int> exist (def: no limit)

Two further TopHat options are passed to different Bowtie steps as the number of mismatches allowed by -v or -n:

--initial-read-mismatches      <int>       [ default: 2                ]
--segment-mismatches           <int>       [ default: 2                ]

To edit or add options in the Bowtie calls in the TopHat Python code, look for the function called bowtie() and the following lines in particular:

    bowtie_cmd += [params.bowtie_alignment_option, str(num_mismatches),
                     "-p", str(params.system_params.num_cpus),
                     "-k", str(params.max_hits),
                     "-m", str(params.max_hits),
                     bwt_idx_prefix]

Then add further entries in the same style as the "-k" line, e.g. a line that looks like this:

                     "--seed", "123",
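
To make that kind of edit concrete, here is a minimal sketch of a Bowtie argument list assembled in the same style as the tophat.py snippet. It is not TopHat's actual code: the function name build_bowtie_cmd, its parameters, and the extra_options argument are hypothetical, introduced only for illustration.

```python
def build_bowtie_cmd(alignment_option, num_mismatches, num_cpus, max_hits,
                     bwt_idx_prefix, extra_options=None):
    """Assemble a Bowtie argument list (hypothetical helper, not TopHat code)."""
    cmd = ["bowtie"]
    cmd += [alignment_option, str(num_mismatches),
            "-p", str(num_cpus),
            "-k", str(max_hits),   # same value for -k and -m, as in the snippet
            "-m", str(max_hits)]
    # Extra options are placed before the index prefix, like the "--seed" line.
    cmd += list(extra_options or [])
    cmd.append(bwt_idx_prefix)
    return cmd


cmd = build_bowtie_cmd("-n", 2, 4, 40, "hg19", extra_options=["--seed", "123"])
print(" ".join(cmd))
```

Each option and its value is a separate list element, which is why the added line reads "--seed", "123", rather than a single string.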

Chris

12.6 years ago
Chris Penkett ▴ 490

OK, I've now got TopHat to print the Bowtie command it uses.

This is a very basic TopHat command:

python ~/src/tophat-1.1.4/src/tophat.py -p4 /scratch2/easih/refs/hg19/bowtie/hg19 test.fsa

And this is the Bowtie command that is run by TopHat:

bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -n 2 -p 4 -k 40 -m 40 /scratch2/easih/refs/hg19/bowtie/hg19 ./tophat_out/tmp/left_kept_reads.fq

Now setting both -m and -g in TopHat leads to these changes:

TopHat:

python ~/src/tophat-1.1.4/src/tophat.py -m1 -g10 -p4 /scratch2/easih/refs/hg19/bowtie/hg19 test.fsa

Bowtie:

bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -n 2 -p 4 -k 10 -m 10 /scratch2/easih/refs/hg19/bowtie/hg19 ./tophat_out/tmp/left_kept_reads.fq

So changing -g to 10 in TopHat changes the values of both -k and -m in Bowtie to 10. But setting -m to 1 in TopHat is not passed on to Bowtie in this case (TopHat's -m probably only comes into play when you have spliced alignments).

You can always look in the file tophat_out/logs/run.log to see which Bowtie command was run.
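
As a rough sketch of how that check could be scripted, the following pulls Bowtie command lines out of log text and reads back a given option. It assumes the log holds one command per line, which may not match the real layout of run.log; the helper names find_bowtie_commands and option_value are made up for this example, and the sample text simply reuses the Bowtie command shown above.

```python
import os
import shlex


def find_bowtie_commands(log_text):
    """Return lines whose first token is a bowtie executable (assumed layout)."""
    commands = []
    for line in log_text.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if tokens and os.path.basename(tokens[0]) == "bowtie":
            commands.append(line.strip())
    return commands


def option_value(command, flag):
    """Return the token following flag in a command line, or None if absent."""
    tokens = shlex.split(command)
    if flag in tokens[:-1]:
        return tokens[tokens.index(flag) + 1]
    return None


# Sample text for illustration only; in practice read tophat_out/logs/run.log.
sample_log = "\n".join([
    "some unrelated log line",
    "bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null "
    "-n 2 -p 4 -k 10 -m 10 /scratch2/easih/refs/hg19/bowtie/hg19 "
    "./tophat_out/tmp/left_kept_reads.fq",
])

commands = find_bowtie_commands(sample_log)
print(option_value(commands[0], "-k"), option_value(commands[0], "-m"))
```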

I think that if you want to use Bowtie options like -a, --best, and --strata, you'll have to edit the tophat.py code. One problem is that because -g sets both -k and -m, you lose independent control over the Bowtie command line. For example, if you just want one alignment for a read that aligns multiple times, you might think to set -g to 1 in TopHat so that Bowtie gets -k 1. But Bowtie then also gets -m 1, which suppresses all alignments for any read that aligns more than once, so a read with two or more alignments is rejected altogether.
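
The interaction between -k and -m can be illustrated with a small model based only on the two help lines quoted earlier ("report up to <int> good alignments" and "suppress all alignments if > <int> exist"). This is a simplified sketch, not Bowtie's implementation; the function reported_alignments is hypothetical and ignores options such as --best and --strata.

```python
def reported_alignments(n_hits, k, m=None):
    """Alignments reported for a read that has n_hits valid alignments.

    k: report at most k alignments (Bowtie -k).
    m: if more than m alignments exist, report none (Bowtie -m); None = no limit.
    """
    if m is not None and n_hits > m:
        return 0
    return min(n_hits, k)


# TopHat -g 1 gives Bowtie both -k 1 and -m 1.
for n_hits in (1, 2, 5):
    with_m = reported_alignments(n_hits, k=1, m=1)
    without_m = reported_alignments(n_hits, k=1)
    print(n_hits, with_m, without_m)
```

With -k 1 alone, a multi-mapping read still gets one alignment reported; adding -m 1 drops it entirely.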

Chris

Hi Chris, I still need to process your reply, but can you tell me how common it is, in your own experience, for reads to align more than once? Would this mean that Bowtie would throw away any sequence from duplicated regions (STRs, CNVs, gene duplications)? This seems unlikely to me, but I thought I had better ask. Thanks again, Holly

In RNA-seq, it seems to be quite common for short reads to map more than once. With the default -g of 40, only highly repeated sequences are thrown away, so some people use a -g of 10.

12.6 years ago
Hmortens • 0

Thanks so much for this. I am not getting TopHat to recognize the -k option; I just get "tophat: option -k not recognized". Are you able to get this to work? Thanks again. Holly

Also, do you know whether Bowtie's -m (suppress alignments) would be confused with TopHat's -m (max # splice mismatches in anchor region)?

TopHat has no -k option of its own: -g in TopHat sets both -k and -m for Bowtie. So set -g in TopHat and its value will be passed to Bowtie's -k and -m options.

I had a quick play from home, but am not sure -g really works; I can check more easily tomorrow at work.
