Question: Using Bowtie To Only Search In Restricted Regions
6.9 years ago by
Germany
Click downvote (670) wrote:

I want to search the rat genome for some very short sequences, about 6-8 nucleotides in length. I'm only interested in hits in promoter or RepeatMasker regions. How do I restrict my search to these regions? I know their locations from the UCSC Table Browser.

Thanks.

bowtie • 2.1k views
modified 6.9 years ago by Irsan (7.0k) • written 6.9 years ago by Click downvote (670)

How many short sequences do you have? Do you need to capture mismatches, or are exact matches to the genome enough?

written 6.9 years ago by Sean Davis (25k)
6.9 years ago by
Irsan (7.0k)
Amsterdam
Irsan (7.0k) wrote:

Make a FASTA file containing only the parts of the genome you want Bowtie to search, and build an index from that FASTA file. Then align your 6-8 nucleotide sequences against this custom index. See the Bowtie documentation for how to do each step.
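As a rough sketch of the first step, assuming the genome is already loaded as a dict of chromosome name to sequence and the regions come from a BED-style export (0-based, end-exclusive), the regions of interest could be cut out as below. The function name `extract_regions`, the toy data, and the `chrom:start-end` header convention are illustrative choices, not anything Bowtie prescribes. The resulting FASTA text would be written to a file and passed to `bowtie-build`.

```python
def extract_regions(genome, regions):
    """Return FASTA text with one record per (chrom, start, end) region.

    genome  : dict mapping chromosome name -> sequence string
    regions : iterable of (chrom, start, end), BED-style (0-based, end-exclusive)
    """
    records = []
    for chrom, start, end in regions:
        seq = genome[chrom][start:end]
        # Keep the original coordinates in the header so hits can be traced back.
        records.append(f">{chrom}:{start}-{end}\n{seq}")
    return "\n".join(records) + "\n"


# Toy data standing in for the rat genome and a list of regions (assumption).
toy_genome = {"chr1": "ACAGTACA"}
toy_regions = [("chr1", 0, 3), ("chr1", 6, 8)]
fasta_text = extract_regions(toy_genome, toy_regions)
```

Keeping each region as its own record, with its coordinates in the header, means a reported hit can be traced back to the genome by adding the region start to the position within the record.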

modified 6.9 years ago • written 6.9 years ago by Irsan (7.0k)

Thank you for your answer.

My only concern is the following:

Let us say I have a genome

ACAGTACA

and I snip away the GTA part before building an index. Won't the index now tell me that the last nucleotides in the original string begin at the wrong position?

That is, the original string and its positions are

ACAGTACA 12345678

When I snip away GTA and index the result, I get

ACACA 12345

but of course, what I want is

ACACA 12378
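The bookkeeping described here can be sketched in a few lines, assuming the kept segments are recorded as 1-based inclusive (start, end) pairs in the original sequence. The helper `to_original` is a hypothetical name, written only to illustrate the mapping.

```python
def to_original(pos, kept_segments):
    """Map a 1-based position in the spliced sequence to the original sequence.

    kept_segments: list of (start, end) pairs, 1-based inclusive, in order,
    describing which parts of the original sequence were kept.
    """
    offset = 0  # number of spliced bases that precede the current segment
    for start, end in kept_segments:
        length = end - start + 1
        if pos <= offset + length:
            return start + (pos - offset - 1)
        offset += length
    raise ValueError("position lies beyond the spliced sequence")


# ACAGTACA with GTA (positions 4-6) removed keeps positions 1-3 and 7-8.
kept = [(1, 3), (7, 8)]
mapped = [to_original(p, kept) for p in range(1, 6)]
```

Here `mapped` is `[1, 2, 3, 7, 8]`, matching the desired numbering. Note that a hit spanning the junction between two kept segments (for example CAC in ACACA) does not exist in the original sequence, so such hits would need to be filtered out.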

written 6.9 years ago by Click downvote (670)

You can also convert the bases you are not interested in to Ns, which keeps the original coordinates intact. In this case, convert GTA to NNN.
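A minimal sketch of this masking idea, applied to the toy sequence from the comment above. The helpers `mask_outside` and `find_all` are illustrative names, and plain exact string search stands in for the aligner here; how Bowtie itself treats Ns in the reference should be checked against its documentation.

```python
def mask_outside(seq, keep):
    """Replace every base outside the kept intervals with 'N'.

    keep: list of (start, end) pairs, 0-based, end-exclusive.
    """
    masked = ["N"] * len(seq)
    for start, end in keep:
        masked[start:end] = seq[start:end]
    return "".join(masked)


def find_all(seq, motif):
    """Return 1-based start positions of all exact (possibly overlapping) matches."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i + 1)
        i = seq.find(motif, i + 1)
    return hits


# Keep positions 1-3 and 7-8 of the toy genome, mask GTA.
masked = mask_outside("ACAGTACA", [(0, 3), (6, 8)])
hits = find_all(masked, "CA")
```

Because the masked sequence has the same length as the original, the reported positions (2 and 7 in this toy case) are already in original coordinates, and no match can run across the masked stretch.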

written 6.9 years ago by Irsan (7.0k)