Question: Map bed file probes against genome
gravatar for olikidrod
2.1 years ago by
olikidrod0 wrote:

I have a .bed file containing probes for a commercial microarray and would like to map these against a reference genome in NCBI, for which the probes were not originally designed. I'd like to see how many map with near perfect identity; if it's the majority, we'd go ahead and use it. Can anyone suggest a good strategy for doing this?

alignment • 762 views
ADD COMMENTlink modified 2.1 years ago by h.mon30k • written 2.1 years ago by olikidrod0
gravatar for h.mon
2.1 years ago by
h.mon30k wrote:

Use bedtools getfasta to extract the sequences of the probes from the genome it was designed (or contact the microarray manufacturer and ask for the probes in fasta format).

bedtools getfasta -fi oldGenome.fasta -bed probes.bed > probes.fasta

Then use bowtie or bowtie2 with --end-to-end --very-fast to map the probes into the new genome. You can also use bbmap to map, the option mhist=mhist.txt idhist=idhist.txt would be helpful in checking how good the probes match the "new" genome.

bowtie2 -f -a --very-sensitive -x newGenome -U probes.fasta > probes.sam

You can also get a quick and dirty histogram by looking at the CIGAR string from the alignment file:

samtools view probes.bam | cut -f 6 | sort -n | uniq -c

   1 144M
   3 145M
   1 145M5S
   1 145M6S
   2 146M
   2 146M5S
   5 147M
   1 147M4S
  51 148M
 160 149M
 695 150M
1480 151M
ADD COMMENTlink written 2.1 years ago by h.mon30k

Brilliant -- thank you!

ADD REPLYlink written 2.1 years ago by olikidrod0

Hello olikidrod,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.


ADD REPLYlink written 2.1 years ago by finswimmer13k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1061 users visited in the last hour