Blast Illumina Probe Sequences To Get Their Genomic Coordinates
12.4 years ago
Houkto ▴ 220

Hi all,

The annotation file Illumina provides for the RatRef expression chip is poorly annotated. Since Illumina stopped selling these chips, they have stopped updating the annotation as well. So I am using the REMOAT annotation and the Ensembl annotation to get better annotation when analysing my data. However, we recently obtained NGS data for our rat strains, and we wanted to find out which SNPs from our analysis fall in the regions targeted by the probes. So I pulled the probe coordinates from the REMOAT annotation and checked whether the genomic locations of our SNPs fall within the range of each probe sequence, and I got my results. But I realized that, in both REMOAT and Ensembl, the distance between a probe's start and end coordinates is not always 50bp. So what I am trying to do now is blast the probe sequences against the rat reference genome in an automated way, with output giving the Illumina ID, the probe sequence and the genomic coordinates, using either a web-based program or Perl software. Any suggestions?

blast sequence • 3.8k views
12.4 years ago

Keep in mind that the probe sequences are probably NOT TAKEN FROM THE GENOME. They come from the transcriptome, and a proportion of them will cross exon boundaries, which makes the genomic "length" of the probe larger than 50bp. Aligning to the genome will get you the "correct" answer for probes that are entirely contained in a single exon or UTR, but for the others, aligning to the genome will not get every base aligned correctly. If you want to go ahead, give local blat a try. However, know that what you are trying to do is not as straightforward as it first seems.
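
To make the exon-boundary point concrete, here is a minimal sketch of reading blat's PSL output and flagging probes whose alignment is split into several genomic blocks. It assumes headerless, DNA-vs-DNA PSL with the standard 21 tab-separated columns; the probe ID `ILMN_0000001`, the coordinates and the helper names are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class ProbeHit(NamedTuple):
    probe_id: str
    chrom: str
    strand: str
    probe_len: int
    matches: int
    blocks: List[Tuple[int, int]]  # genomic intervals, 0-based, half-open


def parse_psl_line(line: str) -> ProbeHit:
    """Parse one headerless PSL record (21 tab-separated columns)."""
    f = line.rstrip("\n").split("\t")
    # blockSizes (column 19) and tStarts (column 21) are comma-terminated lists.
    sizes = [int(x) for x in f[18].rstrip(",").split(",")]
    t_starts = [int(x) for x in f[20].rstrip(",").split(",")]
    blocks = [(start, start + size) for start, size in zip(t_starts, sizes)]
    return ProbeHit(f[9], f[13], f[8], int(f[10]), int(f[0]), blocks)


def is_single_block_full_length(hit: ProbeHit) -> bool:
    """True when the whole probe aligns as one ungapped genomic block."""
    if len(hit.blocks) != 1:
        return False
    start, end = hit.blocks[0]
    return end - start == hit.probe_len


# Made-up record: a 50bp probe split 30 + 20 across a 950bp intron.
example = "\t".join([
    "50", "0", "0", "0", "0", "0", "1", "950", "+",
    "ILMN_0000001", "50", "0", "50",
    "chr1", "1000000", "1000", "2000",
    "2", "30,20,", "0,30,", "1000,1980,",
])
hit = parse_psl_line(example)
print(hit.probe_id, hit.chrom, hit.blocks, is_single_block_full_length(hit))
```

For a probe like this one, the span from `tStart` to `tEnd` is 1000bp even though only 50 bases align, which would explain annotation ranges that are not 50bp wide. Keeping the individual blocks rather than the outer span is what lets you look for SNPs under the probe itself.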


Thanks for your input, Sean. I did realize this after I blasted the sequences. However, blasting the sequences will at least give me somewhere to look for a SNP within the sequence itself, instead of using the large range of the probe coordinates. So I am still waiting for an answer pointing me to a systematic way of handling big blast jobs, whether via NGS tools or using NCBI and/or Ensembl. The alternative is to add 50bp to the probe start coordinate and subtract 50bp from the probe end coordinate, and then look for SNPs within these two separate datasets, so that we can eliminate some of the probes that have been implicated in our analysis.
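
The SNP lookup described here could be sketched roughly as follows. This assumes probe blocks are given as 0-based, half-open intervals (as in PSL or BED) and SNP positions are 1-based (as in VCF); the function name and the example IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Block = Tuple[str, int, int]  # (chrom, start, end), 0-based, half-open
Snp = Tuple[str, int]         # (chrom, position), 1-based


def snps_in_probes(
    probe_blocks: Dict[str, List[Block]],
    snps: Iterable[Snp],
) -> Dict[str, List[Snp]]:
    """Return, per probe, the SNPs that fall inside any aligned block."""
    snp_list = list(snps)
    hits: Dict[str, List[Snp]] = {}
    for probe_id, blocks in probe_blocks.items():
        for chrom, start, end in blocks:
            for snp_chrom, pos in snp_list:
                # A 1-based position p lies in the 0-based interval
                # [start, end) exactly when start < p <= end.
                if snp_chrom == chrom and start < pos <= end:
                    hits.setdefault(probe_id, []).append((snp_chrom, pos))
    return hits


# Made-up probes and SNPs, for illustration only.
probes = {
    "ILMN_A": [("chr1", 1000, 1030), ("chr1", 1980, 2000)],
    "ILMN_B": [("chr2", 500, 550)],
}
snps = [("chr1", 1001), ("chr1", 1031), ("chr1", 2000), ("chr2", 500), ("chr2", 550)]
print(snps_in_probes(probes, snps))
```

This nested loop is fine for a few tens of thousands of probes and a modest SNP list. For genome-wide variant calls, an interval index or a tool such as `bedtools intersect` would likely be a better fit.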


I'd suggest blat or gmap for alignment. Each will align your 50bp probes to the genome in minutes.
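
As a rough sketch of how to set this up with blat: write the probes out as FASTA with the Illumina ID as the sequence name, then align that file to the genome. The file names (`rn.2bit`, `probes.fa`, `probes.psl`) and the `-minIdentity=95` threshold are assumptions for illustration, not recommendations, and the command is only built here, not run.

```python
from typing import Iterable, List, Tuple


def probes_to_fasta(probes: Iterable[Tuple[str, str]]) -> str:
    """Format (probe_id, sequence) pairs as FASTA text, one record per probe."""
    return "".join(f">{pid}\n{seq.upper()}\n" for pid, seq in probes)


def blat_command(genome_2bit: str, probes_fa: str, out_psl: str) -> List[str]:
    """Build a blat invocation: blat [options] database query output.psl."""
    # -noHead drops the PSL header so the output is a plain tab-separated table.
    # -minIdentity=95 is an assumed threshold; tune it for your data.
    return ["blat", "-noHead", "-minIdentity=95", genome_2bit, probes_fa, out_psl]


# Made-up probe for illustration.
fasta_text = probes_to_fasta([("ILMN_0000001", "acgtacgtac")])
print(fasta_text, end="")
print(" ".join(blat_command("rn.2bit", "probes.fa", "probes.psl")))
# To actually run it (requires blat on PATH and the files to exist):
#   import subprocess; subprocess.run(blat_command(...), check=True)
```

Because the probe ID becomes the query name, each line of the PSL output carries the Illumina ID next to its genomic coordinates, which is the table asked for in the question.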
