FASTA Sequence Alignment To Find Out Gene Biotype, Particularly The Non-Coding Types
10.4 years ago
hbr721 • 0

Thanks for looking into this post. I have a file that has FASTA sequences in this format:

>A_42_P574859    
AAGGAAGCGAGACACCAACTTCCCCATATGCCTCTTCTGCTGTAAATGCTGTAAGAATTC
>A_43_P11616    
TCTGTGAAATCCTCAGTGTTCAATCCAGACTCAGTAGTATATTACAGTTTTCTGTAAGAG
>A_42_P681722    
TTAGTTTAAGACTGTATGTACTGTATCATAGTAGGACCTATGTATCTCATCGCTGTGATG
>A_44_P371339    
CCCTAGTTCATATCTTCAAACAAGAGATAAAAGACTCATATAAAATAGTCCTTCCTACCC
>A_44_P164990    
ACGAATGCATTGTGAAGACCATTCCCAATGAACTCTATTGAATGTCTAATACACAGGTAT

I am trying to BLAST these sequences against Mus_musculus.GRCm38.74.ncrna.fa to find the gene biotype for each one, i.e. whether it is non-coding or not, and, if it is non-coding, whether it is a pseudogene, IG_V, IG_D, etc. Is it possible to find the gene biotype this way? I tried using NCBI BLAST, but it keeps giving me the error "Message ID#32 Error: Query contains no data: Query contains no sequence data". Please let me know if there is an alternative way to do this. Thanks.
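For the biotype lookup itself, one option is to read it from the FASTA header of whichever ncRNA sequence a probe hits. Below is a minimal Python sketch, assuming the headers carry space-separated `key:value` fields and that one of them is named `gene_biotype`. Both are assumptions about the file, so check a few real headers first. The example header and the function name `parse_header_fields` are made up for illustration.

```python
def parse_header_fields(header):
    """Split a FASTA header into its sequence ID and a dict of key:value fields."""
    tokens = header.lstrip(">").split()
    seq_id = tokens[0] if tokens else ""
    fields = {}
    for token in tokens[1:]:
        # Split on the first colon only, so values such as
        # "GRCm38:1:100:200:1" stay intact.
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return seq_id, fields


# Hypothetical header, not copied from the real file.
example = ">TRANSCRIPT_1 ncrna chromosome:GRCm38:1:100:200:1 gene:GENE_1 gene_biotype:snRNA"
seq_id, fields = parse_header_fields(example)
print(seq_id, fields.get("gene_biotype", "unknown"))
```

Tokens without a colon are skipped, and a missing `gene_biotype` field falls back to "unknown" instead of raising an error.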

Are these microarray IDs and probe sequences (say, from an Agilent Rat array where the first example you posted is Hepc)? If so, this information might already be in one of the annotations.

Thanks for your reply. Yes, you are correct, they are indeed microarray IDs and probe sequences. May I know in which annotation I can find the gene biotype information? Moreover, I have thousands of these gene lists, and I need to find the gene biotype information on both the sense and antisense strands. Hence I was trying to do this in two steps: first, align these sequences to the mouse FASTA file and find the gene biotype information on the same strand; next, reverse complement these sequences, align them to the mouse FASTA file, and find the gene biotype on the opposite strand. Is it possible to do this, especially considering that I am more interested in finding the non-coding genes on both the sense and antisense strands?
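The reverse-complement step described above can be sketched in plain Python. This is only an illustration under a few assumptions: the probes are DNA using the letters A, C, G, T and N, the `_rc` suffix added to each ID is a hypothetical naming choice, and the helper names `read_fasta` and `reverse_complement` are made up here. The one record shown is taken from the question.

```python
# Translation table for DNA bases (upper and lower case); N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()  # also drops trailing spaces after the ID
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


probes = [
    ">A_42_P574859    ",
    "AAGGAAGCGAGACACCAACTTCCCCATATGCCTCTTCTGCTGTAAATGCTGTAAGAATTC",
]

# Print each probe as a new FASTA record on the opposite strand.
for name, seq in read_fasta(probes):
    print(f">{name}_rc")
    print(reverse_complement(seq))
```

Redirecting the printed records to a file gives a second FASTA file that can be aligned in the same way as the original one.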

It's a Whole Rat Genome Microarray 4x44K v3 array. Go to the Agilent site and see if you can download the annotation for the probes. The annotation file should have the biotype information for you.

Thanks for your reply. Unfortunately, I could not locate the corresponding annotation for the probes. Would you be able to point me to the link, please?

Thanks, I finally found the annotation file, but unfortunately it has the biotype information only for the gene of interest and not for the opposite strand. Thanks though, I did learn a lot.
