Aligning FASTA Sequences To Find Gene Biotype, Particularly Non-Coding Types
Thanks for looking into this post. I have a file that has fasta sequences in this format


and I am trying to BLAST it against Mus_musculus.GRCm38.74.ncrna.fa to find the gene biotype of each sequence, i.e. whether it is non-coding and, if so, whether it is a pseudogene, IG_V, IG_D, etc. Is it possible to find the gene biotype this way? I tried NCBI BLAST, but it keeps giving me the error "Message ID#32 Error: Query contains no data: Query contains no sequence data". Please let me know if there is an alternative way to do this. Thanks.
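For what it's worth, here is a minimal Python sketch of two checks that might help with the question above. It lists query records that have a header but no sequence, which is one possible (not confirmed) cause of the "Query contains no sequence data" message, and it pulls a biotype out of the reference FASTA headers. It assumes the headers in the ncRNA file carry a `gene_biotype:<value>` field; check a few header lines in your copy first. The function names and the example records are invented for illustration.

```python
import re
from typing import Dict, Iterator, List, Optional, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
        # sequence lines that appear before any header are ignored
    if header is not None:
        yield header, "".join(chunks)


def empty_records(text: str) -> List[str]:
    """Return the headers of records that have no sequence at all."""
    return [h for h, seq in read_fasta(text) if not seq]


def biotype_from_header(header: str) -> Optional[str]:
    """Extract the value of a 'gene_biotype:' field (assumed header layout)."""
    match = re.search(r"\bgene_biotype:(\S+)", header)
    return match.group(1) if match else None


def biotypes_by_id(text: str) -> Dict[str, Optional[str]]:
    """Map the first word of each header (the sequence ID) to its biotype."""
    result: Dict[str, Optional[str]] = {}
    for header, _ in read_fasta(text):
        parts = header.split()
        result[parts[0] if parts else ""] = biotype_from_header(header)
    return result


# Hypothetical records, not taken from the real Ensembl file.
EXAMPLE = (
    ">TX1 ncrna:known gene:G1 gene_biotype:snRNA transcript_biotype:snRNA\n"
    "ACGT\nACGT\n"
    ">TX2 ncrna:known gene:G2 gene_biotype:pseudogene\n"
    "GGCC\n"
    ">TX3 header_without_biotype\n"
)

if __name__ == "__main__":
    print(biotypes_by_id(EXAMPLE))
    print("records with no sequence:", empty_records(EXAMPLE))
```

With a lookup like this, the subject ID of each BLAST hit could be mapped back to a biotype after the search, rather than reading it off the alignment report by hand.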

Are these microarray IDs and probe sequences (say, from an Agilent Rat array where the first example you posted is Hepc)? If so, this information might already be in one of the annotations.

Thanks for your reply. Yes, you are correct, they are indeed microarray IDs and probe sequences. May I know in which annotation I can find the gene biotype information? Moreover, I have thousands of these gene lists, and I need the gene biotype information for both the sense and antisense strands, so I was planning two steps. First, align these sequences to the mouse FASTA file and find the gene biotype on the same strand. Second, reverse complement the sequences, align them to the mouse FASTA file again, and find the gene biotype on the opposite strand. Is it possible to do this, especially considering that I am mainly interested in finding the non-coding genes on both the sense and antisense strands?
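The reverse-complement step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the probes are plain DNA (A, C, G, T, N), and the `_rc` ID suffix and the function names are made up here, not part of any tool.

```python
from typing import Iterable, Iterator, Tuple

# Translation table for DNA bases; any other character is left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_antisense(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """For each (id, seq), yield the original and a reverse-complemented copy."""
    for seq_id, seq in records:
        yield seq_id, seq
        yield seq_id + "_rc", reverse_complement(seq)  # "_rc" suffix is arbitrary


def to_fasta(records: Iterable[Tuple[str, str]]) -> str:
    """Format (id, seq) pairs as FASTA text, one sequence per line."""
    return "".join(f">{seq_id}\n{seq}\n" for seq_id, seq in records)


if __name__ == "__main__":
    probes = [("probe_1", "ATGCCN")]  # hypothetical probe, for illustration only
    print(to_fasta(with_antisense(probes)), end="")
```

Note that a nucleotide BLAST search normally checks both strands of the query and reports which strand matched, so the strand of each hit may already tell you sense versus antisense without a second run; it is worth checking your output before doubling the query set.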

It's a Whole Rat Genome Microarray 4x44K v3 array. Go to the Agilent site and see if you can download the probe annotation. The annotation file should have the biotype information for you.

Thanks for your reply. Unfortunately, I could not locate the corresponding probe annotation. Would you be able to point me to the link, please?

Thanks - I finally found the annotation file, but unfortunately it has the biotype information only for the gene of interest and not for the opposite strand. Thanks though - I did learn a lot.

