Question: Fasta Sequences Alignment And Find Out Gene Biotype Particularly The Non Coding Type
hbr721 wrote, 5.9 years ago:

Thanks for looking into this post. I have a file of FASTA sequences in this format:

>A_42_P574859    
AAGGAAGCGAGACACCAACTTCCCCATATGCCTCTTCTGCTGTAAATGCTGTAAGAATTC
>A_43_P11616    
TCTGTGAAATCCTCAGTGTTCAATCCAGACTCAGTAGTATATTACAGTTTTCTGTAAGAG
>A_42_P681722    
TTAGTTTAAGACTGTATGTACTGTATCATAGTAGGACCTATGTATCTCATCGCTGTGATG
>A_44_P371339    
CCCTAGTTCATATCTTCAAACAAGAGATAAAAGACTCATATAAAATAGTCCTTCCTACCC
>A_44_P164990    
ACGAATGCATTGTGAAGACCATTCCCAATGAACTCTATTGAATGTCTAATACACAGGTAT

I am trying to BLAST these sequences against Mus_musculus.GRCm38.74.ncrna.fa to find the gene biotype of each hit, i.e. whether the gene is non-coding and, if so, whether it is a pseudogene, IG_V, IG_D, etc. Is it possible to find the gene biotype this way? I tried NCBI BLAST, but it keeps giving me the error "Message ID#32 Error: Query contains no data: Query contains no sequence data". Please let me know if there is an alternative way to do this. Thanks

modified 5.9 years ago by Pierre Lindenbaum • written 5.9 years ago by hbr721
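One way to get from a BLAST hit to a biotype is to read it from the header of the matching ncRNA record. The sketch below assumes that each header in the ncRNA FASTA file carries a `gene_biotype:<value>` field and that the BLAST subject ID equals the first word of the header; both should be checked against the actual file. The function names and the example headers are made up for illustration.

```python
import re

# Assumption: headers contain "gene_biotype:<value>"; verify this in your file.
BIOTYPE_RE = re.compile(r"gene_biotype:(\S+)")

def biotypes_from_fasta_headers(lines):
    """Map each sequence ID (first word of the header) to its gene biotype."""
    biotypes = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # header with no ID
        match = BIOTYPE_RE.search(line)
        biotypes[fields[0]] = match.group(1) if match else "unknown"
    return biotypes

def annotate_hits(hits, biotypes):
    """Attach a biotype to (query_id, subject_id) pairs taken from BLAST output."""
    return [(query, subject, biotypes.get(subject, "unknown"))
            for query, subject in hits]

# Made-up placeholder headers, not real Ensembl records.
example_lines = [
    ">TRANSCRIPT_A ncrna:example gene:GENE_A gene_biotype:snRNA",
    "ACGTACGT",
    ">TRANSCRIPT_B ncrna:example gene:GENE_B gene_biotype:pseudogene",
    "TTGACCAA",
]
biotypes = biotypes_from_fasta_headers(example_lines)
annotated = annotate_hits([("A_42_P574859", "TRANSCRIPT_B")], biotypes)
```

With a real file, `lines` would be the open file handle, and `hits` would come from the query and subject columns of a tabular BLAST report.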

Are these microarray IDs and probe sequences (say, from an Agilent Rat array where the first example you posted is Hepc)? If so, this information might already be in one of the annotations.

modified 5.9 years ago • written 5.9 years ago by Devon Ryan

Thanks for your reply. Yes, you are correct, they are indeed microarray IDs and probe sequences. May I know in which annotation I can find the gene biotype information? Moreover, I have thousands of these gene lists, and I need the gene biotype on both the sense and antisense strands, so I was planning two steps. First, align these sequences to the mouse FASTA file and find the gene biotype on the same strand. Second, reverse complement the sequences, align them to the mouse FASTA file again, and find the gene biotype on the opposite strand. Is it possible to do this, especially considering that I am mainly interested in finding the non-coding genes on both the sense and antisense strands?

written 5.9 years ago by hbr721
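The reverse-complement step described above can be sketched in plain Python. This is a minimal illustration, not a tested pipeline: the function names and the `_rc` suffix are invented here, and the reader simply strips the trailing whitespace that the probe headers in the question carry.

```python
# Complement table for DNA, including N for unknown bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def read_fasta(lines):
    """Yield (probe_id, sequence) pairs from FASTA-formatted lines."""
    probe_id, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if probe_id is not None:
                yield probe_id, "".join(chunks)
            probe_id, chunks = line[1:].strip(), []
        elif probe_id is not None:
            chunks.append(line)
    if probe_id is not None:
        yield probe_id, "".join(chunks)

def with_antisense(records, suffix="_rc"):
    """Yield each record followed by its reverse complement under a suffixed ID."""
    for probe_id, seq in records:
        yield probe_id, seq
        yield probe_id + suffix, reverse_complement(seq)

example = [">A_42_P574859    ", "AAGGAAGC"]
records = list(with_antisense(read_fasta(example)))
```

Note that `blastn` searches both strands of the query by default, so it may be enough to run the original probes once and read the strand of each hit from the subject coordinates.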

It's a Whole Rat Genome Microarray 4x44K v3 array. Go to the Agilent site and see if you can download the probe annotation. The annotation file should have the biotype information for you.

written 5.9 years ago by Ashutosh Pandey

Thanks for your reply. Unfortunately, I could not locate the corresponding probe annotation. Would you be able to point me to the link, please?

modified 5.9 years ago • written 5.9 years ago by hbr721

Thanks - I finally found the annotation file, but unfortunately it has the biotype information only for the gene of interest and not for the opposite strand. Thanks though - I did learn a lot.

written 5.9 years ago by hbr721