Hello Biostars,
After examining the human GENCODE v22 release, I noticed something peculiar: SHANK3 (ENSG00000251322) is listed as lncRNA with the biotype processed transcript. However, SHANK3 is a well-known protein-coding gene and even has an entry in the PDB: http://www.rcsb.org/pdb/protein/Q9BYB0?evtc=Suggest&evta=ProteinFeature%20View&evtl=OtherOptions
At first I thought this was simply an error in the GENCODE file, but even Ensembl has annotated SHANK3 as a non-coding transcript.
In addition, if you compare the transcript table from the above link, which uses the GRCh38 assembly, to the SHANK3 record for the previous assembly (GRCh37) at http://grch37.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000251322;r=22:51112843-51171726
you can see that the exact same transcripts, as identified by their Ensembl transcript IDs, switch from protein-coding in GRCh37 to non-protein-coding in GRCh38. How can this be? What happened to the proteins that were already found? NCBI still lists SHANK3 as a protein-coding gene.
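In case anyone wants to reproduce what I am seeing, here is a minimal sketch of how the biotypes can be pulled out of a GENCODE GTF for one gene. It assumes the usual nine tab-separated GTF columns and that the attribute keys are `gene_type` and `transcript_type`, as in the GENCODE files (Ensembl's own GTFs use different key names, so adjust if needed). The function names are my own, not part of any library.

```python
import re

# Matches quoted GTF attributes such as: gene_type "processed_transcript";
# Unquoted numeric attributes (e.g. level 2;) are deliberately skipped.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict (a repeated key keeps its last value)."""
    return dict(ATTR_RE.findall(field))


def biotypes_for_gene(lines, gene_id):
    """Collect (feature, transcript_id, gene_type, transcript_type) for one gene.

    gene_id is given without the version suffix, e.g. "ENSG00000251322".
    Assumes GENCODE-style attribute keys; missing values are shown as "-" or "?".
    """
    seen = set()
    for line in lines:
        if line.startswith("#"):
            continue  # header / comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] not in ("gene", "transcript"):
            continue
        attrs = parse_attributes(cols[8])
        # Strip the ".N" version suffix before comparing IDs.
        if attrs.get("gene_id", "").split(".")[0] != gene_id:
            continue
        seen.add((
            cols[2],
            attrs.get("transcript_id", "-"),
            attrs.get("gene_type", "?"),
            attrs.get("transcript_type", "?"),
        ))
    return sorted(seen)
```

Something like `biotypes_for_gene(gzip.open(path, "rt"), "ENSG00000251322")` should then list the gene and transcript biotypes, where `path` is wherever the downloaded annotation file lives. Running it on both the v22 file and an older GRCh37-based release would show the switch side by side.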
I'm hoping someone can help me out if I am missing or overlooking something.
Thanks,
Bgood