I am processing a VCF file from the ExAC browser (release 1.0), which uses hg19 as the human reference genome. I want to identify only those mutations that occur at mRNA positions. Is there any field/column in the VCF that can give me that information?
For now, I tried to use the "BIOTYPE" field from the VCF. I selected only those positions for which "BIOTYPE=protein_coding". However, when I cross-check these positions against exon start/stop positions from the UCSC browser, some of them are marked as non-coding RNA. How is this possible? Does "BIOTYPE=protein_coding" not give me the right information about the RNA type?
Example: position 138593 on chr1 is marked with BIOTYPE=protein_coding in the VCF. This position is part of the LOC729737 gene. When I look up this position in the UCSC file, it is marked as non-coding RNA (as per the UCSC kgXref table).
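For reference, the BIOTYPE check described above can be sketched roughly as follows. This is only a minimal illustration, assuming BIOTYPE is one of the pipe-delimited sub-fields of a VEP-style CSQ annotation in the INFO column, with one comma-separated entry per annotated transcript. The header line, INFO string, and the names `GENE1`, `TX1`, `TX2` are made up for the example and are much shorter than the real ExAC ones; `csq_fields` and `transcript_biotypes` are hypothetical helper names.

```python
import re


def csq_fields(header_line):
    """Extract the pipe-delimited sub-field names from a CSQ header line."""
    match = re.search(r'Format: ([^">]+)', header_line)
    if match is None:
        raise ValueError("no 'Format:' description found in CSQ header")
    return match.group(1).strip().split("|")


def transcript_biotypes(info, fields):
    """Return (Feature, BIOTYPE) pairs, one per transcript annotation."""
    csq = None
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            csq = entry[len("CSQ="):]
            break
    if csq is None:
        return []
    pairs = []
    for annotation in csq.split(","):
        # Pair each sub-field name with its value for this transcript.
        values = dict(zip(fields, annotation.split("|")))
        pairs.append((values.get("Feature", ""), values.get("BIOTYPE", "")))
    return pairs


# Hypothetical, shortened header and INFO column (not real ExAC data).
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'type as predicted by VEP. Format: Allele|Gene|Feature|BIOTYPE">')
info = "AC=1;CSQ=T|GENE1|TX1|protein_coding,T|GENE1|TX2|lincRNA"

fields = csq_fields(header)
pairs = transcript_biotypes(info, fields)

# A position can carry several transcript annotations with different biotypes.
any_coding = any(biotype == "protein_coding" for _, biotype in pairs)
all_coding = all(biotype == "protein_coding" for _, biotype in pairs)
print(pairs, any_coding, all_coding)
```

Under these assumptions, a simple text match on "BIOTYPE=protein_coding" would behave like the `any_coding` test: it keeps a position as soon as one annotated transcript is protein coding, even if other transcripts at the same position have a different biotype.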