The reference protein sequences used in the cBioPortal database
5.3 years ago

When I downloaded point mutation data for a set of genes from cBioPortal, I got several .tsv files. In each file, an amino acid change in a gene is written like "R435Y". I am confused about which protein sequence was used as the reference, since each gene has many isoforms. Was the longest isoform selected as the reference? I'm not clear about the procedure that TCGA or cBioPortal uses to analyze the various kinds of mutation data, so I would appreciate any help in understanding how this works. Many thanks.
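For context, a change such as "R435Y" means that the reference residue R at 1-based position 435 is replaced by Y, so one way to narrow down the reference isoform empirically is to check which candidate sequences actually carry the stated reference residue at each reported position. Below is a minimal Python sketch of that check. The function names and the toy sequences are hypothetical and only illustrate the idea; it handles simple missense notation only and says nothing about how cBioPortal or TCGA actually choose the transcript.

```python
import re

# Simple missense notation such as "R435Y" (optionally "p.R435Y"):
# reference residue, 1-based position, variant residue.
MISSENSE = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def parse_missense(change):
    """Return (ref, position, alt), or None if the string is not a simple missense change."""
    m = MISSENSE.match(change)
    if m is None:
        return None
    ref, pos, alt = m.groups()
    return ref, int(pos), alt


def consistent_isoforms(changes, isoforms):
    """Return the names of isoforms whose sequence has the reference residue
    at the stated position for every parsable change.

    `isoforms` maps a (hypothetical) isoform name to its protein sequence.
    """
    parsed = [p for p in map(parse_missense, changes) if p is not None]
    matches = []
    for name, seq in isoforms.items():
        if all(1 <= pos <= len(seq) and seq[pos - 1] == ref for ref, pos, _ in parsed):
            matches.append(name)
    return matches


# Toy example with made-up sequences (not real proteins).
toy_isoforms = {"isoform_1": "MKRTL", "isoform_2": "MKTL"}
print(consistent_isoforms(["R3Y", "L5P"], toy_isoforms))
```

With real data, the candidate sequences would come from a protein database, and the more mutations checked per gene, the fewer isoforms remain consistent. A match is only evidence, not proof, since several isoforms can share the same residues at the tested positions.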

TCGA cBioportal • 1.3k views
