Difference between "predicted coding region" and "Hypothetical Protein"
9.8 years ago

What is the difference between "predicted coding region" and "Hypothetical Protein" as the gene product in a bacterial genome? Can we call both hypothetical proteins?

For the predicted coding region, the genome annotation gives note="hypothetical protein; identified by GeneMark" and lists the product as "predicted coding region".

For the hypothetical protein, the note mentions similarity to other proteins and the product is listed as "conserved hypothetical protein".

Please help.

Thanks
Poonam

gene-product • 4.4k views
9.8 years ago
pld 5.1k

A predicted coding region is the sequence within the gene/mRNA that was predicted to contain a translatable sequence. In other words, the gene in question could potentially produce a polypeptide.

A hypothetical protein would be the resulting amino acid translation of the predicted coding region.

The difference is that a predicted coding sequence is at the nucleotide level, while a hypothetical protein is at the amino acid level.
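To make the nucleotide-versus-amino-acid distinction concrete, here is a minimal sketch that translates a CDS into the protein it would hypothetically encode. The function name `translate_cds` and the toy sequence are made up for illustration; the codon assignments are the standard ones, which the bacterial table (NCBI translation table 11) shares apart from its alternative start codons, which this sketch does not handle.

```python
from itertools import product

# Standard genetic code in the compact NCBI layout: codons are enumerated
# with bases in the order T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a nucleotide CDS into its amino acid sequence.

    Translation stops at the first stop codon. Alternative start codons
    (e.g. GTG or TTG read as Met in bacteria) are NOT special-cased here.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Toy sequence, invented for illustration only.
toy_cds = "ATGGCTAAAGGTTAA"            # the "predicted coding region" (nucleotides)
toy_protein = translate_cds(toy_cds)   # the "hypothetical protein" (amino acids)
print(toy_protein)
```

The input string plays the role of the predicted coding region and the returned string plays the role of the hypothetical protein: the same prediction, viewed at two levels.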

I think what the annotation is telling you is that, for the predicted CDS, it found something that looks like it encodes a protein, but the protein this CDS might encode doesn't look like anything else. So there may be a CDS that produces a peptide/protein for which there is no experimental evidence.

This is different from the conserved hypothetical protein. In that case, similar CDSs exist in other bacteria (i.e. it is conserved), and the protein may contain domains or patterns similar to other known proteins.

So both gene products are hypothetical proteins, but to different degrees. In the first case, GeneMark is confident only that a possible CDS exists; the putative product of this CDS has no direct evidence, and no identity or function can be inferred from homology. GeneMark is basically telling you that there's no evidence to support anything more than a CDS (which by definition implies a hypothetical protein).

The key is the note about similarity to other proteins, together with the product "conserved hypothetical protein". The annotation is telling you that this hypothetical protein has some degree of homology to known proteins, and that the same or a similar hypothetical protein has been found in other bacteria. Conservation and homology are a much stronger piece of evidence.

In short, a predicted CDS on its own is about as weak as a prediction can get. A hypothetical protein with added information about homology and conservation is still hypothetical, but has more evidence behind it.
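The two evidence levels described above can be told apart mechanically from the product and note text. The sketch below is only an illustration: the function name `evidence_tier`, the keyword matching, and the tier labels are my own assumptions, not an official annotation vocabulary, and real genomes phrase these fields in many other ways.

```python
def evidence_tier(product: str, note: str) -> str:
    """Roughly classify a CDS feature by how much evidence backs its product.

    Keyword matching on free text is a heuristic assumption for illustration;
    it is not how any annotation pipeline formally defines these categories.
    """
    text = f"{product} {note}".lower()
    # Homology or conservation mentioned: still hypothetical, but supported.
    if "conserved" in text or "similar" in text:
        return "hypothetical, with homology/conservation support"
    # Only a gene finder's call: the weakest kind of prediction.
    if "predicted coding region" in text or "hypothetical protein" in text:
        return "hypothetical, gene prediction only"
    return "not flagged as hypothetical"


# The two cases from the question, using the wording quoted there.
print(evidence_tier("predicted coding region",
                    "hypothetical protein; identified by GeneMark"))
print(evidence_tier("conserved hypothetical protein",
                    "similarity to other proteins"))
```

Both features come out as hypothetical, which matches the point of the answer: the labels differ in how much supporting evidence they carry, not in whether the protein has been observed.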


Thanks for your detailed answer.


Why is the product called "predicted coding region" rather than "hypothetical protein"? The note for the same feature does say hypothetical protein.
