Question: Analysis Of Low-Complexity Regions
8.9 years ago by
Manhattan, NY
Khader Shameer (18k) wrote:

Following our exome analysis, we found a missense variant in a low-complexity region (LCR) of a protein. No domains or motifs are annotated in the protein's sequence. Is there any further biological inference that I can draw about LCRs using bioinformatics approaches?

sequence • 4.2k views
modified 8.8 years ago by Larry_Parnell (16k) • written 8.9 years ago by Khader Shameer (18k)

Have you applied the usual variant effect predictors such as SIFT, PolyPhen, etc.?

written 8.8 years ago by Sean Davis (26k)

@Sean: Yes I did, this particular variant is missense based on PolyPhen.

written 8.8 years ago by Khader Shameer (18k)
8.8 years ago by
Lyco (2.3k) wrote:

I assume that you hope to attach some biological meaning to your variant, despite the fact that it is part of a compositionally biased region. As you certainly know (and Larry has mentioned), the majority of such protein regions are not particularly important. Antigenicity might be an issue, but probably only for somatic mutations (?)

One possibility that you might want to investigate is a short linear motif hidden in the compositionally biased region. Many such motifs like to hide out in natively unfolded regions and might well be associated with a compositional bias. What I would recommend is to find a reasonable set of orthologs from species at a range of evolutionary distances. Then subject the sequences to a suitable multiple alignment method (ProbCons or MAFFT/L-INS-i work reasonably well) and see if there is a suspicious conservation of the variant residue or its immediate neighbors. Usually, compositionally biased regions align poorly, but if there is a hidden motif, it can often be uncovered by this method. Alternatively, you might check alignment-free motif detection methods, e.g. Gibbs sampling.
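As a rough illustration of the "suspicious conservation" check, here is a minimal sketch that scores each column of an existing multiple alignment by Shannon entropy (0 bits means fully conserved). The toy alignment, the variant column index, and the function names are all made up for illustration; in practice the aligned sequences would come from the aligner output.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return float("nan")
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def conservation_profile(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Hypothetical toy alignment: a Q-rich biased region with a conserved core.
alignment = [
    "QQQSPPLKRQQQ",
    "QQASPPLKRAQQ",
    "QHQTPPLKRQHQ",
    "AQQSPPLKRQQA",
]
variant_col = 6  # hypothetical 0-based alignment column of the variant residue

profile = conservation_profile(alignment)
for i, h in enumerate(profile):
    marker = " <- variant" if i == variant_col else ""
    print(f"col {i:2d}  entropy {h:.2f}{marker}")
```

A run of low-entropy columns around the variant, flanked by noisier columns, is the kind of pattern that would hint at a hidden motif. With only a handful of closely related sequences, low entropy says very little.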

Nevertheless, chances are slim and it is much more likely that your variant is just due to the fact that compositionally biased regions tend to be more polymorphic than structured domains.

written 8.8 years ago by Lyco (2.3k)

Thanks Lyco. The gene/protein is specific to the metazoan lineage. I did an alignment using some of the orthologs and noticed that the region is highly conserved in primates.

written 8.8 years ago by Khader Shameer (18k)

That is a good start. On the other hand, most protein regions are 'highly conserved in primates', just because there hasn't been enough time for accumulating mutations. In my opinion, a lineage-specific high conservation is meaningful only if e.g. the flanking regions have a lower conservation.
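One way to make the region-versus-flank comparison concrete is to compute mean pairwise identity over the candidate region and over each flank of the same alignment. This is only a sketch under assumptions: the toy alignment and the column boundaries are hypothetical, and pairwise identity is just one of several possible conservation measures.

```python
from itertools import combinations

def mean_pairwise_identity(aligned_seqs, start, end):
    """Mean fraction of identical residues over columns [start, end),
    averaged across all sequence pairs; columns with a gap in either
    sequence of a pair are skipped for that pair."""
    scores = []
    for a, b in combinations(aligned_seqs, 2):
        pairs = [(x, y) for x, y in zip(a[start:end], b[start:end])
                 if x != "-" and y != "-"]
        if pairs:
            scores.append(sum(x == y for x, y in pairs) / len(pairs))
    return sum(scores) / len(scores) if scores else float("nan")

# Hypothetical toy alignment and region boundaries (0-based, end-exclusive).
alignment = [
    "QQQSPPLKRQQQ",
    "QQASPPLKRAQQ",
    "QHQTPPLKRQHQ",
    "AQQSPPLKRQQA",
]
region = (4, 9)
left_flank = (0, 4)
right_flank = (9, 12)

region_id = mean_pairwise_identity(alignment, *region)
left_id = mean_pairwise_identity(alignment, *left_flank)
right_id = mean_pairwise_identity(alignment, *right_flank)
print(f"region {region_id:.2f}  left flank {left_id:.2f}  right flank {right_id:.2f}")
```

If the region and both flanks score about the same, as is typical within primates, the conservation carries little information about the region itself.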

written 8.8 years ago by Lyco (2.3k)

Thanks for that tip. Here the flanking regions are also conserved, so I am now investigating whether my mutation could be part of a novel domain or a eukaryotic linear motif.

written 8.8 years ago by Khader Shameer (18k)
8.9 years ago by
Boston, MA USA
Larry_Parnell (16k) wrote:

One thing that comes to mind is a hinge region between segments of secondary structure. So, two tests that one could try are for hydrophobicity/hydrophilicity and for antigenicity. You will have to search for available antigenicity assessment tools as this is something I did many years ago. (Added in edit: To clarify, the antigenicity test can give you some idea of residue exposure and if there is an allele-specific change in that prediction, that can be meaningful and even interpreted as a change in structure. Predictions of random coil can also be used in this regard.)
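For the hydrophobicity test, a minimal sketch of an allele-specific comparison could use a sliding-window average on the Kyte-Doolittle hydropathy scale. The peptide, the variant position, the substitution, and the window size below are hypothetical placeholders, not values from this thread.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=5):
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    values = [KD[aa] for aa in seq.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def apply_missense(seq, pos, ref, alt):
    """Return seq with the residue at 1-based position pos changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]

# Hypothetical peptide and variant (S8L), for illustration only.
ref_seq = "QQQSPPLSKRQQQ"
alt_seq = apply_missense(ref_seq, 8, "S", "L")

ref_profile = hydropathy_profile(ref_seq)
alt_profile = hydropathy_profile(alt_seq)
for i, (r, a) in enumerate(zip(ref_profile, alt_profile)):
    print(f"window {i:2d}  ref {r:6.2f}  alt {a:6.2f}  delta {a - r:+.2f}")
```

Only the windows that overlap the variant residue change, so the delta column shows directly how far the substitution shifts the local hydropathy.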

I assume you have checked for motifs for both alleles of the variant. You can check the DNA for motifs, such as an ESE (exonic splicing enhancer). ESEs generally map within exons (of course), usually very near a splice site, though not always. I would also check for microRNA interactions, as the variant allele could create (or abrogate) an mRNA-miRNA interaction. I've written before on BioStar about Brest's paper on IRGM and Crohn's disease, in which an allele-specific miRNA interaction drives disease risk phenotypes.
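A very simple version of the allele-specific miRNA check is to look for a perfect seed match in the transcript sequence carrying each allele. The sketch below assumes canonical seed pairing (miRNA nucleotides 2-8); the miRNA and mRNA fragments are illustrative placeholders, and dedicated target prediction tools consider much more than a seed match.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna, start=2, end=8):
    """mRNA sequence (5'->3') complementary to miRNA nucleotides
    start..end (1-based, inclusive)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]

def find_sites(mrna, site):
    """0-based start positions of exact matches to site in the mRNA."""
    mrna = mrna.upper().replace("T", "U")
    return [i for i in range(len(mrna) - len(site) + 1)
            if mrna[i:i + len(site)] == site]

# Illustrative sequences only (hypothetical transcript fragment and variant).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
ref_mrna = "AUGGCCCUACCUCGGAUAA"
alt_mrna = "AUGGCCCUACGUCGGAUAA"  # single-base change inside the candidate site

site = seed_site(mirna)
print("seed-match site:", site)
print("reference allele sites:", find_sites(ref_mrna, site))
print("variant allele sites:  ", find_sites(alt_mrna, site))
```

A site present for one allele and absent for the other is only a starting point for follow-up, not evidence of a functional interaction.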

If I think of other tests to perform, I'll edit my response. In general, you may want to consider the alleles from the DNA side as well as from the protein side.

modified 8.8 years ago • written 8.9 years ago by Larry_Parnell (16k)

Thanks Larry. I have looked at the secondary structure and disorder properties. I will check other sequence properties as you suggested.

written 8.8 years ago by Khader Shameer (18k)