Conserved regions around mutations
3.7 years ago
Gene_MMP8 ▴ 240

I have a list of mutations of interest in coding regions from an experiment I am performing. For each mutation I have the position, the base substitution type (C>T, A>G, etc.), the chromosome, and the gene name. I wanted to explore the sequences surrounding these positions, so I extracted the raw nucleotide sequence 10 bases upstream and downstream of each mutation and plotted sequence logos from them. This is the image.

  • One thing to note from this image is that C and G nucleotides are highly conserved at the majority of positions. How do I build a background model for this and argue that what I am seeing is statistically significant rather than due to chance?
  • I was also thinking about extracting motifs from the flanking nucleotides to see whether certain sequence motifs are overrepresented around the mutations. Since I am new to this field, is there a systematic way to do that?
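
To make the first question concrete, here is a minimal sketch of one possible background comparison, not an established pipeline. It assumes the 21-base flanks are already plain strings and that a set of control sequences exists (for example, flanks around randomly sampled positions in the same genes; that choice of control is an assumption). All function names are hypothetical. Each column of the flanks is compared with the pooled control base frequencies using a chi-square goodness-of-fit test with 3 degrees of freedom.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_counts(seqs):
    """Count the bases seen at each column of equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    return [Counter(s[i].upper() for s in seqs) for i in range(length)]

def background_freqs(control_seqs, pseudocount=1.0):
    """Pooled A/C/G/T frequencies over all control sequences (assumed background)."""
    counts = Counter(b for s in control_seqs for b in s.upper() if b in BASES)
    total = sum(counts[b] + pseudocount for b in BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def column_tests(seqs, bg):
    """Return (column index, chi-square statistic, p-value) for each column."""
    results = []
    for i, col in enumerate(position_counts(seqs)):
        n = sum(col[b] for b in BASES)  # ignore N and other symbols
        if n == 0:
            continue
        stat = sum((col[b] - n * bg[b]) ** 2 / (n * bg[b]) for b in BASES)
        results.append((i, stat, chi2_sf_df3(stat)))
    return results
```

The p-values would still need a multiple-testing correction across the 21 columns, and the central column (the mutated base itself) is biased by construction, so it is probably best excluded from the comparison.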

Sequence_logo

next-gen sequencing • 771 views

Your seqlogo image shows the same proportion of each nucleotide at every location. If you want to get a 'conservation' score for your region, you need to give it other species' sequences for context, and that is going to be tricky to define. Why not just download a public conservation track for the region?


I understand your point. Can you tell me a bit more about downloading a "public conservation track for the region"? Where can I find this?
