Question: Method of Checking for Mutation Patterns
3.9 years ago by
L. A. Liggett120
Broad Institute, Harvard Medical School, Boston Children's
L. A. Liggett120 wrote:

I am trying to find a way to detect mutation biases in my data, if they exist, but I can't think of a good method. I see particular base-change biases in my genome sequencing data that match similar published data; for example, C mutates to T more often than C mutates to A. It is relatively straightforward to get this information from the output VCF file.

However, now I would like to look for more complex patterns. For instance, perhaps C often changes to T, but the majority of the time only in the context ACA -> ATA, because the surrounding bases influence the error rate. Similarly, any number of surrounding bases might influence the error rate, such that perhaps AAACAAA -> AAATAAA is the most prevalent C -> T change.

So I am looking for guidance or suggestions on how to proceed with an analysis like this. I know some labs have performed and published this type of analysis, but I can't think of how to do it myself.

sequencing • 942 views
3.9 years ago by
igor12k
United States
igor12k wrote:

I don't know if there is already a tool that specifically does this. Since you already have a list of variants, you can use something like bedtools getfasta to retrieve the surrounding sequence for each one. That gives you the genomic context, which you would then have to summarize yourself.
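
The summarizing step could look something like the following. This is only a minimal sketch: it assumes the reference is already loaded into a plain dict of chromosome name to sequence (standing in for the sequence retrieval that bedtools getfasta would do), and that variants are simple (chrom, pos, ref, alt) tuples with 1-based positions as in a VCF. The function name `context_counts` and the `flank` parameter are made up for illustration.

```python
from collections import Counter


def context_counts(reference, variants, flank=1):
    """Tally single-base substitutions together with their flanking bases.

    reference: dict mapping chromosome name -> sequence string (assumed
               to be loaded already; hypothetical input format).
    variants:  iterable of (chrom, pos, ref, alt), pos is 1-based.
    flank:     number of bases to keep on each side of the variant.
    """
    counts = Counter()
    for chrom, pos, ref, alt in variants:
        # Only simple substitutions; indels and multi-base records are skipped.
        if len(ref) != 1 or len(alt) != 1:
            continue
        seq = reference[chrom]
        i = pos - 1  # convert the 1-based position to a 0-based string index
        # Skip variants too close to a sequence end to have full flanks.
        if i - flank < 0 or i + flank >= len(seq):
            continue
        context = seq[i - flank:i + flank + 1].upper()
        # Skip records whose REF base disagrees with the reference sequence.
        if context[flank] != ref.upper():
            continue
        mutated = context[:flank] + alt.upper() + context[flank + 1:]
        counts[f"{context}>{mutated}"] += 1
    return counts


# Toy example with a made-up reference and made-up variants.
toy_reference = {"chr1": "GGACATTACAGG"}
toy_variants = [
    ("chr1", 4, "C", "T"),
    ("chr1", 9, "C", "T"),
    ("chr1", 6, "T", "G"),
]
for change, n in context_counts(toy_reference, toy_variants).most_common():
    print(change, n)
```

Setting flank=3 gives 7-base contexts of the AAACAAA -> AAATAAA kind. The resulting table can then be sorted or compared against how often each context occurs in the genome as a whole.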


Oh cool, I wasn't aware of bedtools getfasta. That is helpful.

Reply written 3.9 years ago by L. A. Liggett
3.9 years ago by
genebow160
USA/Chicago
genebow160 wrote:

You could try NNMF (non-negative matrix factorization) to extract mutational signatures from your mutation patterns. The signatures are based on the k-mer context of each mutation. A good reference is: Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Campbell, P. J., & Stratton, M. R. (2013). Deciphering signatures of mutational processes operative in human cancer. Cell Reports, 3(1), 246-259.
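
To give a rough idea of what the factorization does, here is a minimal pure-Python sketch. It is not the pipeline from the paper, which involves more than this (choosing the number of signatures, for one); it only shows the core step using the standard multiplicative update rule of Lee and Seung. The input is assumed to be a matrix of counts with one row per mutation context (e.g. ACA>ATA) and one column per sample. The function names and the toy numbers are made up for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius distance between v and the product w*h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (contexts x samples) as w*h.

    w (contexts x rank) plays the role of the signatures and
    h (rank x samples) the contribution of each signature to each sample.
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h: h <- h * (w^T v) / (w^T w h), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


# Toy count matrix: 4 contexts (rows) x 3 samples (columns), made-up numbers.
toy_counts = [[1, 2, 0], [2, 4, 0], [0, 1, 4], [0, 3, 12]]
signatures, exposures = nmf(toy_counts, rank=2)
print(squared_error(toy_counts, signatures, exposures))
```

For real data you would more likely use an existing NMF implementation than hand-written loops; the point here is just that each column of w is a candidate signature over the mutation contexts.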


You know, this was one of the papers I was remembering, but I didn't remember that they linked to an explanation of how to run the analysis. I see that now; hopefully this will help me solve my problem. Thanks.

Reply written 3.9 years ago by L. A. Liggett