Which scoring matrix to use in shore (shoremap) analysis?
4.5 years ago

I recently started working with shore and shoremap (mainly because my collaborators requested it) in order to pinpoint EMS mutations in an Arabidopsis line they created.

While most of the protocol runs successfully, I am unsure which scoring matrix to use in the shore consensus step. Two appear to be available: scoring_matrix_hom.txt and scoring_matrix_het.txt. Although I searched the internet, the shore manuals, and the papers extensively, I cannot find a decent explanation of when to use which one. The shore manual mostly uses the _hom one, while the shoremap manual uses the _het one (in the shore part of the protocol).

This concerns a backcross analysis, but any info on the outcross approach would also be appreciated.

I would be grateful if someone has any insights on this they want to share.

shore shoremap matrix • 877 views

The paper "Plant Genetic Archaeology: Whole-Genome Sequencing Reveals the Pedigree of a Classical Trisomic Line" contains the following quote:

High-quality single-nucleotide polymorphisms (SNPs) and small deletions between Col-0 and other genotypes were derived from shore consensus by the use of a scoring matrix optimized for identifying homozygous positions that differ from the reference genome (scoring_matrix_hom.txt).

By extension, scoring_matrix_het.txt is presumably optimized for calling heterozygous positions.

If your reference genome is the backcross strain, use scoring_matrix_het.txt; otherwise, use scoring_matrix_hom.txt. (Or maybe run both, and take the heterozygous variants from the scoring_matrix_het.txt run and the homozygous variants from the scoring_matrix_hom.txt run?)
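
A rough sketch of what that "use both" idea could look like once the calls from each run have been parsed. Everything here is an assumption for illustration: the Call record, the combine_calls function, and the frequency thresholds are hypothetical and are not part of shore or shoremap, and the sketch does not model the actual shore consensus output format.

```python
from typing import List, NamedTuple, Tuple


class Call(NamedTuple):
    """Hypothetical variant record; not the shore consensus output format."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_freq: float  # fraction of reads supporting the alternate allele


def combine_calls(
    hom_calls: List[Call],
    het_calls: List[Call],
    hom_min: float = 0.8,                        # placeholder threshold
    het_range: Tuple[float, float] = (0.2, 0.8),  # placeholder range
) -> List[Tuple[str, Call]]:
    """Take intermediate-frequency calls from the het run and
    high-frequency calls from the hom run, keyed by position and allele."""
    selected = {}
    for call in het_calls:
        if het_range[0] <= call.alt_freq < het_range[1]:
            selected[(call.chrom, call.pos, call.alt)] = ("het", call)
    # A hom-run call at the same site overrides a het-run call.
    for call in hom_calls:
        if call.alt_freq >= hom_min:
            selected[(call.chrom, call.pos, call.alt)] = ("hom", call)
    return [selected[key] for key in sorted(selected)]


hom_run = [Call("1", 100, "G", "A", 0.95), Call("1", 200, "C", "T", 0.50)]
het_run = [Call("1", 100, "G", "A", 0.95), Call("1", 200, "C", "T", 0.50),
           Call("2", 50, "G", "A", 0.10)]
combined = combine_calls(hom_run, het_run)
for label, call in combined:
    print(label, call.chrom, call.pos, call.ref, call.alt, call.alt_freq)
```

The thresholds are placeholders only. In a pooled sample the expected allele frequencies depend on the cross design, so they would need to be chosen for the experiment at hand.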


Thanks for the insights, h.mon, very useful!

I was under the impression that, since my genome is homozygous, I should go for the _hom one. You seem to indicate otherwise (which probably makes more sense), but could you elaborate a little on how you came to that conclusion?

Ah, so it is the SNPs that need to be (or are) homozygous, and it has nothing to do with the state of the genome?

Would all this be different if you wanted to do an outcross analysis? (Sorry for all the additional questions, I'm just trying to get a complete picture.)


but could you elaborate a little how you come to that conclusion?

Sadly, I can't... I've only studied the SHOREmap manuals for a while, but haven't used it yet.

I think the consideration about het vs hom depends on the level of differences expected between the lineages and the reference genome. But, as you said, there isn't much information about this in SHOREmap docs, and my inference is from the above paper citation alone. If your reference genome is also the backcross (wild) strain, you would expect to see only heterozygous variants in relation to the reference genome, thus scoring_matrix_het.txt would be recommended.

For EMS mutations, one would expect them to be novel in relation to the reference genome, so if your cross scheme aims at getting them in a homozygous state, then you should use scoring_matrix_hom.txt.

In which other scenarios would one expect mainly homozygous variants in relation to the reference genome? I don't have much experience with these crosses.


For future reference:

I found this "paper" quite useful (though not giving a definite answer to my original question here): SHOREmap_v3.0

Unfortunately, it is also not 100% accurate :/
