How to identify SSRs (Simple Sequence Repeats) for reference-based assembly?
7.4 years ago

Can anybody suggest tools to identify SSRs in a reference-based assembly? Can MISA be used in that case?

SSR • 1.8k views
7.4 years ago
Tonor ▴ 480

RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2686-2

In the paper they mention Tandem Repeats Finder, scan_for_matches and the ALGGEN software suite.


So the point is:

  1. I have mapped high-quality (HQ) reads to my reference genome.
  2. Then I have called the consensus sequence from that mapping.

My consensus sequence has 'N's. Before I run any tool for SSR identification, should I remove the 'N's (I guess yes)?

OR

Should I fill the gaps (N's) in the consensus sequence with a gap-filling tool?

7.4 years ago

I realized that algorithms for identifying SSRs work on simple pattern matching, as simple as regex programming. Hence, it does not matter much as long as you know which FASTA sequence you are supplying to your favourite SSR-finding program. MISA will work as well on a consensus sequence as on an assembled genome.
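
To illustrate the pattern-matching idea, here is a minimal Python sketch using a regex with a back-reference. It is not MISA's actual implementation; the function name `find_ssrs` and the `MIN_REPEATS` thresholds are assumptions chosen for illustration and should be adjusted to your own criteria. Because the pattern only accepts A, C, G and T, runs of 'N' in a consensus sequence simply never match, so they do not need to be removed first (an SSR interrupted by N's will of course be cut short or missed).

```python
import re

# Hypothetical thresholds: motif length -> minimum number of repeats.
# These values are illustrative assumptions, not taken from any tool's docs.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, n_repeats) tuples, 0-based, end exclusive."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "AA" or "ATAT";
            # those runs are reported under the shorter motif length instead.
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            n_repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, n_repeats))
    return sorted(hits)


if __name__ == "__main__":
    # Toy consensus with an N gap between two repeats.
    consensus = "GGC" + "AT" * 7 + "NNNNN" + "A" * 12 + "TT" + "CAG" * 5 + "T"
    for start, end, motif, n in find_ssrs(consensus):
        print(start, end, motif, n)
```

This is a simplified scan: it only finds perfect repeats and does not merge neighbouring SSRs into compound ones, which dedicated tools such as MISA can do.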
