Question

Making a custom reference


6.6 years ago

shynilasanthi ▴ 40

I have Illumina short reads from which I want to call SNPs without using a reference genome. I want to build a consensus sequence from some selected reads and then align the other reads to it. The basic idea is that the consensus sequence acts as a "dummy genome". Is there any way to do this?

genome Assembly alignment SNP • 1.2k views

updated 6.6 years ago by ori ▴ 50 • written 6.6 years ago by shynilasanthi ▴ 40

score 1 · Answer 1 · 2017-12-09


6.6 years ago

Tm ★ 1.1k

You can first assemble the reads de novo to generate consensus sequences (contigs), then use those contigs as the reference: map the reads back to them and call SNPs.
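The workflow above could be sketched roughly as follows. This is only an illustration: the answer does not name any tools, so the choice of SPAdes, BWA, samtools and bcftools is an assumption, as are the file names, the `build_pipeline` helper and the default thread count. The function only builds the command strings, so it can be inspected before anything is run.

```python
import shlex
import subprocess
from pathlib import Path


def build_pipeline(r1, r2, workdir="denovo_ref", threads=4):
    """Build shell commands for: assemble -> index -> map -> call SNPs.

    Tool choice (SPAdes, BWA, samtools, bcftools) is an assumption;
    any assembler/mapper/caller combination follows the same pattern.
    """
    work = Path(workdir)
    asm_dir = work / "assembly"
    contigs = asm_dir / "contigs.fasta"  # SPAdes writes contigs.fasta to its output dir
    bam = work / "reads.sorted.bam"
    vcf = work / "snps.vcf.gz"
    q = lambda p: shlex.quote(str(p))  # protect paths containing spaces

    return [
        # 1. De novo assembly of the reads into contigs (the "dummy genome")
        f"spades.py -1 {q(r1)} -2 {q(r2)} -t {threads} -o {q(asm_dir)}",
        # 2. Index the contigs so they can serve as a mapping reference
        f"bwa index {q(contigs)}",
        # 3. Map the reads back to the contigs and sort the alignments
        f"bwa mem -t {threads} {q(contigs)} {q(r1)} {q(r2)} | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # 4. Call variants against the contigs
        f"bcftools mpileup -f {q(contigs)} {q(bam)} | bcftools call -mv -Oz -o {q(vcf)}",
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, shell=True, check=True)


if __name__ == "__main__":
    # Hypothetical input file names; print the commands instead of running them.
    for cmd in build_pipeline("reads_R1.fastq", "reads_R2.fastq"):
        print(cmd)
```

Keep in mind that a de novo assembly is usually fragmented, so the SNP coordinates refer to contig names and positions rather than chromosomes.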


score 1 · Answer 2 · 2017-12-10

As I understand it, a SNP is a position where the base in one genome (e.g. genome A) differs from the base in another genome (e.g. genome B).
From your question, it seems that reads derived from genome A and genome B are mixed in the data, and
genome A = sequences produced by some selected reads;
genome B = sequences produced by the other reads.

I am not sure how you would separate the genome A reads from the original data, but if you can do that, I agree with toralmanvar.
You should assemble the reads derived from genome A, which gives you the contigs that you called a "dummy genome".
Then align the reads derived from genome B to the genome A contigs to detect SNPs.
I recommend also aligning the genome A reads to the genome A contigs at the same time, so that you can compare the allele frequencies of the two genomes.
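To illustrate the allele-frequency comparison, here is a minimal sketch for a single contig position. It assumes you already have the bases observed at that position in the genome A reads and in the genome B reads (for example from a pileup of each alignment); the function names and the `min_depth` threshold are hypothetical and only serve the example.

```python
from collections import Counter


def allele_freqs(bases):
    """Return the fraction of each base among the reads covering one position."""
    counts = Counter(bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()} if total else {}


def compare_position(bases_a, bases_b, min_depth=4):
    """Compare allele frequencies of two read sets at one contig position.

    Returns None when either read set has fewer than min_depth reads,
    otherwise a dict with both frequency tables and the major alleles.
    """
    if min(len(bases_a), len(bases_b)) < min_depth:
        return None  # too few reads to say anything
    freq_a, freq_b = allele_freqs(bases_a), allele_freqs(bases_b)
    # sorted() makes the choice deterministic if two alleles are tied
    major_a = max(sorted(freq_a), key=freq_a.get)
    major_b = max(sorted(freq_b), key=freq_b.get)
    return {
        "freq_a": freq_a,
        "freq_b": freq_b,
        "major_a": major_a,
        "major_b": major_b,
        "differs": major_a != major_b,  # candidate SNP between A and B
    }


if __name__ == "__main__":
    # Genome A reads all show A; genome B reads mostly show G.
    print(compare_position("AAAAAA", "GGGGGA"))
```

A position where the genome A reads do not agree with their own contigs is more likely an assembly or sequencing error than a true difference between the genomes, which is why mapping both read sets is useful.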