Question: SNP Discovery From De Novo Assembly
1
7.7 years ago by
Ruru10
Ruru10 wrote:

Hi all,

My objective is to discover SNPs. Can EST sequences or unigenes from a public database serve as a reference sequence? I would then discover SNPs by aligning my short reads to that reference. For example, the papaya draft genome is still in progress. Can I use information such as contigs and unigenes as a reference sequence?

Thanks,

contigs snp • 3.4k views
modified 3.2 years ago by chitoboy0 • written 7.7 years ago by Ruru10
2
7.6 years ago by
Haibao Tang3.0k
Mountain View, CA
Haibao Tang3.0k wrote:

ESTs are single-pass reads, so they are not a good source for finding SNPs: you have no way to know whether a variation is a sequencing error or a real SNP. Unigenes are better, as they are clusters of several ESTs, but they are still not ideal as references, because there is no guarantee that each SNP position will be covered by two or more reads.
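
To make the "two or more reads" point concrete, here is a minimal Python sketch. The function name `call_snp` and the threshold `min_alt_reads` are made up for illustration and do not come from any particular tool; real callers also weigh base and mapping quality.

```python
from collections import Counter

def call_snp(ref_base, read_bases, min_alt_reads=2):
    """Return the best-supported non-reference base, or None.

    Hypothetical helper: a position is reported only when at least
    `min_alt_reads` reads agree on the same non-reference base.
    """
    # Count only unambiguous bases among the reads covering this position.
    counts = Counter(b.upper() for b in read_bases if b.upper() in "ACGT")
    # Drop reads that simply match the reference.
    counts.pop(ref_base.upper(), None)
    if not counts:
        return None
    alt, support = counts.most_common(1)[0]
    return alt if support >= min_alt_reads else None
```

With a single-pass EST there is only one base at each position, so `call_snp("A", "G")` returns `None`: one mismatch cannot be told apart from a sequencing error.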

Use a genome whenever possible, even a draft one like papaya. The papaya genome has ~3X coverage, which means that on average each position was sequenced three times, so it generally has a lower error rate than ESTs or unigenes.
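
As a rough illustration of what ~3X means per position, the sketch below assumes that read depth follows a Poisson distribution with mean 3. That model is an assumption added here, not something stated for the papaya assembly, and real coverage is usually less even than this.

```python
import math

def prob_covered_at_least(k, mean_coverage):
    """P(depth >= k) when depth ~ Poisson(mean_coverage) (assumed model)."""
    # Sum the probabilities of depth 0 .. k-1, then take the complement.
    p_less = sum(
        math.exp(-mean_coverage) * mean_coverage ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - p_less

# Under the Poisson assumption, at 3X mean coverage:
print(round(prob_covered_at_least(1, 3.0), 3))  # covered at least once
print(round(prob_covered_at_least(2, 3.0), 3))  # covered at least twice
```

Under this model about 80% of positions are covered by two or more reads at 3X, so some positions still have no second read to confirm a base.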

1
7.7 years ago by
Swbarnes21.4k
Swbarnes21.4k wrote:

Sure. But that's not de novo assembly. De novo assembly would be using a program like Velvet to build contigs without a reference, then using BLAST to compare your contigs to your draft reference.
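
A minimal sketch of that workflow, written as a Python helper that only builds the command lines. The file names, output directory and k-mer size of 31 are placeholders chosen for illustration, and it assumes `velveth`, `velvetg` and `blastn` (BLAST+) are installed and on the PATH.

```python
import os

def build_commands(reads_fastq, draft_fasta, out_dir="velvet_out", kmer=31):
    """Return the three commands as argument lists (nothing is executed).

    All paths and the k-mer size are illustrative placeholders.
    """
    # Velvet writes its assembled contigs to <out_dir>/contigs.fa.
    contigs = os.path.join(out_dir, "contigs.fa")
    return [
        # 1. Hash the short reads.
        ["velveth", out_dir, str(kmer), "-fastq", "-short", reads_fastq],
        # 2. Build contigs from the hashed reads, with no reference involved.
        ["velvetg", out_dir],
        # 3. Compare the contigs to the draft reference, tabular output.
        ["blastn", "-query", contigs, "-subject", draft_fasta,
         "-outfmt", "6", "-out", "contigs_vs_draft.tsv"],
    ]
```

Each list can then be passed to `subprocess.run(cmd, check=True)` in order. The k-mer size in particular should be tuned to the read length rather than taken from this sketch.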

ESTs are going to be spliced, right? So you have short reads based on an RNA prep, not a genomic one?

1
7.7 years ago by
Vitis2.1k
New York
Vitis2.1k wrote:

Unigenes are certainly better than ESTs. I think assembled (or clustered) ESTs and unigenes can be used as reference sequences as long as you keep an eye on the variants detected during the assembly process and your reads come from mRNA. For reads from genomic DNA, I think it's fine to use draft sequences as references. You'll have to verify the interesting SNPs anyway.

0
7.7 years ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

The question below contains some advice on masking the genome so that it contains the exome only; this relates to your question in that you have ESTs only. In general, don't treat ESTs as if they were a reference genome.

http://biostar.stackexchange.com/questions/4413/exome-sequencing-masking-the-non-genic-sequences

modified 7.6 years ago • written 7.7 years ago by Istvan Albert ♦♦ 80k
0
7.6 years ago by
Ruru0
Ruru0 wrote:

Thanks, all. Yes, my short reads are definitely from RNA sequencing. It is now clearer to me how to choose between unigenes and ESTs as a reference sequence.

0
3.5 years ago by
archie70
India
archie70 wrote:

I have RNA-seq data for two species. The major objective is to identify SNPs between the two species. A strategy for finding de novo SNPs could be:

1. Assemble species 1 using ABySS

2. Map the reads of species 2 against the contigs of species 1

3. Use samtools to find SNPs with a minimum depth of 10

I want to know whether this is an accurate way to find de novo SNPs.
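
The depth filter in step 3 can be sketched in plain Python. This assumes the calls have already been written to an uncompressed VCF in which the INFO column carries a `DP=` entry and indels are flagged with `INDEL`, as bcftools output does; the function name `filter_by_depth` is made up for this example.

```python
def filter_by_depth(vcf_lines, min_depth=10):
    """Keep SNP records whose INFO DP value is at least `min_depth`.

    Hypothetical helper operating on the text lines of a VCF file.
    """
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        info_items = fields[7].split(";")
        if "INDEL" in info_items:
            continue  # only SNPs are of interest here
        # Turn "DP=12;MQ=40" into {"DP": "12", "MQ": "40"}.
        info = dict(item.split("=", 1) for item in info_items if "=" in item)
        try:
            depth = int(info.get("DP", 0))
        except ValueError:
            continue  # malformed DP value
        if depth >= min_depth:
            kept.append(line)
    return kept
```

Note that DP here is the raw depth at the site in species 2 reads. A depth cutoff alone says nothing about whether the species 1 contig base is itself correct, so candidate SNPs would still need checking.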

0
3.2 years ago by
chitoboy0
United States
chitoboy0 wrote:

Hi archana2287,

I am in the same situation as you were. I was wondering if you were able to identify the SNPs among both the species. Which pipeline did you use? I would really appreciate your help.
