SNP Discovery From De Novo Assembly
6
2
10.3 years ago
Ruru ▴ 20

Hi all,

My objective is to discover SNPs. Can EST sequences or unigenes from a public database serve as a reference sequence? I could then discover SNPs by aligning my short reads to that reference. For example, the papaya draft genome is still in progress. Can I use resources such as contigs and unigenes as a reference sequence?

Thanks,

contigs snp • 4.0k views
0

Hi archana2287,

I am in the same situation as you were. I was wondering whether you were able to identify the SNPs between the two species. Which pipeline did you use? I would really appreciate your help.

2
10.2 years ago

ESTs are single-pass reads, so they are not a good source for finding SNPs: you have no way to know whether a variation is a sequencing error or a real SNP. Unigenes are better, as they are clusters of several ESTs, but they are still not ideal references, because there is no guarantee that each SNP position will be covered by two or more reads.
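To illustrate the point about read support, here is a minimal sketch. The function name `candidate_snp` and the threshold `min_alt_reads=2` are my own assumptions for illustration, not part of any tool. It reports an alternate base only when at least two reads at that position agree on it, so a mismatch seen in a single EST is never called.

```python
from collections import Counter


def candidate_snp(ref_base, column_bases, min_alt_reads=2):
    """Return the best-supported alternate base, or None if support is too low.

    ref_base      -- reference base at this position
    column_bases  -- bases observed in the reads covering this position
    min_alt_reads -- hypothetical threshold: reads that must agree on the variant
    """
    ref = ref_base.upper()
    # Count only unambiguous bases that differ from the reference.
    alt_counts = Counter(
        b.upper() for b in column_bases
        if b.upper() in "ACGT" and b.upper() != ref
    )
    if not alt_counts:
        return None
    base, count = alt_counts.most_common(1)[0]
    return base if count >= min_alt_reads else None


# One single-pass read showing a mismatch: cannot be told apart from an error.
print(candidate_snp("A", ["G"]))
# Two of three reads agree on the mismatch: worth keeping as a candidate.
print(candidate_snp("A", ["G", "G", "A"]))
```

A real caller would also weigh base and mapping qualities; this only shows why a position covered by one read cannot support a call.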

Use a genome whenever possible, even a draft one like papaya. The papaya genome has ~3X coverage, which means that on average each position was sequenced three times, so it generally has a lower error rate than ESTs or unigenes.
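As a rough back-of-the-envelope check on what ~3X means per position, one can assume that read depth follows a Poisson distribution with mean 3. That model is my assumption, not something stated about the papaya draft, and real coverage is usually less even. Under it, the chance that a given position was sequenced at least twice can be sketched as:

```python
import math


def prob_depth_at_least(k, mean_coverage):
    """P(depth >= k) under an assumed Poisson model with the given mean coverage."""
    p_below = sum(
        math.exp(-mean_coverage) * mean_coverage ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - p_below


# Probability that a position is covered by two or more reads at ~3X.
print(round(prob_depth_at_least(2, 3.0), 3))  # 0.801
```

So under this simplified model roughly one position in five would still be covered by fewer than two reads, which is one reason to verify interesting SNPs afterwards.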

1
10.3 years ago
Swbarnes2 ★ 1.5k

Sure. But that's not de novo assembly. De novo assembly would mean using a program like Velvet to build contigs without a reference, then using BLAST to compare your contigs to your draft reference.

ESTs are going to be spliced, right? So you have short reads based on an RNA prep, not a genomic one?

1
10.3 years ago
Vitis ★ 2.5k

Unigenes are certainly better than ESTs. I think assembled (or clustered) ESTs and unigenes can be used as reference sequences, as long as you keep an eye on the variants detected during the assembly process and your reads come from mRNA. For reads from genomic DNA, I think it's fine to use draft sequences as references. You'll have to verify the interesting SNPs anyway.

0
10.3 years ago

The question linked below contains some advice on masking the genome so that it contains only the exome; this relates to your question in that you have ESTs only. In general, don't treat ESTs as if they were a reference genome.

Exome Sequencing: Masking The Non-Genic Sequences ?

0
10.2 years ago
Ruru • 0

Thanks all. Yes, my short reads are definitely from RNA sequencing. It is now clearer to me how to choose between unigenes and ESTs as reference sequences.

0
6.1 years ago
archie ▴ 130

I have RNA-seq data from two species. The main objective is to identify SNPs between the two species. A strategy for finding SNPs de novo could be:

  1. Assemble species 1 using ABySS
  2. Map the reads of species 2 against the contigs of species 1
  3. Use samtools to find SNPs with a minimum depth of 10

I would like to know whether this is an accurate way to find SNPs de novo.
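For the minimum-depth step, a small sketch of the filter in Python may help, assuming the variant calls have been written as VCF text with a `DP` (depth) entry in the INFO column. The function names `info_depth` and `filter_by_depth`, the contig names, and the example values are all made up for illustration; this is not how samtools applies the filter internally.

```python
def info_depth(info_field):
    """Extract the integer DP value from a VCF INFO string, or None if absent."""
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP" and value.isdigit():
            return int(value)
    return None


def filter_by_depth(vcf_lines, min_depth=10):
    """Yield header lines unchanged and data lines whose DP is >= min_depth."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # INFO is the eighth tab-separated column in a VCF data line.
        depth = info_depth(fields[7]) if len(fields) > 7 else None
        if depth is not None and depth >= min_depth:
            yield line


# Hypothetical records: one site at depth 14, one at depth 4.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig_1\t120\t.\tA\tG\t50\t.\tDP=14",
    "contig_1\t305\t.\tC\tT\t22\t.\tDP=4",
]
kept = list(filter_by_depth(example))
for line in kept:
    print(line)
```

Depth alone does not separate true between-species differences from assembly or mapping artifacts, so treat the surviving sites as candidates to verify.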
