Chloroplast and mitochondria genome assembly with SPAdes and Tadpole: correct coverage and k-mers
4.7 years ago

Hi! I'm very new to this field, so I fear I'm making very dumb mistakes.

My process for assembling these genomes is the following:

We did whole-genome sequencing of purple maize (2x151 bp). Then I mapped all the reads against the reference genome (10 chromosomes + mitochondria + chloroplast) using bowtie2.

Then, using samtools, I extracted the alignments (BAM files) for mitochondria and chloroplast only, and with samtools fastq I extracted the reads mapped against the reference mitochondria and chloroplast. I used repair.sh to sort the reads by name and to have the same number of reads per fastq file.
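
To illustrate what I mean by re-pairing, here is a minimal Python sketch. It is not how repair.sh is implemented; `split_pairs` and the read names are hypothetical, and I'm assuming mates share a name apart from an optional `/1` or `/2` suffix. Reads can lose their mate when only one read of a pair maps to the organelle reference.

```python
def split_pairs(r1_names, r2_names):
    """Return (paired, singletons) read names given the names in each FASTQ file."""
    def base(name):
        # Drop a trailing /1 or /2 mate suffix if present.
        return name[:-2] if name.endswith(("/1", "/2")) else name

    r1 = {base(n) for n in r1_names}
    r2 = {base(n) for n in r2_names}
    paired = sorted(r1 & r2)       # names present in both files
    singletons = sorted(r1 ^ r2)   # names present in only one file
    return paired, singletons


# Hypothetical read names, for illustration only.
paired, singletons = split_pairs(
    ["readA/1", "readB/1", "readC/1"],
    ["readA/2", "readC/2", "readD/2"],
)
print("paired:", paired)
print("singletons:", singletons)
```

Only the paired names would go into the two fastq files used for assembly, which is why both files end up with the same number of reads.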

I want to use these reads to do de novo assembly.

For chloroplast, I have a total coverage of 5600x. Subsampling to 60x or 90x coverage and using k-mers of 37,47,57,67 (or similar), I get a highly fragmented assembly.
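
As a rough sketch of the subsampling arithmetic, assuming coverage scales linearly with the fraction of read pairs kept (`sampling_fraction` is a hypothetical helper name, not part of any tool):

```python
def sampling_fraction(total_coverage, target_coverage):
    """Fraction of read pairs to keep so total_coverage drops to target_coverage."""
    if total_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverages must be positive")
    # Subsampling cannot add reads, so cap the fraction at 1.0 (keep everything).
    return min(1.0, target_coverage / total_coverage)


# Coverage values quoted in this post.
for organelle, total, target in [
    ("chloroplast", 5600, 60),
    ("chloroplast", 5600, 90),
    ("mitochondria", 1400, 60),
]:
    frac = sampling_fraction(total, target)
    print(f"{organelle}: keep {frac:.4f} of read pairs for ~{target}x")
```

The resulting fraction is the kind of value one would pass to whatever subsampling tool is used.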

For mitochondria, I have a total coverage of 1400x. Subsampling to 60x coverage and using k-mers of 47,57,67,77, I got an assembly of 46 contigs; using k-mers of 45,65,85,95, I got 31 contigs.

Then I used tadpole with 100x coverage and k=100 and got 222 contigs. I also tried extending and merging my reads and assembling with 250x coverage and k=250, but got 405 contigs.
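
To make the link between coverage and k-mer size concrete, here is a small sketch using the commonly quoted approximation C_k = C * (L - k + 1) / L for k-mer coverage, where C is base coverage and L is read length. The helper name `kmer_coverage` is hypothetical, and I'm assuming all reads are the full 151 bp, which will not hold exactly after trimming or merging.

```python
def kmer_coverage(base_coverage, read_length, k):
    """Expected k-mer coverage: C_k = C * (L - k + 1) / L."""
    if k > read_length:
        return 0.0  # a read shorter than k contributes no k-mers
    return base_coverage * (read_length - k + 1) / read_length


READ_LENGTH = 151  # from the 2x151 bp run

# (base coverage, k) combinations taken from the runs described in this post.
for coverage, k in [(60, 37), (60, 67), (60, 95), (100, 100)]:
    ck = kmer_coverage(coverage, READ_LENGTH, k)
    print(f"{coverage}x bases, k={k}: ~{ck:.1f}x k-mer coverage")
```

Under this approximation, a larger k leaves fewer k-mers per read, so the coverage the assembly graph actually sees is lower than the nominal base coverage.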

I think my main problem is that I'm not using the correct values of coverage and k-mer size.

Does anyone have any advice about this? Thanks a lot in advance :)

assembly genome next-gen sequence alignment • 1.1k views