Dear all,
I have Illumina paired-end short-read data for a plant with no available reference genome. I have to build the assembly from short reads only, as I do not have long reads. The plant has both tetraploid and diploid varieties. Using KmerGenie, I identified the best k-mer sizes as 37 for the tetraploid and 27 for the diploid varieties. I ran it without the --diploid parameter, because with that parameter some information was omitted. Please note that the read length is 80-127 bp, the sequencing depth is 14.9X, and the total number of sequences is 4-5 million. There are 40 paired-end samples in total: 20 are diploid varieties and 20 are tetraploid varieties. Do I need to construct two genomes, one per ploidy level?
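As a rough sanity check on these numbers, here is a minimal Python sketch of the k-mer coverage they imply, using the standard relation C_k = C * (L - k + 1) / L. The mean read length of 100 bp is my assumption (the reads range from 80 to 127 bp), and the function name kmer_coverage is only illustrative.

```python
def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Expected k-mer coverage: C_k = C * (L - k + 1) / L."""
    if k > read_length:
        # A read shorter than k contributes no k-mers at all.
        return 0.0
    return base_coverage * (read_length - k + 1) / read_length


# Assumed mean read length of 100 bp; 14.9X is the depth stated above.
for k in (27, 37):
    print(f"k={k}: k-mer coverage ~ {kmer_coverage(14.9, 100, k):.2f}X")
```

Under these assumptions the k-mer coverage is noticeably lower than the 14.9X base coverage, and it drops further as k grows.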
Please advise whether I have used the correct approach. Also, which short-read assemblers should I use for plant data?
I tried ABySS on multiple samples together, using a command such as:
abyss-pe np=60 k=37 name=FA2 B=1G in='FA2_18_1.fastq.gz FA2_18_2.fastq.gz FA2_19_1.fastq.gz FA2_19_2.fastq.gz FA2_20_1.fastq.gz FA2_20_2.fastq.gz FA2_21_1.fastq.gz FA2_21_2.fastq.gz FA2_22_1.fastq.gz FA2_22_2.fastq.gz'
Can you please confirm whether I can assemble multiple samples together?
Best regards,
Bushra
The ABySS output file scaffolds.fa contains short, k-mer-sized scaffolds. The first few lines of scaffolds.fa are:
My concern is why the scaffolds are so short, about the k-mer size of 37. In ABySS, aren't k-mers combined into contigs and then into scaffolds?
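To quantify how short the scaffolds really are, a small Python sketch along these lines could summarise the sequence lengths in the FASTA output. The function names (fasta_lengths, n50, summarise) are hypothetical helpers written for this post, not part of ABySS, and the file path is whatever the scaffolds file is called on disk.

```python
import gzip
from typing import Dict, Iterator, List


def fasta_lengths(path: str) -> Iterator[int]:
    """Yield the length of each sequence in a (possibly gzipped) FASTA file."""
    opener = gzip.open if path.endswith(".gz") else open
    length = None  # None until the first header line is seen
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line)
    if length is not None:
        yield length


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarise(path: str, k: int) -> Dict[str, int]:
    """Basic length statistics, including how many sequences are no longer than k."""
    lengths = list(fasta_lengths(path))
    return {
        "sequences": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
        "at_most_k": sum(1 for n in lengths if n <= k),
    }
```

For example, print(summarise("scaffolds.fa", k=37)) would show how many of the scaffolds are no longer than a single 37-mer.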