GridION fastq ASSEMBLY
4.0 years ago

Hey friends! I have a large number of FASTQ files of COVID-19 from GridION sequencing, and I want to produce FASTA files of the assembly results. Because the reads are long (about 800-1200 bases), Trinity didn't work well for me. What other tools can I use to assemble these files?

BTW: I was able to produce sorted BAM files using minimap2, but I need the assembly results.

Thank you for your help.

Nanopore Assembly fastq • 1.4k views
4.0 years ago
GenoMax 141k

Use an assembler meant for long nanopore reads. Flye is one of the current favorites.

Since there are many strains available at NCBI, you may be fine aligning to a close reference and then generating a consensus sequence (see: Generating consensus sequence from bam file).

Alignment and assembly are two different things, but either will work in this case.
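
For what it's worth, the align-then-consensus route could be scripted roughly like this. It is only a sketch: it assumes minimap2 and a samtools release that ships the `consensus` subcommand (1.16 or later) are on your PATH, and the file names and the `build_consensus_commands` helper are made up for illustration. The helper only builds the command lines; it does not run anything.

```python
import shlex


def build_consensus_commands(reads, reference, prefix):
    """Return shell command lines for an align-then-consensus run.

    All paths are placeholders (assumptions); nothing is executed here.
    """
    bam = f"{prefix}.sorted.bam"
    fasta = f"{prefix}.consensus.fasta"
    q = shlex.quote
    return [
        # Map nanopore reads to the reference and sort the alignments.
        f"minimap2 -ax map-ont {q(reference)} {q(reads)} | samtools sort -o {q(bam)} -",
        # Index the sorted BAM.
        f"samtools index {q(bam)}",
        # Call a consensus sequence from the alignments and write it as FASTA.
        f"samtools consensus -f fasta -o {q(fasta)} {q(bam)}",
    ]


if __name__ == "__main__":
    for cmd in build_consensus_commands("reads.fastq.gz", "reference.fasta", "sample"):
        print(cmd)
```

If you already have a sorted BAM from minimap2, only the last command is needed. Keep in mind that a consensus called from very shallow data will be low-confidence wherever few reads cover the reference.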


Thanks! I installed Flye, but it fails with an error:

    flye --nano-corr /data/tom/CORONA/fastq/SRR11313278.fastq.gz --genome-size 30k --min-overlap 1000 --out-dir /data/tom/CORONA/test
    [2020-04-05 17:26:26] INFO: Starting Flye 2.7-b1587
    [2020-04-05 17:26:26] INFO: >>>STAGE: configure
    [2020-04-05 17:26:26] INFO: Configuring run
    [2020-04-05 17:26:26] INFO: Total read length: 48522
    [2020-04-05 17:26:26] INFO: Input genome size: 30000
    [2020-04-05 17:26:26] INFO: Estimated coverage: 1
    [2020-04-05 17:26:26] WARNING: Expected read coverage is 1, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
    [2020-04-05 17:26:26] INFO: Reads N50/N90: 1552 / 973
    [2020-04-05 17:26:26] INFO: Selected minimum overlap: 1000
    [2020-04-05 17:26:26] INFO: Selected k-mer size: 17
    [2020-04-05 17:26:26] INFO: >>>STAGE: assembly
    [2020-04-05 17:26:26] INFO: Assembling disjointigs
    [2020-04-05 17:26:26] INFO: Reading sequences
    [2020-04-05 17:26:26] INFO: Generating solid k-mer index
    [2020-04-05 17:26:46] INFO: Counting k-mers (1/2):
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    [2020-04-05 17:26:46] INFO: Counting k-mers (2/2):
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    [2020-04-05 17:26:46] INFO: Filling index table
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    [2020-04-05 17:26:46] WARNING: No overlaps found - unable to estimate parameters
    [2020-04-05 17:26:46] INFO: Extending reads
    [2020-04-05 17:26:46] WARNING: No overlaps found!
    [2020-04-05 17:26:46] INFO: Overlap-based coverage: 0
    [2020-04-05 17:26:46] INFO: Median overlap divergence: 0
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    [2020-04-05 17:26:46] INFO: Assembled 0 disjointigs
    [2020-04-05 17:26:46] INFO: Generating sequence
    [2020-04-05 17:26:46] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
    [2020-04-05 17:26:46] ERROR: Pipeline aborted

I checked with BLAST to see whether the reads overlap, and they do.


[Screenshot of the BLAST result: screen-shot-2020-04-05-at-20-31-13]


It does not look like you have much coverage (at least in that file), if the log above is to be believed:

    [2020-04-05 17:26:26] INFO: Total read length: 48522
    [2020-04-05 17:26:26] INFO: Input genome size: 30000
    [2020-04-05 17:26:26] INFO: Estimated coverage: 1

That is 48,522 bases spread over a 30 kb genome, so under 2x. You may want to go the align-and-call-consensus route if you don't have enough data for assembly.
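
If you want to check this yourself before running an assembler, a quick estimate is total sequenced bases divided by genome size. Here is a minimal sketch: `estimate_coverage` is a made-up helper name, and it assumes plain 4-line FASTQ records (no wrapped sequence lines), optionally gzipped.

```python
import gzip


def estimate_coverage(fastq_paths, genome_size):
    """Estimate coverage as total sequenced bases / genome size.

    Assumes each FASTQ record is exactly four lines (no line wrapping).
    """
    total_bases = 0
    for path in fastq_paths:
        # Pick a gzip-aware opener for .gz files, plain open otherwise.
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt") as handle:
            for i, line in enumerate(handle):
                # The sequence is the second line of every 4-line record.
                if i % 4 == 1:
                    total_bases += len(line.strip())
    return total_bases / genome_size
```

Passing all of your FASTQ files in one list gives the pooled estimate for the whole run. For the file in the log above, 48522 / 30000 comes to roughly 1.6x.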


Looks like Flye doesn't work for viruses. What tool is good for alignment? I need to get a sequence.


It looks like you don't have enough data for assembly, not that Flye does not work. Follow the link above to get a consensus sequence from the data you have already aligned with minimap2.
