Statistics analysis of assembled genome from PacBio HiFi reads using Hifiasm
29 days ago
Lu K • 0

Hi,

I have consensus (CCS) reads from a PacBio sequencing run. I assembled them with hifiasm, which produced five .gfa files. I converted the .gfa output to .fasta contigs using Bandage and an awk command.
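For readers who would rather not rely on an awk one-liner, the GFA-to-FASTA step can be sketched in Python. This is a minimal sketch, not the command used in the post: it assumes the usual GFA layout in which segment records start with `S` and carry the name and sequence in the second and third tab-separated fields. The function name `gfa_to_fasta` and the `width` parameter are hypothetical.

```python
def gfa_to_fasta(gfa_lines, width=80):
    """Yield FASTA lines for every segment (S) record in an iterable of GFA lines."""
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # skip header, link, and other record types
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # GFA uses "*" when the sequence is not stored in the file
        yield f">{name}"
        # wrap the sequence at a fixed line width
        for start in range(0, len(seq), width):
            yield seq[start:start + width]


# Small in-memory demonstration (made-up records, not real hifiasm output)
demo = ["H\tVN:Z:1.0\n", "S\tctgA\tACGTAC\tLN:i:6\n", "L\tctgA\t+\tctgB\t+\t0M\n"]
for fasta_line in gfa_to_fasta(demo, width=4):
    print(fasta_line)
```

To use it on a real file, pass an open file handle instead of the `demo` list and write the yielded lines to a .fasta file.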

What can be used to get more statistics on the assembly?

ccs gfa pacbio contigs hifiasm • 386 views

Quast and BUSCO would be my initial suggestions, but it really depends on what specific stats you're looking to compute.


Bandage gave me the N50 and total length, but I am particularly looking for the coverage and depth of the assembly.


To get that, you should map the reads back to the assembly (e.g., with minimap2 for long reads) and then use a tool like samtools or mosdepth to get the depth from the SAM/BAM file.
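As a rough illustration of the last step, here is a small Python sketch that summarizes per-contig mean depth and breadth of coverage. It assumes the reads were already mapped and sorted (for example with minimap2's `map-hifi` preset piped into `samtools sort`) and that per-base depth was written with `samtools depth -a`, which prints three tab-separated columns: contig name, 1-based position, and depth. The function name `summarize_depth` is hypothetical, and mosdepth would report similar numbers directly.

```python
from collections import defaultdict


def summarize_depth(depth_lines):
    """Return {contig: (mean_depth, breadth)} from `samtools depth -a` style rows.

    breadth is the fraction of reported positions with depth > 0.
    """
    # per contig: [number of positions, summed depth, positions with depth > 0]
    totals = defaultdict(lambda: [0, 0, 0])
    for line in depth_lines:
        if not line.strip():
            continue  # ignore blank lines
        contig, _pos, depth = line.rstrip("\n").split("\t")
        d = int(depth)
        entry = totals[contig]
        entry[0] += 1
        entry[1] += d
        entry[2] += 1 if d > 0 else 0
    return {c: (s / n, covered / n) for c, (n, s, covered) in totals.items()}


# Made-up rows for demonstration only
rows = ["ctg1\t1\t10\n", "ctg1\t2\t0\n", "ctg1\t3\t20\n", "ctg2\t1\t4\n"]
for contig, (mean_depth, breadth) in summarize_depth(rows).items():
    print(contig, round(mean_depth, 2), round(breadth, 2))
```

Without `-a`, samtools omits zero-depth positions, so both the mean and the breadth computed this way would be overestimated.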

29 days ago
wrowell ▴ 10

You can get N50/NG50 from https://github.com/lh3/calN50.
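For reference, the definition behind those two numbers can be sketched in a few lines of Python. This is not code from calN50, only an illustration of the standard definition: N50 is the length of the contig at which the cumulative length of contigs, sorted longest first, reaches half of the assembly size, and NG50 uses half of an estimated genome size instead. The function name `nx` and its parameters are hypothetical.

```python
def nx(lengths, fraction=0.5, genome_size=None):
    """Return Nx (or NGx when genome_size is given) for a list of contig lengths.

    Returns None if the contigs never add up to the target length.
    """
    reference = genome_size if genome_size is not None else sum(lengths)
    target = reference * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None  # e.g. NG50 is undefined when the assembly is under half the genome size


# Made-up contig lengths for demonstration only
contigs = [80, 70, 50, 40, 30, 20]
print("N50:", nx(contigs))
print("NG50:", nx(contigs, genome_size=400))
```

The contig lengths themselves can be read from the FASTA file or from the `LN:i:` tags in the .gfa files.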

29 days ago
gconcepcion ▴ 80

This Python 3 script will get you basic FASTA stats: https://github.com/PacificBiosciences/pb-assembly/blob/master/scripts/get_asm_stats.py


This is really great. I am getting the same output as in Bandage. However, how can I get the coverage and depth of the assembly?
