Extract aligned bases for each read from a large mapping data set
15 months ago

Hi

I have a large mapping dataset (about 170 million reads in the alignment, a SAM file). I also have the input data (fasta.gz and fastq.gz files). I want to know how to efficiently extract the following items from the SAM file using a single script. I know samtools stats reports some of them, but I would like to have one single script.

• Total bases
• Total aligned bases

Thanks for any help

Are you sure you want that in a log? That would be 340 million lines right there!

You can use Qualimap (LINK) as an option to get detailed stats for your alignments.
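
If a single script is really what is wanted, here is a minimal Python sketch that streams the SAM file once and reports only the two totals, so nothing is written per read. It rests on a few assumptions that are mine, not from the question: "total bases" is the summed SEQ length of primary records, "aligned bases" is the summed length of the CIGAR operations M, = and X (insertions and soft clips are not counted), and secondary and supplementary records are skipped so that no read is counted twice. The function names are just illustrative. Adjust these choices if your definition differs.

```python
import re
import sys

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Assumption: only these operations count as "aligned" bases.
ALIGNED_OPS = {"M", "=", "X"}


def aligned_length(cigar):
    """Sum the lengths of the M, = and X operations in a CIGAR string."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in ALIGNED_OPS)


def count_bases(lines):
    """Return (total_bases, aligned_bases) for an iterable of SAM lines."""
    total = 0
    aligned = 0
    for line in lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x900:  # skip secondary (0x100) and supplementary (0x800)
            continue
        cigar, seq = fields[5], fields[9]
        if seq != "*":
            total += len(seq)
        if not flag & 0x4:  # read is mapped
            aligned += aligned_length(cigar)
    return total, aligned


if __name__ == "__main__":
    total_bases, aligned_bases = count_bases(sys.stdin)
    print(f"Total bases\t{total_bases}")
    print(f"Total aligned bases\t{aligned_bases}")
```

Saved as, say, count_bases.py (a hypothetical name), it could be run as `python count_bases.py < alignments.sam`, or fed from `samtools view -h alignments.bam` through a pipe. It reads line by line, so memory use stays flat, but pure Python will take a while on 170 million records.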