filtering out nanopore reads by alignment score
1
0
Entering edit mode
9 months ago
resug ▴ 10

I am trying to filter out mitochondrial-specific nanopore reads from whole genome reads using minimap2 and samtools view. In addition I want to retain reads with an alignment score of 5000 or more, following a methodology suggested on a paper. However I don't know what alignment score is and how to adjust it to 5000 to achieve this goal and at what stage. I would appreciate to have your thoughts. Also, I would believe alignment score is different from mapping quality? (samtools view -q) Thanks.

So far my commands go like this:

minimap2 -ax map-ont -t 24 mito_genome.fa nanopore_reads.fastq.gz > mito.sam

samtools view -F 4 mito.sam > mito_mapped.sam

0
Entering edit mode

Minimap2 reports mapping qualities which should be no more than 60. I don't think this is the same as the alignment score you mentioned. If you want to follow the methodology suggested on a paper, you need to find out how to calculate the alignment score from that paper and filter the reads accordingly.

0
Entering edit mode

Right, mapping qualities (MAPQ) are different than alignment scores (AS). In my sam file I found that the AS values are located in the 14th field (e.g. AS:i:5228). In my case I want to retain reads with an AS of 5000 or more, I would appreciate if someone could share a script that would allow me to do this filtering based on AS. Thanks.

0
Entering edit mode

Big warning - align all reads - WGS - to the whole genome, otherwise you might be forcing reads from other locations to map to the mito, which might cause false positives. This shouldn't be too bad with long reads but can cause big issues with short reads.

0
Entering edit mode

Thanks for your observation. Do you mean mapping the reads to the nuclear genome sequence first, and then filtering out the mito reads from the unmapped reads? To note, there is no whole genome reference available for the species I am working on yet.

0
Entering edit mode
9 months ago
Qiongyi ▴ 120

Usage:

SAM_Filter_based_on_AS.pl AS_cutoff  your_input_sam_file  output_sam_file


E.g. The below script will extract all the alignment with AS >= 5000.

SAM_Filter_based_on_AS.pl 5000  in.sam  out.sam