filtering out nanopore reads by alignment score
1
0
Entering edit mode
3.4 years ago
resug ▴ 30

I am trying to filter out mitochondrial-specific nanopore reads from whole genome reads using minimap2 and samtools view. In addition I want to retain reads with an alignment score of 5000 or more, following a methodology suggested on a paper. However I don't know what alignment score is and how to adjust it to 5000 to achieve this goal and at what stage. I would appreciate to have your thoughts. Also, I would believe alignment score is different from mapping quality? (samtools view -q) Thanks.

So far my commands go like this:

minimap2 -ax map-ont -t 24 mito_genome.fa nanopore_reads.fastq.gz > mito.sam

samtools view -F 4 mito.sam > mito_mapped.sam

alignment score minimap2 filtering reads samtools • 4.2k views
ADD COMMENT
0
Entering edit mode

Minimap2 reports mapping qualities which should be no more than 60. I don't think this is the same as the alignment score you mentioned. If you want to follow the methodology suggested on a paper, you need to find out how to calculate the alignment score from that paper and filter the reads accordingly.

ADD REPLY
0
Entering edit mode

Right, mapping qualities (MAPQ) are different than alignment scores (AS). In my sam file I found that the AS values are located in the 14th field (e.g. AS:i:5228). In my case I want to retain reads with an AS of 5000 or more, I would appreciate if someone could share a script that would allow me to do this filtering based on AS. Thanks.

ADD REPLY
0
Entering edit mode

Big warning - align all reads - WGS - to the whole genome, otherwise you might be forcing reads from other locations to map to the mito, which might cause false positives. This shouldn't be too bad with long reads but can cause big issues with short reads.

ADD REPLY
0
Entering edit mode

Thanks for your observation. Do you mean mapping the reads to the nuclear genome sequence first, and then filtering out the mito reads from the unmapped reads? To note, there is no whole genome reference available for the species I am working on yet.

ADD REPLY
0
Entering edit mode

Did you find out what alignment score means? Did you run Qiongyi's script?

ADD REPLY
0
Entering edit mode
3.3 years ago
Qiongyi ▴ 180

I wrote a PERL script for your query. You can download the script from https://github.com/Qiongyi/custom_PERL_scripts

Usage:

SAM_Filter_based_on_AS.pl AS_cutoff  your_input_sam_file  output_sam_file

E.g. The below script will extract all the alignment with AS >= 5000.

SAM_Filter_based_on_AS.pl 5000  in.sam  out.sam
ADD COMMENT

Login before adding your answer.

Traffic: 1477 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6