How to filter reads only with single SNV
2
0
Entering edit mode
3.0 years ago

Hi all,

I'd like to know whether there are any tools to filter out only reads with a single SNV/mutation.

In detail, I generated a randomly mutated library of a protein-coding gene, screened non-functional variants, and then sequenced their CDS region. Unfortunately, many of the reads have multiple mutations at once so it's hard for me to find which mutation is critical for the loss-of-function. So I'd like to use reads only with a single mutation/SNV compared to the reference sequence for my variant calling.

For example,

reference:   actgggtacccattgactagataccgta
read 1:      actgggtCcccattgactagataccgta   <- read I want to get (read with a single mutation)
read 2:      actgggtacGcattgactaTataccgta  <- read I don't want to use (read with more than one mutation)

I'd appreciate it a lot if you give me any advice on this. Thank you!

filtering_reads SNV amplicon_sequencing SNP • 1.1k views
ADD COMMENT
0
Entering edit mode
3.0 years ago

filter with edit-distance NM==1

samtools view -e '[NM]==1'
ADD COMMENT
0
Entering edit mode

Hi, thank you for your answer. But I can't find any command '-e' in samtools view (http://www.htslib.org/doc/1.11/samtools-view.html)..

ADD REPLY
0
Entering edit mode

Oh, I found the command:

samtools calmd -e  input.sam reference.fa | awk '$1 ~ "^@" || $13 ~ ":1$"'  | samtools view -b - > output.sam

$13: NM

Thanks for your advice, Pierre!!

ADD REPLY
0
Entering edit mode

sunhyunchang : You need to upgrade to latest version of samtools which is 1.12 as of today. This version introduced the option that @Pierre mentions above.

ADD REPLY
0
Entering edit mode
3.0 years ago

At the read level, single base deviations from the reference are likely to be errors, not real SNPs.

ADD COMMENT

Login before adding your answer.

Traffic: 2052 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6