call variants for every read
1
0
Entering edit mode
3.2 years ago
mimakaev • 0

My goal is to align Illumina reads to the reference genome and look at variants within each read. The eventual goal is to look at co-occurrence of variants on the two side of the paired end read, in a metagenomic sample.

I found many tools that would call variants from the whole sample: iVar, bcftools+samtools, etc. However, I need variants called for every read

I also found that minimap2 can output the "--cs" tag in bam files. But that seems to be far from the proper variant calls (A25101G would be what I need).

alignment snp next-gen • 1.1k views
ADD COMMENT
0
Entering edit mode

so you want to look at each paired read and detect each SNV occuring in the overlapping interval ?

ADD REPLY
0
Entering edit mode

I want to look at each paired read and detect each SNV occurring in the read. Paired end sequencing is used just to increase a chance of a read hitting several SNPs with one read.

ADD REPLY
1
Entering edit mode
3.2 years ago

I wrote http://lindenb.github.io/jvarkit/Biostar489074.html it's a SNV caller which only use the overlapping part of the paired-end reads. It only report the number of time the ALT allele was found (but doesn't report the number of time the REF allele was found)

input must be sorted on read name using samtools sort -n or samtools collate

it will be much faster is the reads belong to one chromosome

Example

$ samtools view -O BAM --reference "ref.fasta" in.cram "chr22:41201525-41490147" |\
    samtools collate -O -u - |\
    java -jar dist/biostar489074.jar --reference "ref.fasta"
ADD COMMENT

Login before adding your answer.

Traffic: 2483 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6