Question: Variants calling by bcftools
0
gravatar for Mbillah
3 months ago by
Mbillah110
Chittagong
Mbillah110 wrote:

By this following command I want to detect SNPs and indel. Can any one check my step please? Alignment with mitochondria:

bwa mem -t 8 \
                 -R '@RG\tID:1\tLB:1\tPL:ILLUMINA\tSM:1' \
                 mt_reference.fasta data1.fq.gz data2.fq.gz > aln-mt.sam

Unmapped read:

samtools view -@ 12 -b -S -f 12 aln-mt.sam > unmapped.bam

Note: By this command, I remove mitochondria reads. If I use samtools flag 4, picard gives a mate unpaired error. So that I select samtools flag 12.

Mapped read:

samtools view -@ 12 -b -S -F 12 aln-mt.sam > mapped.bam

Note: By this command, I collect mitochondria reads. If I use samtools flag 4, picard gives a mate unpaired error. So that I select samtools flag 12.

BAM to Fastq

java -jar ~/Softwares/picard.jar SamToFastq \
                           I=unmapped.bam \
                           F=read_without_mt_1.fq \
                           F2=read_without_mt_2.fq

Alignment with Reads(Without mitochondria):

bwa mem -t 8 -R '@RG\tID:1\tLB:1\tPL:ILLUMINA\tSM:1' \
                           ref.fna \
                           read_without_mt_1.fq \
                           read_without_mt_2.fq >aln-pe.sam

Sort:

java -jar ~/Softwares/picard.jar SortSam \
                           I=aln-pe.sam \
                           O=sorted.bam \
                           SORT_ORDER=coordinate

Delete duplication:

java -jar ~/Softwares/picard.jar MarkDuplicates \
                           INPUT=sorted.bam \
                           OUTPUT=dedup.bam \
                           METRICS_FILE=dedup \
                           REMOVE_DUPLICATES=true \
                           CREATE_INDEX=true \
                           ASSUME_SORTED=true \
                           VALIDATION_STRINGENCY=LENIENT \
                           MAX_FILE_HANDLES=2000

Variants calling

bcftools mpileup -O b -f ref.fna \
                           -Q 20 -q 20 \
                           --annotate FORMAT/AD,FORMAT/DP \
                           --threads 4 dedup.bam | \
                           bcftools call -m -v -O z -o 1.vcf.gz -

Note : Now I am looking for filtering my snps and indel. please suggest the filtering parameter.

Count SNPs:

awk '! /\#/' variants.vcf | awk '{if(length($4) == 1 && length($5) == 1) print}' | wc -l

Count Indels:

awk '! /\#/' variants.vcf | awk '{if(length($4) > 1 || length($5) > 1) print}' | wc -l
snp alignment indel • 286 views
ADD COMMENTlink modified 3 months ago by ATpoint14k • written 3 months ago by Mbillah110
1

Hello Mbillah ,

could you please describe why you think that the two alignments are necessary? Why not just align to a reference that include mitochondria genome and "normal" genome?

fin swimmer

ADD REPLYlink written 3 months ago by finswimmer11k

I think sometimes mitochondrial genome increase the read depth, so that I deleted it. Thank you

ADD REPLYlink modified 3 months ago • written 3 months ago by Mbillah110
2
gravatar for ATpoint
3 months ago by
ATpoint14k
Germany
ATpoint14k wrote:

There is no guarantee that removing a chromosome from the reference will cause the reads that in reality originate from that chromosome to be unmapped. In fact, the mitochondrial genome has quiet some homologs in the nuclear genome, so you only provoke false-positives. The reason is that the aligner still tries to find acceptable matches and may therefore consider the second best match (given the best match would be chrM) as the final match, which would be a false-positive here. My opinion on that is that you should include into the reference all the contigs that are available (excluding ALT contins, see C: Inclusion of ALT contigs and decoys in bwa alignment ) in the reference assembly. You can also notably shorten your code by using pipelines, see example code here: A: Help with .sam files _1,_2 from bowtie2 output into svDetect. The file you get when using this code snipped can directly go into bcftools. Duplicates are ignores by default if flagged properly (which the command does using samblaster).

ADD COMMENTlink modified 3 months ago • written 3 months ago by ATpoint14k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1866 users visited in the last hour