Question: How to set up my vcf with a minimum depth?
1
gravatar for apl00028
8 months ago by
apl0002870
apl0002870 wrote:

I am trying to set up the minimum depth which samtools mpileup considers a polymorphism for RNAseq samples. For that I did this:

I did an align with BWA:

bwa mem scaffold_filt_PV014_bwa PV009_1_cmv3.fq_pairs_R1.fastq PV009_2_cmv3.fq_pairs_R2.fastq > scaffold_filt_OD_PV009.sam

I change sam format to bam format and I sorted it:

samtools view -bS scaffold_filt_OD_PV009.sam | samtools sort - > scaffold_filt_OD_PV009.bam

I index the reference:

samtools faidx scaffold_filt_PV014_index.fa

I defined the snp with bcftool:

bcftools mpileup --redo-BAQ --min-BQ 30  --per-sample-mF --annotate FORMAT/AD,FORMAT/ADF,FORMAT/ADR,FORMAT/DP,FORMAT/SP,INFO/AD,INFO/ADF,INFO/ADR -f scaffold_filt_PV014_index.fa scaffold_filt_OD_PV009.bam | bcftools call --multiallelic-caller --variants-only -Ov > bcf/scaffold_filt_OD_PV009_q10_variant.vcf

I tried to select those polymorphisms with a depth of 20 or more:

cat scaffold_filt_OD_PV009_q10_variant.vcf  | grep -v "DP>=20" > scaffold_filt_OD_PV009_q10_variant_filter.vcf

But this last step has not any effect. So, What is my mistake? How can I solve it? What other alternatives can I do? Thanks in advantage.

snp • 370 views
ADD COMMENTlink modified 8 months ago • written 8 months ago by apl0002870
1

bcftools has various filtering options, please check the manual towards filtering. I strongly recommend to NEVER do any grep or similar things with a VCF, always use dedicated tools for VCF/BCF manipulation, as you might corrupt your file. See for inspiration How to filter vcf file on minimum genotype depth and quality for each sample

ADD REPLYlink written 8 months ago by ATpoint31k

I am trying it using this function of that link: vcftools --vcf scaffold_filt_OD_PV009_q10_variant.vcf --out output_prefix --min-meanDP 20 --recode --recode-INFO-all

but I get SNP with lower depth of 20 and filter some SNP with a depth higher than 20.

When I tried use this function: vcftools --vcf scaffold_filt_OD_PV009_q10_variant.vcf --out output_prefix --minDP 20 --recode --recode-INFO-all

It did not work.

What am I doing wrong? Thanks in advantage

ADD REPLYlink modified 8 months ago • written 8 months ago by apl0002870
1

For your information, you can avoid intermediate files and save time by piping the output of bwa mem directly to samtools and create a sorted bam file:

bwa mem genome.fa reads.fastq.gz | samtools sort -o sorted_alignment.bam

This requires that your samtools is quite recent, but you should always try to use up-to-date software.

ADD REPLYlink written 8 months ago by WouterDeCoster43k
1
gravatar for apl00028
8 months ago by
apl0002870
apl0002870 wrote:

Finaly I got my aim using this function:

 bcftools view --types snps --include 'INFO/DP>=20' scaffold_filt_OD_PV009_q10_variant.vcf > scaffold_filt_OD_PV009_q10_variant_filtered.vcf
ADD COMMENTlink modified 8 months ago by ATpoint31k • written 8 months ago by apl0002870
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1654 users visited in the last hour