How to efficiently count missense mutations from an annotated vcf file?
5 months ago

Hi! I am currently working on my undergraduate study on the frequency of missense mutations in early and advanced stages of early luminal breast cancer. The VCF file contains 47 transcriptomic samples (12 early, stage II, and 35 advanced, stage III) and was annotated using SnpEff eff on Galaxy. Is there a tool I can use to efficiently count the mutations by sample and by position per chromosome? I would also appreciate any suggestions for downstream or enrichment analysis for my study.

EDIT: I am more than willing to walk you through the RNA-Seq pipeline I am using on Galaxy. Any assistance or suggestions are much appreciated, as I am admittedly new to bioinformatics.

transcriptome Galaxy breast-cancer • 350 views
5 months ago

"Is there a tool I can use to efficiently count the mutations by sample and by position per chromosome?"

bcftools stats

But your title says "How to efficiently count missense mutations from an annotated vcf file?", so per sample that would be something like:

bcftools query -l in.vcf | while read -r S; do echo -n "${S}: " && bcftools view -Ou --trim-alt-alleles --samples "${S}" in.vcf | bcftools view -H -c1 | grep -F 'missense' | wc -l ; done
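
If bcftools is not at hand, or you also want a per-chromosome breakdown, the same idea can be sketched with plain awk. This is only a rough sketch under a few assumptions: the VCF is uncompressed, the SnpEff annotation sits in the INFO column (either the ANN style "missense_variant" or the older EFF style "MISSENSE", hence the case-insensitive match), and GT is the first FORMAT key. The function names count_missense_by_sample and count_missense_by_chrom are made up for illustration. Unlike the bcftools loop above, it does not trim ALT alleles, so at multi-allelic sites a sample can be counted for a missense annotation that belongs to a different ALT allele.

```shell
#!/usr/bin/env bash
# Sketch: count missense-annotated VCF records per sample and per chromosome.
# Assumptions: uncompressed VCF, SnpEff annotation in INFO (column 8),
# GT is the first FORMAT key. Function names are hypothetical.

# Prints "sample<TAB>count": records annotated as missense in which the
# sample carries at least one non-reference allele.
count_missense_by_sample() {
  awk -F'\t' '
    /^##/ { next }                        # skip meta-information lines
    /^#CHROM/ {                           # column header: samples start at column 10
      for (i = 10; i <= NF; i++) name[i] = $i
      n = NF
      next
    }
    tolower($8) ~ /missense/ {            # matches missense_variant and MISSENSE
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")                 # f[1] is the genotype, e.g. 0/1
        if (f[1] ~ /[1-9]/) count[i]++    # any non-reference allele index
      }
    }
    END { for (i = 10; i <= n; i++) printf "%s\t%d\n", name[i], count[i] }
  ' "$1"
}

# Prints "chromosome<TAB>count": missense-annotated records per chromosome,
# regardless of which samples carry them.
count_missense_by_chrom() {
  awk -F'\t' '
    /^#/ { next }                         # skip all header lines
    tolower($8) ~ /missense/ { count[$1]++ }
    END { for (c in count) printf "%s\t%d\n", c, count[c] }
  ' "$1" | sort
}
```

Usage would be something like "count_missense_by_sample in.vcf" and "count_missense_by_chrom in.vcf" after sourcing the functions. A record annotated for several transcripts is still counted once, because the match is per VCF line.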