Annotate VCF with Pathways (without performing GSEA)
10 weeks ago
gernophil ▴ 10

Hey everyone,

I ran variant calling and some downstream analysis on the resulting variants. I found some with significant clinical effects, and now I want to annotate these (or all the others too) with the pathways they are involved in. If I perform a GSEA (e.g. with gsePathway), I get an overview of the pathways that are affected by these variants. However, I just want my variants annotated with the pathway, so I can check my VCF and see which pathway (or other biological function) a specific variant is involved in without looking it up manually in a database. Is that possible?

Best,

pathway_analysis VCF
10 weeks ago

extract the genes:

unzip -p saved.zip ReactomePathways.gmt | awk -F '\t' '{for(i=2;i<=NF;i++) {printf("%s\t%s\n",$i,$1);}}' | sort -t $'\t' -k1,1


(or maybe another file with a gene-id <-> pathway mapping; I'm not a Reactome specialist)
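To make the reshaping step concrete, here is a minimal sketch on a toy file. The function name `gmt_to_pairs`, the file `toy.gmt`, and the pathway and gene names in it are made up for illustration. The sketch starts at column 3 on the assumption that column 2 holds a description or ID, as in the usual GMT convention; if your file lists genes from column 2 onward, change the 3 to a 2.

```shell
# Hypothetical helper: flatten a GMT file into "gene<TAB>pathway" rows.
# Assumption: column 1 = pathway name, column 2 = description/ID, columns 3+ = genes.
gmt_to_pairs() {
  awk -F '\t' '{ for (i = 3; i <= NF; i++) printf("%s\t%s\n", $i, $1) }' "$1" |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2
}

# Toy input with made-up pathway names, IDs and genes (illustration only).
printf 'Pathway A\tID-1\tGENE1\tGENE2\nPathway B\tID-2\tGENE1\n' > toy.gmt

gmt_to_pairs toy.gmt
# GENE1	Pathway A
# GENE1	Pathway B
# GENE2	Pathway A
```

Each gene now appears once per pathway it belongs to, which is the shape needed for joining against gene positions.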

join the gene names with their positions (see "How to convert a list of genes to BED file?")

sort + bgzip + tabix the resulting bed file

annotate your vcf with bcftools annotate and the tabix-ed bed file.
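The last three steps could be sketched roughly as follows. This is only an outline under several assumptions: the gene BED has the gene name in column 4, the gene-to-pathway table is the two-column output of the first step, and the function names, file names and the `PATHWAY` tag are hypothetical. Pathway names are sanitised because characters such as spaces, `;`, `=` and `,` can clash with INFO field syntax.

```shell
# Hypothetical helper: build a BED with one pathway string per gene interval.
# $1 = gene BED (chrom, start, end, gene name), $2 = "gene<TAB>pathway" table.
build_pathway_bed() {
  awk -F '\t' -v OFS='\t' '
    NR == FNR {                              # first file: gene -> pathway table
      name = $2
      gsub(/[^A-Za-z0-9._-]+/, "_", name)    # keep only characters safe in an INFO value
      if ($1 in paths) paths[$1] = paths[$1] "|" name
      else             paths[$1] = name
      next
    }
    ($4 in paths) { print $1, $2, $3, paths[$4] }   # second file: gene BED
  ' "$2" "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Hypothetical wrapper: compress, index, and annotate.
# $1 = pathway BED, $2 = input VCF (bgzipped), $3 = output VCF (bgzipped).
annotate_vcf() {
  bgzip -c "$1" > "$1.gz"
  tabix -f -p bed "$1.gz"
  printf '##INFO=<ID=PATHWAY,Number=.,Type=String,Description="Pathways of the overlapping gene">\n' > pathway.hdr
  bcftools annotate -a "$1.gz" -h pathway.hdr -c CHROM,FROM,TO,PATHWAY -O z -o "$3" "$2"
}

# Toy data with made-up genes and coordinates (illustration only).
printf 'GENE1\tPathway A\nGENE1\tPathway B\nGENE2\tPathway A\n' > toy_gene2pathway.tsv
printf 'chr2\t50\t150\tGENE2\nchr1\t100\t200\tGENE1\n' > toy_genes.bed

build_pathway_bed toy_genes.bed toy_gene2pathway.tsv > toy_pathways.bed
cat toy_pathways.bed
# chr1	100	200	Pathway_A|Pathway_B
# chr2	50	150	Pathway_A
```

With real data, something like `annotate_vcf pathways.bed input.vcf.gz annotated.vcf.gz` would then add the tag to every variant falling inside a gene interval. If a variant overlaps several genes, the `--merge-logic` option of `bcftools annotate` may be worth a look.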

Wow, thank you. I wasn't aware of that file; that really helps a lot. Do I understand the file correctly that the first column is always the pathway and the following ones are the involved genes? That's exactly what I need :). (I just would have chosen a different structure.)

10 weeks ago
gernophil ▴ 10

Do you happen to know whether such a file also exists for KEGG?