Question

Reproducing this picture

5.2 years ago

zizigolu ★ 4.3k

Hi,

I have 21 .vcf files from tumour vs. normal samples. I annotated each of them individually with the Ensembl Variant Effect Predictor (VEP) and snpEff, so I now have 21 .txt result files from each tool. For example, this is the VEP result for one .vcf:

http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Ticket?tl=2H9vfkjI8DGQReZk

I don't know how people produce this picture from their results

enter image description here

I found this post

How to create a mutation landscape (waterfall) plot with GenVisR

But the input used in that tutorial is not in the same format as my VEP or snpEff output.

Any suggestions, please?

genome snp R sequencing • 2.3k views

ADD COMMENT • link updated 5.2 years ago by Chris Miller 22k • written 5.2 years ago by zizigolu ★ 4.3k

score 3 · Answer 1 · 2019-02-07

5.2 years ago

Chris Miller 22k

GenVisR is a good R package for this, as is ProteinPaint from St Jude: https://pecan.stjude.cloud/proteinpaint

ADD COMMENT • link 5.2 years ago by Chris Miller 22k

score 2 · Answer 2 · 2019-02-07

Did you check the help for waterfall or lolliplot? I haven't used those packages, but it seems to me you will need the genomic coordinates of both the gene and the mutations. You will likely have to do some data wrangling to parse the mutation locations in the file you linked into three distinct columns that R can work with (note how the file from the tutorial has separate columns for chromosome, start and stop). All the details should hopefully be explained in the help of both functions; if not, please be more specific about which details you don't understand.
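As a rough illustration of that wrangling step, here is a minimal Python sketch (Python rather than R, purely for illustration) that splits a combined location string into the three columns. It assumes the location is written as `chrom:pos` or `chrom:start-end`, which is how the VEP Location column usually looks. The function name `split_location` and the example coordinates are made up for this example.

```python
def split_location(location):
    """Split 'chrom:pos' or 'chrom:start-end' into (chrom, start, end)."""
    # Split on the last colon so the coordinates are always the final part.
    chrom, _, span = location.rpartition(":")
    if not chrom:
        raise ValueError(f"no chromosome found in {location!r}")
    start_text, _, end_text = span.partition("-")
    start = int(start_text)
    # A single position means start and end are the same base.
    end = int(end_text) if end_text else start
    return chrom, start, end


# Hypothetical example values, not taken from the linked results.
print(split_location("7:140453136"))
print(split_location("17:7577120-7577125"))
```

The resulting tuples can be written out as tab-separated chromosome, start and stop columns and then read into R.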

score 2 · Answer 3 · 2019-02-07

5.2 years ago

Ram 43k

As Friederike says, these are indeed called lollipop plots. You can search online for various tools that plot them, such as pbnjay's mutsneedle, cBioPortal's MutationMapper, etc., but each has its own limitations.

A search for "mutation plot" on Google Images picks up this Biostars post: How To Create Mutation Diagram In R Or In Any Tools?

ADD COMMENT • link 5.2 years ago by Ram 43k

Sorry, neither the VEP nor the snpEff results have a reference allele column.

Most of the visualization tools you kindly suggested seem to need these columns:

Chromosome  Start_Position  End_Position    Reference_Allele    Variant_Allele

In the VEP results, the start and end positions are even merged into a single field.

I am not sure how to deal with this incompatibility between input formats.

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

The VCF file that you used as input for VEP should have that information.
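To make that concrete, below is a small Python sketch, not a tested pipeline, that pulls the five columns asked about above out of the fixed VCF fields (CHROM, POS, REF, ALT). It assumes an uncompressed, tab-separated VCF, emits one row per ALT allele, and computes the end position as POS + len(REF) - 1. The function name `vcf_to_rows` and the two example records are invented for illustration. Some MAF-based tools expect indels with the shared leading base trimmed; that normalisation is not done here.

```python
COLUMNS = ("Chromosome", "Start_Position", "End_Position",
           "Reference_Allele", "Variant_Allele")


def vcf_to_rows(lines):
    """Yield one (chrom, start, end, ref, alt) tuple per ALT allele."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        start = int(pos)
        end = start + len(ref) - 1  # REF covers POS .. POS + len(REF) - 1
        for alt in alts.split(","):
            yield (chrom, start, end, ref, alt)


# Two made-up records: a SNV and a 2-bp deletion.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t.\tPASS\t.",
    "2\t2000\t.\tCTG\tC\t.\tPASS\t.",
]

print("\t".join(COLUMNS))
for row in vcf_to_rows(example_vcf):
    print("\t".join(str(value) for value in row))
```

For a real file you would pass an open file handle instead of the example list, e.g. `with open(path) as handle: rows = list(vcf_to_rows(handle))`; a gzipped VCF would need `gzip.open(path, "rt")`.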

ADD REPLY • link 5.2 years ago by Friederike 8.9k

Sorry for this silly question.

Suppose I have called somatic indels from cancer vs. normal samples and have the resulting .vcf files: what is the next step? I have googled a lot but I am only getting more confused. If my question is which mutations underlie this type of cancer, what would these VCF files tell me? I have seen people use MutSigCV, but I don't know why. VEP and snpEff already do a good job of feature selection, so why do people use MutSigCV?

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

VEP and snpEff annotate variants. MutSigCV, AFAIK, picks significantly mutated genes, a totally different task. Also, if you're using GDC's MAF format, they refer to Alt alleles differently, Tumor_Seq_Allele1 and Tumor_Seq_Allele2. Ref allele is still called Reference_Allele though, so that should not be the problem.

VCF files WILL have the ref allele column; there can be no VCF file that does not have REF.
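This is easy to confirm on a given file. A minimal Python sketch, assuming an uncompressed VCF whose header line starts with `#CHROM` (the helper name `has_ref_column` is hypothetical):

```python
# The eight fixed columns that a VCF header line starts with, per the VCF spec.
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def has_ref_column(lines):
    """Return True if the #CHROM header line begins with the fixed VCF columns."""
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines come before the header
        if line.startswith("#CHROM"):
            header = line.rstrip("\r\n").split("\t")
            return header[:len(FIXED_COLUMNS)] == FIXED_COLUMNS
        break  # reached something else without seeing a header line
    return False
```

Called as `has_ref_column(open(path))`, it only reads as far as the header line, so it is cheap even on large files.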

EDIT: This offshoot is not related to the original post (which deals with reproducing a lollipop plot), please search the forum for discussions related to VCF annotation/MutSigCV.

ADD REPLY • link 5.2 years ago by Ram 43k

Thank you. I started wondering why I am trying to produce a lollipop plot and what my eventual goal is; that is why I got concerned.

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

Sorry, could I use dNdScv instead of MutSigCV for finding significant variants?

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

I'm sorry, I don't know. MutSigCV is as far as my exposure goes.

ADD REPLY • link 5.2 years ago by Ram 43k