Question: Is there a better tool for visualizing your variants after annotation than Excel?
6 months ago by
Tania120
Tania120 wrote:

Hi everyone

Is there a better tool than Excel for viewing your variants after annotation with annovar or vep, ideally one that also works with git for version control? What do you use? Excel?

Thanks

variants annotations • 980 views
ADD COMMENTlink modified 6 months ago • written 6 months ago by Tania120

Define "visualizing" :)

As plain text/graphically ..

ADD REPLYlink modified 6 months ago • written 6 months ago by genomax55k

I mean having a tabular view of my annotations so I can easily go through them and see what is interesting, the same way we do in Excel. I don't like Excel for many reasons, though, so I would like to know what people use and whether there is anything better.

ADD REPLYlink written 6 months ago by Tania120

You can try using a programmer's editor, provided you are not saving your files in xlsx/xls format in the first place. Atom (PC/macOS), Notepad++ (PC) and BBEdit (macOS) are some of the options. Atom integrates with Git.

ADD REPLYlink modified 6 months ago • written 6 months ago by genomax55k

If you are trying to use git or version control, consider saving your variant table in .tsv, .csv, .txt, or another plain-text format instead of .xlsx. In theory this allows better tracking of changes. In practice it often becomes a moot point, because when the table changes every line tends to get shifted, so git considers the entire file 'changed'.
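
A minimal sketch of one way to cut down on those whole-file diffs, assuming the churn comes from rows being written in a different order on each run. The function name sort_tsv_body and the three-column toy table are made up for illustration; the idea is just to keep the header line in place and sort the rows by chromosome and position before committing.

```shell
# Hypothetical helper: print a TSV with its header kept first and the
# remaining rows sorted by column 1 (text) and column 2 (numeric).
sort_tsv_body() {
    {
        IFS= read -r header              # consume the header line
        printf '%s\n' "$header"
        # sort the rest of the file; tab is the field separator
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
    } < "$1"
}

# Toy input, not real data
tmp="$(mktemp)"
printf 'CHROM\tPOS\tGENE\n2\t500\tB\n1\t2000\tA\n1\t300\tC\n' > "$tmp"
sort_tsv_body "$tmp"
rm -f "$tmp"
```

With a stable row order, a commit that adds or drops a few variants should show up as a few changed lines rather than a rewritten file. Changes to the columns themselves will still touch every line.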

ADD REPLYlink written 6 months ago by steve1.7k

Thanks Steve. That would help, but I am looking for better tools than excel itself.

ADD REPLYlink written 6 months ago by Tania120

Excel is horrible, it will silently mutate your gene names. A good write-up here
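
As a rough illustration of the problem, here is a sketch that flags gene symbols shaped like a month name followed by a day number (SEPT2, MARCH1, DEC1 and the like), which are the ones Excel tends to turn into dates. The function name and the regular expression are my own heuristic, not an exhaustive list of what Excel will mangle, and the symbols in the demo are just examples.

```shell
# Hypothetical helper: read one gene symbol per line on stdin and print
# those that look like "month + day number". Heuristic only.
flag_date_like_symbols() {
    grep -E -i '^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)[0-9]{1,2}$'
}

printf '%s\n' DHODH SEPT2 MARCH1 TP53 DEC1 | flag_date_like_symbols
```

Running the symbol column of a table through a check like this before it goes anywhere near a spreadsheet shows which names are at risk.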

ADD REPLYlink written 6 months ago by jasonross1010

Amazing, thank you all :)

ADD REPLYlink written 6 months ago by Tania120
6 months ago by
dariober9.4k
Glasgow - UK
dariober9.4k wrote:

First, between annovar and vep I strongly prefer vep with vcf output format. Annovar's output looks good but I found its format to be quite inconsistent. Different fields have different information depending on the region being annotated and parsing it is quite messy (I don't have an example at hand now but I can find some).

Vep's vcf output, by contrast, looks ugly at first impression but is very consistent and simple to parse into something human readable. The annotation string in the CSQ tag is just a table with rows separated by , (I think) and columns separated by |, so if you are a bit familiar with command line tools like sed, bcftools etc. it's easy to turn it into a simple table.
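
A minimal sketch of that idea, using a made-up CSQ value with four fields (Allele|Gene|Feature|Consequence). The function name csq_to_table is hypothetical, and in a real file the field order should be read from the CSQ description in the VCF header rather than assumed.

```shell
# Made-up CSQ value with two transcript annotations, for illustration only
csq='T|ENSG00000102967|ENST00000219240|missense_variant,T|ENSG00000102967|ENST00000571392|intron_variant'

# Hypothetical helper: one output row per annotation, tab-separated columns
csq_to_table() {
    # commas separate the annotation rows, pipes separate the columns
    printf '%s\n' "$1" | tr ',' '\n' | tr '|' '\t'
}

csq_to_table "$csq"
```

The result is plain tab-separated text, so it can be piped on to cut, grep or column -t, or saved as a .tsv.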

Having said that, at the cost of seeing my reputation dropping to -Inf, I think Excel is not too bad for eye-balling a table, provided your input is not too big. In fact, you could open the vcf file straightaway and with a couple of passes of Data -> Text to Columns... you could get something readable.

For quick, geeky searches through a possibly big but indexed vcf file, the tool I have written, ASCIIGenome, has a print command. It gives reasonably readable output and has options to parse the printed output, together with search and filter functions like find, grep, awk.

ADD COMMENTlink written 6 months ago by dariober9.4k
6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum112k wrote:

I wrote vcf2table: it displays the VEP + SNPEFF(ANN=) annotations in a table. http://lindenb.github.io/jvarkit/VcfToTable.html

  VEP


 +--------------------------+------+----------------+------------+-----------------+--------+------------------+-----------------------------------------------+-------------+---------+-----------------+----------------------+
 | PolyPhen                 | EXON | SIFT           | ALLELE_NUM | Gene            | SYMBOL | Protein_position | Consequence                                   | Amino_acids | Codons  | Feature         | BIOTYPE              |
 +--------------------------+------+----------------+------------+-----------------+--------+------------------+-----------------------------------------------+-------------+---------+-----------------+----------------------+
 | probably_damaging(0.956) | 8/9  | deleterious(0) | 1          | ENSG00000102967 | DHODH  | 346/395          | missense_variant                              | R/W         | Cgg/Tgg | ENST00000219240 | protein_coding       |
 |                          | 3/4  |                | 1          | ENSG00000102967 | DHODH  |                  | non_coding_exon_variant&nc_transcript_variant |             |         | ENST00000571392 | retained_intron      |
 |                          |      |                | 1          | ENSG00000102967 | DHODH  |                  | downstream_gene_variant                       |             |         | ENST00000572003 | retained_intron      |
 |                          |      |                | 1          | ENSG00000102967 | DHODH  |                  | downstream_gene_variant                       |             |         | ENST00000573843 | retained_intron      |
 |                          |      |                | 1          | ENSG00000102967 | DHODH  |                  | downstream_gene_variant                       |             |         | ENST00000573922 | processed_transcript |
 |                          |      |                | 1          | ENSG00000102967 | DHODH  | -/193            | intron_variant                                |             |         | ENST00000574309 | protein_coding       |
 | probably_damaging(0.946) | 8/9  | deleterious(0) | 1          | ENSG00000102967 | DHODH  | 344/393          | missense_variant                              | R/W         | Cgg/Tgg | ENST00000572887 | protein_coding       |
 +--------------------------+------+----------------+------------+-----------------+--------+------------------+-----------------------------------------------+-------------+---------+-----------------+----------------------+

One can filter the data upstream with snpsift or vcffilterso: http://lindenb.github.io/jvarkit/VcfFilterSequenceOntology.html

ADD COMMENTlink modified 6 months ago • written 6 months ago by Pierre Lindenbaum112k
6 months ago by
Boris Shilov10
Boris Shilov10 wrote:

If you use R, Rcommander works pretty well as a simple spreadsheet viewer, and integrates readily into your R workflow.

ADD COMMENTlink written 6 months ago by Boris Shilov10
6 months ago by
dariober9.4k
Glasgow - UK
dariober9.4k wrote:

In practice, something like this usually works quite ok for me:

curl -s http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/functional_annotation/filtered/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.annotation.vcf.gz \
| bcftools query -f '%CHROM %POS %CSQ\n' \
| tr '|' ' ' \
| cut -d' ' -f 1-4,7 \
| grep 'UTR' \
| head \
| column -t

1  135000  A  ENSG00000237683  3_prime_UTR_variant
1  135030  G  ENSG00000237683  3_prime_UTR_variant
1  135031  T  ENSG00000237683  3_prime_UTR_variant
1  135094  T  ENSG00000237683  3_prime_UTR_variant
1  135095  T  ENSG00000237683  3_prime_UTR_variant
1  135135  T  ENSG00000237683  3_prime_UTR_variant
1  135151  A  ENSG00000237683  3_prime_UTR_variant
1  135162  A  ENSG00000237683  3_prime_UTR_variant
1  135163  T  ENSG00000237683  3_prime_UTR_variant
1  135173  A  ENSG00000237683  3_prime_UTR_variant

It may look horrible, but it can go a long way when you want to eyeball a file and look for something interesting...

ADD COMMENTlink written 6 months ago by dariober9.4k
6 months ago by
colindaven790
Hannover Medical School
colindaven790 wrote:

I pass them on to users in

a) JBrowse (as VCF, bgzipped and tabixed) to provide data integration with other information and genomic context and

b) as a table (usually VCF in libreoffice, XLSX and/or Galaxy)

c) but I also sometimes use vcftotabular to create nice but large TSVs from the VCF

Passing 8 million SNVs on isn't that clever, so I teach users to filter variants in Galaxy. I don't use git for version control of variants.

Certainly other visualization tools would be of interest.

ADD COMMENTlink written 6 months ago by colindaven790