Best visual and interactive genotype matrix (VCF) exploration tool
7.8 years ago
William ★ 5.3k

What is currently the best visual and interactive genotype matrix exploration tool for large genotype matrices, say the 1000 Genomes Project VCF?

So 100M+ variants, 1000+ samples, and a raw uncompressed VCF file size of 1 TB+.

One requirement is that it should do all kinds of filtering that bcftools (view) does:

http://www.htslib.org/doc/bcftools.html

But bcftools does not meet the interactive and visual requirements. bcftools is only interactive for small VCF files, or when you use the tabix index to look up a small region.
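For reference, the kind of filter meant here is something like `bcftools view -r chr1:1000-2000 -i 'QUAL>30' calls.vcf.gz`, i.e. a region lookup combined with an include expression. Below is a minimal pure-Python sketch of the same idea on plain VCF text. The function name `filter_vcf`, the file name above, the example records and the thresholds are all made up for illustration, and unlike an indexed lookup this sketch reads every line, which is exactly why it would not scale to the full file.

```python
import io

def filter_vcf(lines, chrom, start, end, min_qual):
    """Yield (chrom, pos, genotypes) for records in chrom:start-end with QUAL > min_qual."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information lines and the column header
        fields = line.rstrip("\n").split("\t")
        rec_chrom, pos, qual = fields[0], int(fields[1]), fields[5]
        if rec_chrom != chrom or not (start <= pos <= end):
            continue
        if qual == "." or float(qual) <= min_qual:
            continue  # missing or too-low QUAL
        # Columns 10+ hold per-sample data; GT is the first FORMAT key when present
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        yield rec_chrom, pos, genotypes

# Tiny made-up VCF with two samples (hypothetical data)
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "chr1\t1500\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n"
    "chr1\t1800\t.\tC\tT\t10\tPASS\t.\tGT\t0/0\t0/1\n"
    "chr1\t9000\t.\tG\tA\t99\tPASS\t.\tGT\t0/1\t0/1\n"
)
kept = list(filter_vcf(example_vcf, "chr1", 1000, 2000, 30))
print(kept)
```

Only the record at position 1500 survives: the one at 1800 fails the quality threshold and the one at 9000 lies outside the region.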

Another requirement is that the filtering is visual and interactive, as it is, for example, with a small genotype matrix in Excel. (I know, bad idea, but at least Excel is interactive, visual and biologist-friendly.)

By interactive I mean that a filter criterion can be adjusted and you get back your updated genotype matrix in real time. Even for complex queries where the full 100M+ variants for all 1000+ samples have to be scanned, the tool should stay interactive.
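To put a rough number on a full scan, here is a back-of-envelope sketch. It assumes biallelic diploid genotypes packed into 2 bits each and a sustained scan speed of 10 GB/s; both figures are assumptions for illustration, not measurements, and the function names are made up.

```python
def packed_matrix_bytes(n_variants, n_samples, bits_per_genotype=2):
    """Size in bytes of a genotype matrix stored with a fixed number of bits per call."""
    total_bits = n_variants * n_samples * bits_per_genotype
    return total_bits // 8

def full_scan_seconds(n_bytes, bytes_per_second):
    """Time to read every byte once at the given throughput."""
    return n_bytes / bytes_per_second

# 100M variants x 1000 samples, as in the question
n_bytes = packed_matrix_bytes(100_000_000, 1000)
print(n_bytes / 1e9, "GB packed")
print(full_scan_seconds(n_bytes, 10e9), "s per full scan at the assumed 10 GB/s")
```

Under these assumptions the packed matrix is about 25 GB and one pass over it takes about 2.5 seconds, so a query that touches every genotype is at the edge of what feels interactive even before any parsing or decompression cost.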

Does something like this already exist? If so which tools? If not why not?

vcf bcftools interactive visual • 3.6k views
7.8 years ago

People in my lab use KNIME to open/filter the VCFs: https://www.knime.org/

I'm currently playing with the following Java-based tool: it displays 'part' of an indexed VCF file and you can filter the data using a JavaScript-based expression. You can try it as a Java Web Start application at: http://redonlab.univ-nantes.fr/public_html/jnlp/jfxngs/ (requires Java 8/Java Web Start)


I tried installing on Ubuntu 16.04 and got home/wouter/bin/jvarkit/src/main/java/com/github/lindenb/jvarkit/tools/vcfviewgui/VcfStage.java:74: error: package javafx.beans.property does not exist, followed by many similar errors, each pointing to a different package that doesn't exist.

Before that error the following was printed:

javac -version  
javac 1.8.0_121  
#compile  
javac -d /home/wouter/bin/jvarkit/_tmp-2.6.1 -g -classpath "lib/com/github/samtools/htsjdk/2.6.1/htsjdk-2.6.1.jar:lib/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar:lib/gov/nih/nlm/ncbi/ngs-java/1.2.4/ngs-java-1.2.4.jar:lib/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:lib/org/apache/commons/commons-jexl/2.1.1/commons-jexl-2.1.1.jar:lib/org/tukaani/xz/1.5/xz-1.5.jar:lib/org/xerial/snappy/snappy-java/1.0.3-rc3/snappy-java-1.0.3-rc3.jar:lib/commons-cli/commons-cli/1.3.1/commons-cli-1.3.1.jar:lib/org/slf4j/slf4j-api/1.7.13/slf4j-api-1.7.13.jar:lib/org/slf4j/slf4j-simple/1.7.13/slf4j-simple-1.7.13.jar" -sourcepath /home/wouter/bin/jvarkit/src/main/java:/home/wouter/bin/jvarkit/src/main/generated-sources/java /home/wouter/bin/jvarkit/src/main/generated-sources/java/com/github/lindenb/jvarkit/util/htsjdk/HtsjdkVersion.java /home/wouter/bin/jvarkit/src/main/java/com/github/lindenb/jvarkit/tools/vcfviewgui/JfxNgs.java

I'm not at all familiar with java-stuff so if you could point me in the right direction, that would be great :p


@WouterDeCoster don't use OpenJDK (incomplete) but the official Oracle Java: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html


Right, that did the trick ;-)
I'm going to play a bit with it, for sure looks great. Thanks!


Thanks, I'm still working on it, I'll be happy to get any feedback :-)


I was a bit confused about what the 'main screen' "set location of all frames to" option would do, having gone through the manual too quickly. In hindsight it's clear, but I wouldn't have wondered about it if the text had been something like "change genomic location of all frames to", maybe with an example "chr17:32232-32932" already entered. Or a gene name; it's not immediately obvious which input is expected.
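To illustrate the ambiguity between a genomic interval and a gene name, here is a small sketch of how such an input field could tell the two apart. This is a hypothetical helper, not how the tool itself parses its input; `parse_region` and the gene-name example are made up for illustration.

```python
import re

# contig name, a colon, then start-end as plain integers
REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")

def parse_region(text):
    """Return (contig, start, end) for 'contig:start-end', or None if the text is not a region."""
    match = REGION_RE.match(text.strip())
    if match is None:
        return None  # e.g. a gene name, which would need a separate lookup
    contig = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        return None  # reject reversed intervals
    return contig, start, end

print(parse_region("chr17:32232-32932"))
print(parse_region("BRCA1"))
```

A field that accepts only the first form could say so in its label or placeholder text, which is the kind of hint the comment above asks for.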

7.8 years ago
willgilks ▴ 360

I've found the Integrative Genomics Viewer (IGV) pretty good: http://software.broadinstitute.org/software/igv/. It displays VCF files, with variants coloured by genotype, and has good zoom and scroll functions, though it is limited for very big files. You can also view BAM and GWAS data with it.

