Question: Software for plotting gene presence absence matrix
natasha (100) wrote 3.2 years ago:

Hi

I have a gene presence-absence matrix of accessory genes for multiple strains of bacteria. I would like to produce a good figure with this data, and maybe include a tree as well. I was wondering what the best software is for doing this? As I have >4000 genes, I don't want each gene to be labelled!

Thanks

matrix python accessory genes R • 2.0k views
modified 3.2 years ago by Philipp Bayer (6.5k) • written 3.2 years ago by natasha (100)

How many strains do you have?

written 3.2 years ago by Nicolas Rosewick (8.3k)

50 strains. I made a mistake: I have 1800 genes to plot, not 4000.

written 3.2 years ago by natasha (100)
Nicolas Rosewick (8.3k; Belgium, Brussels) wrote 3.2 years ago:

Maybe try a PCA on your data. 4000 observations is quite a lot to plot in one figure.

Or you could do both: (a) a PCA and (b) a heatmap.

modified 3.2 years ago • written 3.2 years ago by Nicolas Rosewick (8.3k)
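
To make the PCA suggestion concrete, here is a minimal sketch in Python with NumPy. It is not taken from the thread: the matrix is random toy data standing in for the real table (50 strains by 1800 accessory genes, 1 = present, 0 = absent), and the variable names are only illustrative.

```python
import numpy as np

# Toy stand-in for the real data: 50 strains (rows) x 1800 accessory genes
# (columns), 1 = gene present, 0 = gene absent. The values here are random.
rng = np.random.default_rng(0)
presence = rng.integers(0, 2, size=(50, 1800))

# Genes present in every strain, or in none, carry no signal for a PCA.
variable = presence.std(axis=0) > 0
X = presence[:, variable].astype(float)

# Centre each gene column, then obtain the principal components from an SVD.
X -= X.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)

scores = U * S                     # strain coordinates on each component
explained = S**2 / np.sum(S**2)    # fraction of variance per component

print("PC1/PC2 coordinates of the first strain:", scores[0, :2])
print("variance explained by PC1 and PC2:", explained[:2])
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives one point per strain, so no gene labels are needed in the figure.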
Jean-Karim Heriche (21k; EMBL Heidelberg, Germany) wrote 3.2 years ago:

Perhaps a heatmap would do?

written 3.2 years ago by Jean-Karim Heriche (21k)

For 4000 observations? I doubt that would send a clear message, although a heatmap would combine nicely with a hierarchical clustering tree.

You want to produce a good figure, but you should first think about which message you want that figure to convey.

written 3.2 years ago by WouterDeCoster (41k)
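
The heatmap-plus-clustering idea can be sketched as follows, assuming SciPy is available. The choice of Jaccard distance and average linkage is an assumption made for this example, not something specified in the thread, and the matrix is again random toy data.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist

# Toy stand-in for the real data: 50 strains x 1800 accessory genes.
rng = np.random.default_rng(1)
presence = rng.integers(0, 2, size=(50, 1800)).astype(bool)


def cluster_order(matrix):
    """Row order from average-linkage clustering on Jaccard distances."""
    distances = pdist(matrix, metric="jaccard")
    tree = linkage(distances, method="average")
    return leaves_list(tree)


strain_order = cluster_order(presence)    # cluster the strains (rows)
gene_order = cluster_order(presence.T)    # cluster the genes (columns)

# Reordered matrix: similar strains and co-occurring genes end up adjacent.
ordered = presence[strain_order][:, gene_order]
print(ordered.shape)
```

The reordered matrix can then be drawn with matplotlib's `imshow(ordered, aspect="auto", interpolation="nearest")`, leaving the gene axis unlabelled. seaborn's `clustermap` performs the clustering and the drawing in a single call, if that library is preferred.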

Because I am looking at bacteria of the same sequence type (ST), the accessory genes are what make each isolate different from the others. Sorry, I made a mistake: I only have 1800 genes to plot (4000 is the core genome).

written 3.2 years ago by natasha (100)

That's still a lot. What is the message your figure should send? You could try, as suggested, a heatmap with an additional hierarchical clustering (tree) and a PCA, and see how that looks. There are many R packages that can do that for you.

written 3.2 years ago by WouterDeCoster (41k)
Philipp Bayer (6.5k; Australia/Perth/UWA) wrote 3.2 years ago:

How about a gene presence/absence network, or projecting the presence/absence onto a known phylogeny, like Figures 2 and 3 in the CoPAP paper? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692100/

written 3.2 years ago by Philipp Bayer (6.5k)
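
A very simple way to line a presence/absence table up with a known phylogeny is to order the rows by the leaf order of the tree. The sketch below uses only the Python standard library. It is not the CoPAP method, and the Newick string, strain names and table are hypothetical. The regular expression only handles plain, unquoted leaf names.

```python
import re

# Hypothetical inputs: a Newick tree of the strains and a presence/absence
# table keyed by strain name (1 = present, 0 = absent), one entry per gene.
newick = "((strainA:0.1,strainB:0.2):0.3,(strainC:0.1,strainD:0.4):0.2);"
presence = {
    "strainA": [1, 1, 0, 0, 1],
    "strainB": [1, 1, 0, 1, 1],
    "strainC": [0, 1, 1, 0, 1],
    "strainD": [0, 0, 1, 0, 1],
}


def leaf_order(tree):
    """Leaf names in the order they appear in a simple Newick string."""
    # A leaf name directly follows "(" or ","; internal labels follow ")".
    return [name.strip() for name in re.findall(r"[(,]\s*([^(),:;]+)", tree)]


def gene_order(table):
    """Gene indices sorted from most to least frequently present."""
    n_genes = len(next(iter(table.values())))
    counts = [sum(row[g] for row in table.values()) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: -counts[g])


strains = leaf_order(newick)
genes = gene_order(presence)

# Text rendering: one row per strain in tree order, "#" = present, "." = absent.
for strain in strains:
    row = "".join("#" if presence[strain][g] else "." for g in genes)
    print(f"{strain:10s} {row}")
```

With rows in tree order, the same matrix can be drawn next to the tree in any plotting tool, so that blocks of shared accessory genes line up with clades.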