How to find singletons in get_homologues
6.0 years ago

Hi, I'm new to bioinformatics (I'm an undergrad). I have 2 genomes of the same species, probably different strains. I've used get_homologues to define the core genome, but I'm having trouble finding the singletons. I have read both the manual and the tutorial several times, and I've tried using parse_pangenome_matrix to find which genes are present in A and absent in B, but it seems there aren't any: the output file of genes present in set A and absent in B lists 0 genes.

How is that possible if the core genome is smaller than both genomes?

sequence gene • 910 views

If they are the same species this is not so surprising.

5.9 years ago
Tm ★ 1.1k

If you have a core gene set and a complete gene set for each sample, you can subtract the core genes from the complete gene set with a simple shell command.

For instance,

grep -w -v -f core_gene_set.txt complete_gene_set_sampleA.txt >singleton_sampleA.txt
grep -w -v -f core_gene_set.txt complete_gene_set_sampleB.txt >singleton_sampleB.txt

Here,

  1. -w matches whole words only, so an identifier from the core gene set file does not match a longer identifier that merely contains it
  2. -v inverts the match, so only the lines of the complete gene set file that match none of the patterns are printed
  3. -f reads the patterns, one per line, from the core gene set file
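
As a rough sketch of how these flags behave, the toy run below uses made-up gene identifiers (geneA, geneA_2, geneA.1 and so on are hypothetical, not get_homologues output) and assumes both files hold one identifier per line in the same naming scheme. It also adds -F, which makes grep treat each pattern as a literal string rather than a regular expression, and shows -x (match the whole line) as a stricter alternative to -w for identifiers that contain dots or hyphens.

```shell
# Toy input files in a temporary directory (hypothetical gene IDs, one per line)
workdir=$(mktemp -d)
printf 'geneA\ngeneB\n' > "$workdir/core_gene_set.txt"
printf 'geneA\ngeneB\ngeneC\ngeneA_2\ngeneA.1\n' > "$workdir/complete_gene_set_sampleA.txt"

# Whole-word matching: "geneA" does not match "geneA_2" (underscore is a word character),
# but it does match inside "geneA.1" (the dot ends the word), so that line is dropped too
grep -F -w -v -f "$workdir/core_gene_set.txt" \
    "$workdir/complete_gene_set_sampleA.txt" > "$workdir/singleton_w.txt"
cat "$workdir/singleton_w.txt"
# geneC
# geneA_2

# Whole-line matching: only lines identical to a core identifier are removed
grep -F -x -v -f "$workdir/core_gene_set.txt" \
    "$workdir/complete_gene_set_sampleA.txt" > "$workdir/singleton_x.txt"
cat "$workdir/singleton_x.txt"
# geneC
# geneA_2
# geneA.1
```

If the identifiers in your files can contain punctuation, the -x variant is probably the safer choice; otherwise the -w form from the answer gives the same result.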