Question

Pseudogenes in a bacterial genome

0

5.7 years ago

agata88 ▴ 870

Hi all!

I have a problem defining pseudogenes in a bacterial genome. I defined a pseudogene as an additional copy of a gene in the genome.

Because my genome is bacterial, there are no introns, so I treat every repeated annotation of the same gene as an extra copy, i.e. a pseudogene. I have 2000 unique genes and 454 that are repeated at least once. Counting this way, I found around 1000 pseudogenes. Compared to related species this number is huge, which is why I am suspicious of my results.
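The counting described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `summarize_copies` and the toy gene names are hypothetical, and the input is assumed to be one name per annotated feature. It counts repeated names only and says nothing about whether a copy is functional.

```python
from collections import Counter

def summarize_copies(gene_names):
    """Count distinct names, names seen more than once, and extra copies."""
    counts = Counter(gene_names)
    repeated = {name: n for name, n in counts.items() if n > 1}
    # Every occurrence beyond the first is an "extra copy" under this definition
    extra_copies = sum(n - 1 for n in repeated.values())
    return {
        "distinct_names": len(counts),
        "repeated_names": len(repeated),
        "extra_copies": extra_copies,
    }

# Hypothetical toy input, not data from the genome in question
example = ["dnaA", "tnpA", "tnpA", "tnpA", "rpoB", "gyrA", "gyrA"]
summary = summarize_copies(example)
print(summary)
```

If the names come from product descriptions rather than gene symbols, generic labels such as "hypothetical protein" will be shared by unrelated genes and can inflate the repeat count considerably.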

So my questions are:

* Which of the defined pseudogenes represent genes and are functional? How can I find them?

* This may be a stupid question, but is it possible to have two identically annotated genes separated by a few nucleotides in a bacterial genome (one next to the other with a gap)? If yes, is it one gene, or a gene and its pseudogene? Example below:

gene A_1-AGTCTATGTA-gene A_2

Many thanks for any suggestion.

Best, Agata

pseudogenes • 3.0k views

ADD COMMENT • link updated 4.3 years ago by Fatima ▴ 1000 • written 5.7 years ago by agata88 ▴ 870

4

I would disagree with your definition of a pseudogene. I think a pseudogene is a gene copy that has been duplicated and also deactivated through mutation. It may be lost in some future generation, or conserved for structural reasons, but it should not generate proteins. Generating a protein would promote it to a full gene. So look for deleted start codons, damaged regulatory elements, and evolutionary conservation.

Otherwise, you're looking at duplications that may well be functional. Perhaps that bacterium benefits from doubled expression of that protein, so it has two copies in succession. Those are not pseudogenes at all.
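A first-pass check for the kind of damage mentioned above (lost start codon, premature stop, frameshift) could look like the sketch below. It is a minimal illustration, not a pseudogene caller: the function name `orf_problems` is hypothetical, the sequence is assumed to be given on the coding strand, and the codon sets assume the standard bacterial translation table 11 (in organisms where TGA is not a stop codon the stop set would need adjusting). Regulatory elements and conservation are not covered here.

```python
START_CODONS = {"ATG", "GTG", "TTG"}   # common bacterial starts (assumes table 11)
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_problems(cds):
    """Return a list of reasons why a coding sequence looks disrupted."""
    seq = cds.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] not in START_CODONS:
        problems.append("missing start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("internal stop codon")
    return problems

print(orf_problems("ATGAAATAA"))      # intact toy ORF
print(orf_problems("ATGTAAAAATAA"))   # toy ORF with a premature stop
```

An empty list only means the reading frame looks intact. A copy that passes can still be silent, so this is a filter for obvious candidates rather than proof of function.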

ADD REPLY • link 5.7 years ago by karl.stamm 4.1k

1

IMO any ORF that is never transcribed to mRNA can be described as a pseudogene. It doesn't have to exist in multiple copies.

ADD REPLY • link 5.7 years ago by 5heikki 11k

0

So, when I have one gene that occurs 10 times in the genome on different contigs, can they all be functional genes?

ADD REPLY • link 5.7 years ago by agata88 ▴ 870

1

Absolutely.

ADD REPLY • link 5.7 years ago by WouterDeCoster 47k

0

Yes, but if real, I would guess it is a transposase or something similar. Did you try to annotate the duplicated genes?

ADD REPLY • link 5.7 years ago by h.mon 35k

0

Yes, I annotated with Prokka.

ADD REPLY • link 5.7 years ago by agata88 ▴ 870

0

Is this a genome that you have assembled yourself? Could these be assembly artifacts?

ADD REPLY • link 5.7 years ago by Damian Kao 16k

0

I don't think these are assembly artifacts. For de novo assembly I used SPAdes, and for artifact removal I used blastn against a genus-specific nt database to select contigs of interest.

ADD REPLY • link 5.7 years ago by agata88 ▴ 870

0

Is the genome a closed single circle? If not, then you don't have a complete genome/assembly, and it is still subject to further refinement.

ADD REPLY • link 5.7 years ago by GenoMax 142k

score 1 · Answer 1 · 2018-08-27

edit: is the genome from this question the same one you referred to as contaminated in this post: A: Prokka bacteria genome annotation ? If yes, then you need to re-evaluate your contamination removal; everything suggests it didn't do a proper job.

Regarding your specific questions:

Which one of defined pseudogenes represent gene and have functionality? How can I find them?

As karl.stamm stated above, you have to look at the gene structure to investigate the question.

Is it possible to have two same annotated genes divided by nucleotides in bacteria genome (one next to another with break)? If yes, is it one gene or gene and its pseudogene?

Yes, it is possible. To define if both are functional or not, you have to resort to the answer to your first question.
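To find such neighbouring pairs before inspecting their structure, something like the sketch below might help. It is an assumption-laden illustration: the `Feature` tuple, the function names, and the 50 nt threshold are hypothetical, and the `_1`/`_2` suffix handling simply follows the naming in the question's example. Two same-named fragments on the same strand with a tiny gap are often worth checking as one gene split by a frameshift or premature stop, but the code only flags candidates.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    contig: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str
    name: str

def base_name(name):
    """Strip a trailing _<number> suffix, e.g. 'geneA_2' -> 'geneA'."""
    head, sep, tail = name.rpartition("_")
    return head if sep and tail.isdigit() else name

def adjacent_same_gene(features, max_gap=50):
    """Yield (left, right, gap) for neighbouring same-named features."""
    ordered = sorted(features, key=lambda f: (f.contig, f.start))
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1   # negative means the features overlap
        if (left.contig == right.contig
                and left.strand == right.strand
                and base_name(left.name) == base_name(right.name)
                and gap <= max_gap):
            yield left, right, gap

# Hypothetical coordinates mimicking the question's example (10 nt spacer)
features = [
    Feature("contig1", 100, 400, "+", "geneA_1"),
    Feature("contig1", 411, 900, "+", "geneA_2"),
    Feature("contig1", 5000, 5600, "-", "geneB"),
]
pairs = list(adjacent_same_gene(features))
for left, right, gap in pairs:
    print(left.name, right.name, gap)
```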

Now for your genome in particular: how did you detect the duplicated genes? How did you assemble and annotate the genomes? Do you have good sequencing coverage and a good assembly? Did you sequence a bacterial isolate?

Because, indeed, the number of duplicated genes you found is large, which makes me suspect an analysis artifact rather than truly duplicated genes.
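One quick way to look at coverage, assuming the contigs still carry SPAdes-style headers of the form `NODE_<n>_length_<len>_cov_<cov>`, is to compare each contig's coverage to the median. The sketch below is only a rough screen: the function names and the 0.5/1.5 ratio thresholds are arbitrary assumptions, and the median is not length-weighted. Contigs far below the typical coverage may be contamination, and contigs far above it may be plasmids or collapsed repeats, but neither follows automatically.

```python
import re
from statistics import median

# SPAdes contig header pattern, e.g. NODE_1_length_250000_cov_30.1
HEADER = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")

def parse_header(header):
    """Return (length, coverage) from a SPAdes-style header, or None."""
    match = HEADER.search(header)
    if match is None:
        return None
    return int(match.group(2)), float(match.group(3))

def flag_coverage_outliers(headers, low=0.5, high=1.5):
    """Label contigs whose coverage is far from the median coverage."""
    parsed = {h: parse_header(h) for h in headers}
    coverages = [v[1] for v in parsed.values() if v is not None]
    if not coverages:
        return {}
    typical = median(coverages)
    flags = {}
    for header, value in parsed.items():
        if value is None:
            continue
        ratio = value[1] / typical
        if ratio < low:
            flags[header] = "low"
        elif ratio > high:
            flags[header] = "high"
    return flags

# Hypothetical headers, not taken from the assembly discussed here
headers = [
    "NODE_1_length_250000_cov_30.1",
    "NODE_2_length_180000_cov_29.5",
    "NODE_3_length_90000_cov_31.0",
    "NODE_4_length_4000_cov_3.2",
    "NODE_5_length_1500_cov_95.0",
]
flags = flag_coverage_outliers(headers)
print(flags)
```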

score 0 · Answer 2 · 2020-02-01

4.3 years ago

Fatima ▴ 1000

You could try your method on https://www.ncbi.nlm.nih.gov/nuccore/AL450380.1, then download its GFF3 file, count the /pseudo or pseudogene annotations, and compare.
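Counting the annotated pseudogenes in a GFF3 file could be done along these lines. This is a sketch under assumptions: the function names are hypothetical, and it treats a gene-level feature as a pseudogene if its type is `pseudogene` or its attributes contain `pseudo=true` or `gene_biotype=pseudogene`. It is worth checking which of these markers the downloaded file actually uses.

```python
def parse_attributes(field):
    """Parse the GFF3 attributes column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs

def count_pseudogenes(gff_lines):
    """Return (gene_level_features, pseudogenes) from GFF3 lines."""
    genes = pseudo = 0
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Only gene-level rows are counted, so CDS/exon children are not double counted
        if len(cols) != 9 or cols[2] not in ("gene", "pseudogene"):
            continue
        genes += 1
        attrs = parse_attributes(cols[8])
        if (cols[2] == "pseudogene"
                or attrs.get("pseudo") == "true"
                or attrs.get("gene_biotype") == "pseudogene"):
            pseudo += 1
    return genes, pseudo

# Hypothetical toy records, not lines from the AL450380.1 annotation
example = [
    "##gff-version 3",
    "chr\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene0;gene_biotype=protein_coding",
    "chr\tsrc\tgene\t1000\t1500\t.\t-\t.\tID=gene1;pseudo=true;gene_biotype=pseudogene",
    "chr\tsrc\tpseudogene\t2000\t2600\t.\t+\t.\tID=gene2",
    "chr\tsrc\tCDS\t1\t900\t.\t+\t0\tID=cds0;Parent=gene0",
]
result = count_pseudogenes(example)
print(result)
```

For a real file, the same function can be given an open file handle, since it only iterates over lines.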

ADD COMMENT • link 4.3 years ago by Fatima ▴ 1000