Question: One gene shows alignment with core sequences as well as with accessory gene sequences at 50% identity?
7 months ago by
United States
sharmatina18905940 wrote:

Hi, I have done a pan-genome analysis with a 50% identity cutoff using the BPGA tool, and it has given me a core reference sequence file, an accessory reference sequence file, and a unique sequences file. Now I have a list of sequences, and I have aligned them against all three files, i.e. core, accessory, and unique. Some genes show alignments with both core genome and accessory genome sequences. My parameters are identity >= 50%, qcovhsp >= 90%, and E-value 0.0001. How can I separate genes into core and accessory if they show alignments with both files?

pan-genome with bpga bpga • 230 views
ADD COMMENTlink modified 7 months ago by Mensur Dlakic7.2k • written 7 months ago by sharmatina18905940
Mensur Dlakic7.2k wrote:

Not sure why you need to segregate anything when BPGA has already done it for you. The groupings were done based on distribution in multiple genomes, which is a broader relationship criterion than simple sequence similarity.

For genes that have paralogs, one of them may be in the core group while the others are in the accessory group. Those paralogs could still retain relatively high percent identity and coverage, and most definitely an E-value lower than 1e-4. If they are not present in all genomes, they will go into the accessory group.

ADD COMMENTlink written 7 months ago by Mensur Dlakic7.2k

I understand that BPGA segregates genes into core, accessory, and unique. Now I have a list of genes, and I want to determine whether each one belongs to core or accessory. For that I ran BLAST, and I am getting the same genes matching in core as well as accessory at 50% identity with >= 90% query coverage. Can a single gene belong to both core and accessory? Can you please tell me why? I am getting very confused.

ADD REPLYlink written 7 months ago by sharmatina18905940

Not sure why you are repeating your question, as I already answered it. A single gene can't be both in core and accessory groups, and it doesn't seem that you are observing anything that contradicts that.

Let's say you have a gene A, with two paralogs: A1 and A2. Gene A is present in all the species in your pangenome, so it goes into core group. Genes A1 and A2 are present only in some species but not all, so they go into accessory group. When you BLAST A1 or A2, they are similar enough to match A in core group, which seems to be what you are seeing. Similarly, BLASTing A would identify A1 and A2 in accessory group, because many paralogs are similar enough to match your identity and coverage criteria.
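One way to make the A versus A1/A2 situation concrete is to keep only the single best hit per query across both searches. The sketch below is a heuristic and not something BPGA does itself. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), where the last column is the bit score. The function name `best_hits` and the example rows are made up for illustration.

```python
import csv

def best_hits(lines_by_group):
    """Return {query: (group, subject, bitscore)} for the top-scoring hit.

    lines_by_group maps a label such as "core" or "accessory" to an
    iterable of tab-separated BLAST lines (assumed: standard -outfmt 6).
    """
    best = {}
    for group, lines in lines_by_group.items():
        for fields in csv.reader(lines, delimiter="\t"):
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comment lines
            query, subject = fields[0], fields[1]
            bitscore = float(fields[11])  # column 12 in standard outfmt 6
            if query not in best or bitscore > best[query][2]:
                best[query] = (group, subject, bitscore)
    return best

# Illustrative rows only: geneX matches A in core and its paralog A1 in accessory.
core_lines = ["geneX\tA\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t550"]
accessory_lines = ["geneX\tA1\t55.0\t290\t130\t2\t5\t295\t3\t292\t1e-40\t160"]

result = best_hits({"core": core_lines, "accessory": accessory_lines})
print(result["geneX"])
```

Here the weaker accessory match is the paralog-level hit described above, so the query is reported with the core sequence it resembles most.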

ADD REPLYlink written 7 months ago by Mensur Dlakic7.2k

If you lower the E-value cutoff in your search to 1e-30 or so, you will likely see few, if any, of these cross-matches.
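The effect of a stricter cutoff can also be checked on hits you already have, without rerunning the search. This is a minimal sketch with invented E-values; the function name `groups_hit` and the tuples are hypothetical, and real values would come from the E-value column of your BLAST output.

```python
def groups_hit(hits, max_evalue):
    """Map each query to the set of groups it matches at or below max_evalue.

    hits is an iterable of (query, group, evalue) tuples.
    """
    matched = {}
    for query, group, evalue in hits:
        if evalue <= max_evalue:
            matched.setdefault(query, set()).add(group)
    return matched

# Invented example values: each query has one strong hit and one weak paralog hit.
hits = [
    ("geneX", "core", 1e-150),
    ("geneX", "accessory", 1e-12),
    ("geneY", "accessory", 1e-80),
    ("geneY", "core", 1e-6),
]

loose = groups_hit(hits, 1e-4)    # the cutoff used in the question
strict = groups_hit(hits, 1e-30)  # the stricter cutoff suggested above

# Queries that still match more than one group at each cutoff
print(sorted(q for q, g in loose.items() if len(g) > 1))
print(sorted(q for q, g in strict.items() if len(g) > 1))
```

With BLAST+ the same cutoff can be applied at search time through the `-evalue` option.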

ADD REPLYlink written 7 months ago by Mensur Dlakic7.2k