Hi, I have done a pan-genome analysis with the BPGA tool using a 0.5 (50%) identity cutoff. It gave me a core reference sequence file, an accessory reference file, and a unique sequences file. I now have a list of sequences that I aligned against all three files (core, accessory, and unique). Some genes show alignments to both core and accessory genome sequences. My parameters are identity >= 50%, qcovhsp >= 90%, and an E-value cutoff of 0.0001. How can I assign these genes to core or accessory when they align to both files?
Not sure why you need to segregate anything when BPGA has already done it for you. The groupings are based on how each gene is distributed across the genomes, which is a broader relationship criterion than simple sequence similarity.
For genes that have paralogs, one copy may be in the core group while the others are in the accessory group. Those paralogs can still share relatively high percent identity and coverage, and almost certainly an E-value below 1e-4, which is why a query can hit both files. A paralog that is not present in all genomes goes into the accessory group.
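If you still want a single label per query, one common heuristic (not something BPGA does for you) is to keep only the top-scoring hit across the three result files. Below is a minimal sketch of that idea. It assumes tabular BLAST output with the columns `qseqid sseqid pident qcovhsp evalue bitscore` in that order; the gene names, subject IDs, and scores are made up for illustration, and `best_category` is a hypothetical helper, not part of any tool.

```python
import csv
import io

# Assumed column order, e.g. from: -outfmt "6 qseqid sseqid pident qcovhsp evalue bitscore"
COLUMNS = ["qseqid", "sseqid", "pident", "qcovhsp", "evalue", "bitscore"]


def best_category(hit_tables, min_pident=50.0, min_qcov=90.0, max_evalue=1e-4):
    """Return {query: (category, subject, bitscore)} for the top-scoring passing hit.

    hit_tables maps a category name ("core", "accessory", "unique") to an
    open file-like object holding tab-separated BLAST hits.
    """
    best = {}
    for category, handle in hit_tables.items():
        reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
        for row in reader:
            # Apply the same thresholds used in the question.
            if float(row["pident"]) < min_pident:
                continue
            if float(row["qcovhsp"]) < min_qcov:
                continue
            if float(row["evalue"]) > max_evalue:
                continue
            score = float(row["bitscore"])
            query = row["qseqid"]
            # Keep the hit with the highest bitscore seen so far for this query.
            if query not in best or score > best[query][2]:
                best[query] = (category, row["sseqid"], score)
    return best


# Made-up example data standing in for the three BLAST result files.
core_hits = io.StringIO(
    "geneA\tcore_001\t98.5\t100\t1e-120\t410\n"
    "geneB\tcore_002\t62.0\t95\t1e-40\t150\n"
)
accessory_hits = io.StringIO(
    "geneA\tacc_017\t71.2\t96\t1e-60\t220\n"
    "geneB\tacc_004\t99.0\t100\t1e-130\t430\n"
    "geneC\tacc_009\t45.0\t99\t1e-20\t90\n"
)
unique_hits = io.StringIO("")

assignments = best_category(
    {"core": core_hits, "accessory": accessory_hits, "unique": unique_hits}
)
for query, (category, subject, score) in sorted(assignments.items()):
    print(query, category, subject, score)
```

With real data you would pass open file handles instead of the `io.StringIO` objects. Keep in mind that a best hit only tells you which reference sequence your query resembles most; whether that gene family is core or accessory is still determined by its distribution across genomes, as described above.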