Closed: Is my approach for pangenome analysis of Plasmodium right? How can I predict the common core genome and genes in my scenario?
4.0 years ago

I am trying out pangenome analysis of two types of Plasmodium. My aim is to predict the genes of the two types and then find the common core genome on the basis of functional annotation. I am trying to follow the EUPAN toolkit SOP (http://cgm.sjtu.edu.cn/eupan/sop.html) but am stuck on the following questions:

QUESTION 1: I have two approaches in mind. Which one is appropriate and should be carried out?

Approach 1) Merge the reference genomes and the annotation files of both types of Plasmodium > Align my dataset sequences (whole-genome sequences of the two types) to the combined reference file > Identify genes in each of my dataset strains using the merged annotation file of both reference genomes > Classify genes into families > Calculate the core and dispensable genome. This method sounds good, BUT is it possible to identify genes of both types simply by aligning them to the merged genomes and providing the merged annotation file? After identifying the genes, how would I calculate the core genome with this methodology?

Approach 2) Align the Plasmodium type A dataset to its reference genome and the Plasmodium type B dataset to its reference genome separately > Identify genes in both datasets using the respective annotation files > Cluster the genes into families > Calculate the core and dispensable genome. This method might serve the purpose; however, how would I merge the two results to see the common core genome and the genes common to both types?
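To make the last step of both approaches concrete, here is a minimal Python sketch of how the core and dispensable genome could be computed once every strain has a list of gene families. It is only an illustration under assumptions: the strain names and family IDs are made-up toy data, and it assumes the family IDs are comparable across both types (i.e. genes from type A and type B were clustered together), which is exactly the part I am unsure how to achieve.

```python
from collections import Counter


def classify_families(families_by_strain):
    """Split gene families into core (present in every strain) and dispensable."""
    n_strains = len(families_by_strain)
    counts = Counter()
    for families in families_by_strain.values():
        # Count each family once per strain, regardless of copy number.
        counts.update(set(families))
    core = {fam for fam, n in counts.items() if n == n_strains}
    dispensable = set(counts) - core
    return core, dispensable


# Hypothetical toy data: strain -> gene families found in that strain.
type_a = {"A1": {"fam1", "fam2", "fam3"}, "A2": {"fam1", "fam2", "fam4"}}
type_b = {"B1": {"fam1", "fam3", "fam5"}, "B2": {"fam1", "fam3"}}

# Approach 2 style: one core per type, then intersect the two cores.
core_a, disp_a = classify_families(type_a)
core_b, disp_b = classify_families(type_b)
common_core = core_a & core_b

# Approach 1 style: treat all strains of both types as one pangenome.
core_all, disp_all = classify_families({**type_a, **type_b})

print("core A:", sorted(core_a))
print("core B:", sorted(core_b))
print("common core:", sorted(common_core))
print("core over all strains:", sorted(core_all))
```

With shared family IDs, intersecting the per-type cores and computing one core over all strains give the same set, so the two approaches would differ mainly in how the genes are identified and clustered, not in this final calculation.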

QUESTION 2: I can find annotated genes for my strains in the PlasmoDB database. Is there a way to take the genes from there and skip the gene identification steps? How would I then combine them and check which genes are present in all genomes and which are exclusive to some? I assume OrthoMCL will only report orthologs. How can I proceed so that I get the common core genome (either via orthologs or in any other way)? I can work on Linux if the steps are clearly described.
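For illustration, this is roughly what I imagine doing with ortholog groups, sketched in Python. It assumes a groups file in the OrthoMCL-style layout of one group per line, `group_id: taxon|gene taxon|gene ...`; the taxon codes, gene IDs, and group IDs below are hypothetical placeholders, not real PlasmoDB identifiers.

```python
def parse_groups(lines):
    """Map each group ID to the set of taxon codes that have a member in it."""
    taxa_by_group = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Each member looks like "taxon|gene"; keep only the taxon code.
        taxa_by_group[group_id.strip()] = {
            member.split("|", 1)[0] for member in members.split()
        }
    return taxa_by_group


def split_core_and_exclusive(taxa_by_group, all_taxa):
    """Return groups found in every taxon, and groups found in exactly one taxon."""
    all_taxa = set(all_taxa)
    core = sorted(g for g, taxa in taxa_by_group.items() if taxa == all_taxa)
    exclusive = {
        g: next(iter(taxa)) for g, taxa in taxa_by_group.items() if len(taxa) == 1
    }
    return core, exclusive


# Hypothetical example content of a groups file.
example_lines = [
    "OG1: taxA|g1 taxB|g7",
    "OG2: taxA|g2 taxA|g3",
    "OG3: taxB|g8 taxB|g9",
]

groups = parse_groups(example_lines)
core_groups, exclusive_groups = split_core_and_exclusive(groups, ["taxA", "taxB"])
print("core groups:", core_groups)
print("exclusive groups:", exclusive_groups)
```

One caveat with this idea: genes that are not placed in any group may be missing from such a file, so counting exclusive genes this way could underestimate them unless ungrouped genes are added back separately.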

RNA-Seq gene prediction pangenome linux orthoMCL • 145 views
This thread is not open. No new answers may be added