7.4 years ago
Mehmet
Dear all,
I want to find overlapping sequences (overlapping genes) in two assemblies. For instance:
transcriptome A:
gene1 AGTAGTACTG
transcriptome B:
gene1 AGTAGCTGAT
First, I want to compare all genes in the two assemblies and find the overlapping ones. Then I want to draw a Venn diagram showing the overlap (number of genes in transcriptome A only, number of genes in transcriptome B only, and number of genes shared between the two assemblies). My goal is to find out how similar the two transcriptomes are.
Take a look at some of these tools: https://omictools.com/assembly-reconciliation-category
A homebrew way to do that would be to align the multi-FASTA file containing the transcript sequences of transcriptome A against the one for transcriptome B, perhaps using BLAT, which outputs a PSL file (an old format, but easy to process). BLAT's -minIdentity option sets the minimum sequence identity for reported alignments, which helps keep the output clean, and there are other parameters worth tuning as well. Beware, it can be very slow.
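Once the PSL file exists, the three Venn numbers can be derived with a few lines of Python. This is only a rough sketch under some assumptions: the function names `hits_from_psl` and `venn_counts` and the `min_query_cov` threshold are made up for illustration, "shared" is taken to mean "has at least one alignment whose matching bases cover at least that fraction of the query", and the full ID lists for A and B are assumed to have been read from the FASTA headers beforehand. Because one transcript in A can hit several in B, the shared count is reported from each side separately.

```python
def hits_from_psl(lines, min_query_cov=0.5):
    """Collect query and target IDs that appear in at least one kept alignment.

    PSL rows have 21 tab-separated columns: matches is column 1,
    qName column 10, qSize column 11 and tName column 14.
    min_query_cov is an illustrative cutoff, not a BLAT option.
    """
    queries, targets = set(), set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header and any malformed or blank lines.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        q_name, q_size = fields[9], int(fields[10])
        t_name = fields[13]
        if q_size > 0 and matches / q_size >= min_query_cov:
            queries.add(q_name)
            targets.add(t_name)
    return queries, targets


def venn_counts(ids_a, ids_b, hit_a, hit_b):
    """Count transcripts unique to each assembly and shared between them."""
    ids_a, ids_b = set(ids_a), set(ids_b)
    return {
        "A_only": len(ids_a - hit_a),
        "B_only": len(ids_b - hit_b),
        "shared_A": len(ids_a & hit_a),  # transcripts of A with a hit in B
        "shared_B": len(ids_b & hit_b),  # transcripts of B hit by something in A
    }
```

With A as the BLAT query and B as the database, it would be used roughly as `hit_a, hit_b = hits_from_psl(open("A_vs_B.psl"))` (the file name is a placeholder), followed by `venn_counts(all_a_ids, all_b_ids, hit_a, hit_b)`. The resulting counts can then be passed to whatever Venn plotting tool you prefer.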