Dear friends, hi (I'm not a native English speaker, so please excuse any language flaws).
I have done a DEG analysis on a de novo assembly created with Trinity.
Now I want to pull out my DEG sequences for annotation, and I have created two collection.fasta files containing the ID names of those DEGs (one for the male-sample DEGs and one for the female-sample DEGs).
However, the IDs in the DEG results look like TRINITY_DN212724_c0_g1, whereas the headers in Trinity.fasta look like TRINITY_DN212724_c0_g1_i#, and each gene usually has several isoforms.
I have used a program called "faSomeRecords", but it does not seem to collect all the isoforms of a gene ID automatically.
I have added "_i*" to the end of all gene IDs in the DEG files, but no luck!
I need a tool to collect all isoforms of some selected genes (listed in the collection.fasta file) from the Trinity.fasta file, please.
Thank you in advance
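For anyone landing here with the same problem, here is a minimal Python sketch of one way to do it, assuming the isoform headers are simply the gene ID followed by `_i` and a number, as described above. The function names (`gene_id`, `read_gene_ids`, `extract_isoforms`) and the output file name are made up for illustration; this is not the script shared in the thread.

```python
import re
import sys

# Assumption: a Trinity transcript ID is the gene ID plus "_i<number>".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def gene_id(header):
    """Return the gene ID for a FASTA header or ID line (leading '>' optional)."""
    transcript_id = header.lstrip(">").split()[0]
    return ISOFORM_SUFFIX.sub("", transcript_id)


def read_gene_ids(path):
    """Read one gene ID per line; blank lines are skipped."""
    with open(path) as handle:
        return {gene_id(line) for line in handle if line.strip()}


def extract_isoforms(fasta_path, wanted, out_path):
    """Copy every record whose gene ID is in `wanted`; return the record count."""
    kept = 0
    keep = False
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in fasta:
            if line.startswith(">"):
                # Decide once per record, then apply to its sequence lines.
                keep = gene_id(line) in wanted
                kept += keep
            if keep:
                out.write(line)
    return kept


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python extract_isoforms.py Trinity.fasta ids.txt out.fasta
    n = extract_isoforms(sys.argv[1], read_gene_ids(sys.argv[2]), sys.argv[3])
    print(f"wrote {n} isoform records")
```

Because the `_i<number>` suffix is stripped and the whole gene ID is compared, `TRINITY_DN212724_c0_g1` does not accidentally pull in `TRINITY_DN212724_c0_g10`, which a plain prefix match would.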
Dear st.ph.n,
Thank you for sharing your valuable script!
I will check it.
Hope it works well for you. You should also change your question title to 'Question: how to collect all isoforms of a gene from Trinity.fasta file?'
Yes, I have corrected it.
It seems to work, thank you.
Dear st.ph.n, hi,
Do you have a script that can check the Trinity.fasta file and list the IDs of those genes that have only one isoform?
I would appreciate it if you could share your script with me.
Thanks!
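In case it helps, counting isoforms per gene can be sketched in a few lines of Python. This assumes the same `_i<number>` header convention mentioned earlier in the thread; `single_isoform_genes` is a hypothetical name, not an existing tool.

```python
import re
from collections import Counter

# Assumption: a Trinity transcript ID is the gene ID plus "_i<number>".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def single_isoform_genes(fasta_path):
    """Return a sorted list of gene IDs that have exactly one isoform."""
    counts = Counter()
    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                # First whitespace-delimited token of the header is the ID.
                transcript_id = line[1:].split()[0]
                counts[ISOFORM_SUFFIX.sub("", transcript_id)] += 1
    return sorted(gene for gene, n in counts.items() if n == 1)
```

Calling `single_isoform_genes("Trinity.fasta")` and writing the result one ID per line gives a list that could be used as input for an isoform-extraction step.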
Dear st.ph.n, could you provide the original script you used, the one that takes the longest isoform per gene and writes it to a new FASTA file? It would be greatly appreciated.
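Until the original script turns up, a rough Python sketch of the idea might look like the following. It is not st.ph.n's script; `read_fasta` and `longest_isoforms` are illustrative names, it assumes gene IDs are the transcript IDs minus the `_i<number>` suffix, and on a length tie it keeps the isoform that appears first in the file.

```python
import re

# Assumption: a Trinity transcript ID is the gene ID plus "_i<number>".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def read_fasta(path):
    """Yield (header_without_'>', sequence) for each record in a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def longest_isoforms(fasta_path, out_path):
    """Write the longest isoform of each gene; return the number of genes."""
    best = {}  # gene ID -> (header, sequence) of the longest isoform so far
    for header, seq in read_fasta(fasta_path):
        gene = ISOFORM_SUFFIX.sub("", header.split()[0])
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header, seq)
    with open(out_path, "w") as out:
        for header, seq in best.values():
            out.write(f">{header}\n{seq}\n")
    return len(best)
```

This keeps one sequence per gene in memory and writes each sequence on a single line, which is usually fine for a transcriptome but may need line wrapping for tools that expect it.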
Sorry to bring an older post back up, but I've been trying to use this script and it doesn't seem to be outputting anything for me. Any help would be appreciated!