One approach I found is to use the genes nearest each transposon to represent that transposon's function. I wonder if there are other methods?
tl;dr Without functional data, you can't do this.
While transposons can influence the expression and function of genes, this is not the right approach for inferring TE function. If you did this, you would find that some transposons avoid genes (e.g., LTR retrotransposons), while others target genes (e.g., DNA transposons). The species also matters because TE diversity, abundance, and activity vary widely among lineages. Recent studies do show that shared patterns of expression under specific stresses, combined with shared motifs of TFs and TEs, suggest correlative function, but a naive computational approach like this won't be reliable.
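If you want to see the gene-avoiding versus gene-targeting pattern in your own annotation before deciding anything, a rough sketch like the one below tabulates the distance from each TE to its nearest gene, grouped by TE family. The function names and the tuple layout are my own assumptions for illustration (not from any particular tool), coordinates are assumed to be 0-based half-open, and this only describes where TEs sit; it says nothing about their function.

```python
from collections import defaultdict
from statistics import median


def interval_distance(a_start, a_end, b_start, b_end):
    """Gap in bp between two half-open intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)


def nearest_gene_distance(te, genes):
    """Distance from one TE to the closest gene on the same chromosome.

    te    : (chrom, start, end)
    genes : list of (chrom, start, end)
    Returns None when the chromosome has no annotated gene.
    """
    chrom, start, end = te
    dists = [
        interval_distance(start, end, g_start, g_end)
        for g_chrom, g_start, g_end in genes
        if g_chrom == chrom
    ]
    return min(dists) if dists else None


def median_distance_by_family(tes, genes):
    """Median nearest-gene distance per TE family.

    tes : list of (family, chrom, start, end)
    This is a brute-force scan (every TE against every gene), which is fine
    for a quick look but slow on a full genome annotation.
    """
    by_family = defaultdict(list)
    for family, chrom, start, end in tes:
        d = nearest_gene_distance((chrom, start, end), genes)
        if d is not None:
            by_family[family].append(d)
    return {family: median(ds) for family, ds in by_family.items()}
```

You would fill `tes` and `genes` from your own TE and gene annotation files. For anything genome-scale, an interval tool would be a better choice than this brute-force loop.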
Or what should I pay attention to when using the nearest genes to represent transposons in a functional analysis?
If you are just looking at a genomic region without functional data, I wouldn't take this approach. You can, however, start by looking at the length and number of ORFs in your TEs, the number of Pfam domain matches, and the number of transcripts you can align to your TEs. Then, you can classify the transposons using their Pfam domain matches and get GO terms to ascertain whether you may have functional TEs (combined with the transcriptome data). Together, this would only tell you that the TEs are transcribed and would say nothing about whether they are actively transposing. Demonstrating that, along with a molecular/cellular function, would land you in a very high profile journal. Only a few (~3) studies have done that in plants since TEs were discovered ~80 years ago. If you can refine your question I can be more specific about the approach, but there is a huge amount of TE literature that should help you get closer to what you want to find.
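As a starting point for the ORF step mentioned above, here is a minimal sketch that counts ORFs and reports the longest one per TE sequence. It is a simplification under stated assumptions: the helper names are hypothetical, an ORF is taken to be ATG through the first in-frame stop codon, only the standard stop codons are used, sequences are plain A/C/G/T, and the 300 nt minimum length is an arbitrary default. A dedicated ORF finder would be the better choice for real work, and the Pfam and transcript-alignment steps are not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_len=300):
    """Find ATG..stop ORFs of at least min_len nucleotides in all six frames.

    Returns a list of (strand, frame, start, end). Coordinates are 0-based,
    end-exclusive, and relative to the strand that was scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs


def summarize_orfs(te_seqs, min_len=300):
    """Map each TE id to (number of ORFs, longest ORF length in nt).

    te_seqs : dict of TE id -> nucleotide sequence
    """
    summary = {}
    for te_id, seq in te_seqs.items():
        orfs = find_orfs(seq, min_len)
        longest = max((end - start for _, _, start, end in orfs), default=0)
        summary[te_id] = (len(orfs), longest)
    return summary
```

You would load `te_seqs` from your TE FASTA file and then compare the ORF counts and lengths across TE families before moving on to the domain searches.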
modified 4 months ago
RamRS ♦ 26k
4.5 years ago by
SES ♦ 8.3k