I will be starting my third rotation next semester, moving from experimental biology to more computational biology. The investigator works on evolutionary genetics in the context of adaptation to environmental stress. I was assigned some reading for her lab, and one of the papers already has a bunch of jargon I don't know (thanks to my limited computational background).
They mention that there are "35,468 transcripts from 29,143 unique loci", which to me sounds like some loci produce more than one transcript (alternative splice products, etc.). The other thing is the mention of an N50: "When limiting the analyses to the longest transcript from each locus, the transcriptome size was 71,518,404 bp, with an N50 of 3,694 bp and a genome size of approximately 860 Mbp based on C-value estimates. The longest transcript was 66,752 bp in length, stemming from the gene coding for the largest known protein. This indicated that our analysis effectively captured even long transcripts present in the transcriptome."
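If I am reading the usual definition correctly, N50 is found by sorting the sequence lengths from longest to shortest and adding them up until the running total reaches half of the total length; the length of the sequence that crosses the halfway point is the N50. Below is a minimal Python sketch of that definition, including the "longest transcript from each locus" step. The locus names and lengths are made up for illustration and are not data from the paper, and the helper names are my own.

```python
def longest_per_locus(transcripts):
    """Keep only the longest transcript length for each locus.

    `transcripts` is a list of (locus_id, length_bp) pairs (hypothetical format).
    """
    longest = {}
    for locus, length in transcripts:
        if length > longest.get(locus, 0):
            longest[locus] = length
    return list(longest.values())


def n50(lengths):
    """Return the N50 of a list of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy data: locus "A" has two transcripts (e.g. splice variants).
toy_transcripts = [("A", 10), ("A", 4), ("B", 9), ("C", 8), ("D", 7),
                   ("E", 6), ("F", 5), ("G", 3), ("H", 2), ("I", 4)]
toy_lengths = longest_per_locus(toy_transcripts)
print(sum(toy_lengths))  # total length of the reduced set
print(n50(toy_lengths))  # length at which half the total is reached
```

So, under this reading, an N50 of 3,694 bp would mean that transcripts of at least 3,694 bp account for half of the 71,518,404 bp total.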
There is also mention of WGCNA (weighted gene correlation network analysis): "Weighted gene correlation network analysis (WGCNA) of the top 10,000 expressed genes revealed 15 modules of coexpressed genes (fig. 3A). Ten of the 15 modules were significantly correlated with habitat type (presence or absence of H2S), with modules 5 and 10 exhibiting correlation coefficients >0.9".
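My rough understanding of the idea, which may well be incomplete: genes are linked by how strongly their expression profiles correlate across samples, the correlation is raised to a power so that weak links shrink toward zero (the "weighted" part), tightly linked genes are grouped into modules, and each module's summary profile is then correlated with a trait such as habitat. The Python sketch below only illustrates those two correlation steps. It is not the WGCNA R package or the paper's pipeline; the expression values, the 0/1 habitat coding, and the power `beta = 6` are illustrative assumptions, and I use a plain per-sample mean as a simplified stand-in for the module summary (WGCNA itself uses a principal-component "eigengene").

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(r, beta=6):
    """Soft-thresholded connection strength: |r| raised to the power beta."""
    return abs(r) ** beta


# Hypothetical expression values for three genes across six samples.
gene_a = [1.0, 1.2, 0.9, 3.1, 2.9, 3.3]
gene_b = [0.8, 1.1, 1.0, 2.7, 3.0, 3.2]
gene_c = [2.0, 1.1, 2.6, 1.4, 2.2, 1.9]
habitat = [0, 0, 0, 1, 1, 1]  # assumed coding: 0 = no H2S, 1 = H2S present

# Genes a and b rise together, so their connection stays strong;
# gene c is weakly correlated with a, so its connection shrinks toward zero.
print(adjacency(pearson(gene_a, gene_b)))
print(adjacency(pearson(gene_a, gene_c)))

# Treat a and b as one toy "module" and summarize it per sample
# (simplification: mean expression instead of an eigengene).
module_profile = [(a + b) / 2 for a, b in zip(gene_a, gene_b)]
print(pearson(module_profile, habitat))  # module-trait correlation
```

If that is right, "modules 5 and 10 exhibiting correlation coefficients >0.9" would be the last number here, computed for each real module against habitat type.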
Any help would be great. I'm hoping that I will still want to pursue computational biology after this rotation.