Dan Graur is the author of the book Molecular and Genome Evolution (2016). Dan Graur has a very low threshold for hooey, hype, hypocrisy, postmodernism, bad statistics, ignorance of population genetics and evolutionary biology, and hatred of any kind. This blog is a diary of peeves, dislikes, antipathies, annoyances, and random feelings of contempt. Rarely do I have good things to say.
Following a silly letter from Scotland, I found it necessary to state very, very clearly that all the opinions expressed in this blog are my own and do not represent the views of either my academic employer or the current Secretary of the Flat Earth Society.
Pentecostalism is a renewal movement within Christianity, and like many such movements, it adheres to the inerrancy of scripture and the necessity of accepting Christ as personal Lord and Savior, while placing special emphasis on direct personal experiences. A main tenet of Pentecostalism is the belief in empowerment, which leads to spiritual gifts, such as speaking in tongues (glossolalia), divine healing, and the ability to handle snakes without being harmed.
The interesting thing about speaking in tongues is that no Pentecostal has ever been caught speaking a real minority language like Albanian, Romansh, or Irish Gaelic. Nor have they been caught speaking a dead language. The “tongues” Pentecostals speak are always a sort of guttural gibberish that only a high Pentecostal priest can understand and interpret.
Divine healing involves the irrational belief that by performing a series of rituals in a certain order, diseases will disappear, cancer will regress, and demons will be exorcised. The snake bit is of particular interest since it involves the belief that the practitioner is immune to even the worst of the worst, let alone criticism by outsiders and Twitter trolls. This immunity to everything is of particular importance to the story I am about to tell.
My story concerns people who are convinced that genome-wide association studies (GWAS), which I call GWASturbation, and machine learning will bring joy and salvation to the human race. These people seem to have quite a lot in common with Pentecostals. First, they adhere to the irrational belief that by performing a series of rituals in a certain order, diseases will disappear and cancers will regress. The belief is irrational since nothing has ever come out of these endeavors except huge headlines in the news media and a bit of scandal. Second, these people have a tendency to speak in tongues. Take, for example, the following paragraph from a PLoS Genetics article entitled “Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs” by Christopher D. Brown, Lara M. Mangravite, and Barbara E. Engelhardt:
“We identified, for each gene expression trait, the most highly associated SNP within each local linkage disequilibrium (LD) block. We tested the independence of each SNP by multivariate regression and took the union of the independently associated SNPs for each gene. We refer to, for example, the first and second most significant, independently associated SNPs as primary and secondary SNPs, respectively, and we refer to the set of primary SNPs as first tier, or tier 1, extending in the straightforward way through tier 4. We do not consider tiers beyond the fourth tier because of lack of statistical power. For each study, and within each tier, we independently estimated false discovery rates (FDRs) by permutation. Although we computed a BF for every SNP-gene pair, we limit our subsequent analysis to cis-linked SNPs, or SNPs within 1 Mb of the transcription start site (TSS) or transcription end site (TES) of a gene. While we have standardized analysis and reporting across studies, we have not considered the scope of differences in eQTL discovery based on alternative data analysis pipelines.”
I challenge any person who is not intimately involved with Pentecostalism or does not GWASturbate regularly to translate the above paragraph into English.
Or take this paragraph:
“We chose to control for the confounding effects of both known covariates and unknown factors by removing the effects of principal components (PCs; Figure S1, Table S1) [36], [37]. Given that a diverse set of demographic (e.g., age, sex), environmental (e.g., BMI, drug use), and technical (e.g., post-mortem interval, array batch, ozone levels, identity of the technician who handled the arrays) variables are known to be associated with gene expression measurements and to confound eQTL ascertainment [26], [36], [37] we felt it was critical to control for these effects in the most consistent way possible prior to eQTL mapping. Across the diverse set of studies examined here, the covariate annotation ranges from non-existent to detailed. To address this non-uniformity, we analyzed each data set with the same approach, irrespective of covariate annotation. Multiple independent studies demonstrate the effectiveness of controlling for latent variables with respect to eQTL ascertainment; indeed, controlling for PCs substantially increases power to detect cis-eQTLs within these studies [26]. Importantly, it has also been demonstrated that each of these eQTL discoveries is also more likely to replicate across studies [26].”
Or this one:
“We next sought to investigate the biological characteristics associated with the reproducibility and cell specificity of eQTLs. To do this, we quantified the overlap between cis-eQTL SNPs and genomic features associated with functional cis-regulatory elements (CREs), including DHS sites, chromatin marks, and binding sites for transcription factors and other DNA associated regulatory proteins (see Table S3 for full list of data sets). We categorized regions of open or activating chromatin, and regions of transcription factor or DNA protein binding as activating CREs, and regions of repetitive, repressive, or heterochromatic chromatin domains as repressive CREs, to draw a contrast between genomic regions where transcription factor binding is frequent and regions where it is discouraged or unlikely. We focused analyses of LCL eQTL SNPs on CRE data sets produced in LCLs (primarily GM12878) and analyses of liver eQTLs on CRE data sets produced in HepG2 cells, a well-characterized, if imperfect, proxy for hepatocyte biology. We note that the quantification of eQTL SNP-CRE overlap enrichments is inherently conservative, given that the boundaries of most genomically defined CRE types are imprecise and that eQTL SNPs are typically tag SNPs, rather than the exact causal variants.”
At this point, there are three possibilities: The first is that the authors indeed possess the secret of “empowerment,” which enables them to speak in tongues and perform divine healing, while at the same time rendering them immune to “snakes” like me. If this is so, we should all genuflect and accept their superior authority. The second possibility is that the GWAS-ENCODE-AI crowd does not speak in tongues, but that the rest of the scientists in all branches of genomics, molecular evolution, molecular biology, genetics, and biochemistry are low-grade imbeciles (see below) deficient in their ability to comprehend the superior knowledge of the empowered. If this is so, then again, we should all genuflect and accept their intellectual supremacy.
The third possibility is that the GWAS-ENCODE-AI people are pulling a Sokal-like hoax on all of us. I, for one, would be very surprised if any of the so-called reviewers who were supposed to read the Brown, Mangravite, and Engelhardt paper in PLoS Genetics would notice anything bizarre in the paragraph below (which I have written in the Alan Sokal style). Would the paragraph below be out of place in the midst of the Pentecostal gibberish spewed by Brown, Mangravite, and Engelhardt in their so-called scientific paper?
“The different intuitive pictures which we use to describe CRE systems, although fully adequate for given experiments, are nevertheless mutually exclusive. Thus, for instance, the overlap between cis-eQTL SNPs and genomic features, including DHS sites, can be described as a small-scale hepatic system, having a central nucleus about which the external cytoplasm revolves. For other experiments, however, it might be more convenient to imagine that the activating chromatin and regions of transcription-factor binding are surrounded by a system of stationary SNPs whose frequency is characteristic of species. Finally, we can consider the cell from a chemical point of view. Each covariate annotation is legitimate when used in the right place, but the different covariate may be contradictory and therefore we call them mutually controlling for latent variables with respect to eQTL ascertainment.”