It is generally accepted that reads carrying the reference allele are more likely to align to the reference genome, and that this bias creates artifacts when estimating allelic expression. Surprisingly, I found no reference bias at all in my data: the average reference ratio (= ref counts / total counts) across all heterozygous loci is 49.5%. My RNA-Seq reads were aligned to the reference with STAR, and variants were then called following the GATK best practices. To calculate the reference ratio at each locus, I used the read counts directly from the VCF output.
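For concreteness, here is a minimal sketch of this kind of calculation, not necessarily the exact script used. It assumes a single-sample VCF in which the per-allele read counts are stored in the FORMAT `AD` field and the genotype in `GT`, and it keeps only biallelic heterozygous sites. The function names and the `min_depth` parameter are hypothetical, introduced only for illustration.

```python
def reference_ratios(vcf_lines, min_depth=1):
    """Yield ref / (ref + alt) for each biallelic heterozygous site.

    Assumes a single-sample VCF with GT and AD in the FORMAT column.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample column present
        if "," in fields[4]:
            continue  # skip multiallelic sites
        # Pair FORMAT keys with the values of the first (only) sample
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        genotype = sample.get("GT", "").replace("|", "/")
        if genotype not in ("0/1", "1/0"):
            continue  # keep heterozygous calls only
        try:
            ref_count, alt_count = (int(x) for x in sample.get("AD", "").split(","))
        except ValueError:
            continue  # AD missing or malformed
        total = ref_count + alt_count
        if total < min_depth:
            continue  # too few reads (also avoids division by zero)
        yield ref_count / total


def mean_reference_ratio(vcf_lines, min_depth=1):
    """Average the per-locus reference ratios; NaN if no site qualifies."""
    ratios = list(reference_ratios(vcf_lines, min_depth))
    return sum(ratios) / len(ratios) if ratios else float("nan")
```

With an uncompressed VCF this could be called as `mean_reference_ratio(open("calls.vcf"))`, where the file name is a placeholder. Note that this averages the per-locus ratios; pooling all ref counts and dividing by all total counts would weight high-coverage loci more heavily and can give a different number.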
Does this make sense to you? Maybe I did something wrong? My thought is that variant calling favors loci with the alternative allele, and this might reduce the apparent reference bias.
Hope to hear your insight! Thank you!