In the STAR 2.7 docs, this setting is defined:
14.4 Genome Parameters
...
--genomeConsensusFile
    default: -
    string: VCF file with consensus SNPs (i.e. alternative allele is the major (AF>0.5) allele)
But I am unsure what such a file should look like, exactly. If there are, for instance, 4000 samples, presumably I wouldn't need to keep all of those samples in the VCF after filtering for sites with alternative allele frequency > 0.5? Would one just randomly select a sample to keep in that case?
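To make concrete what I have in mind, here is a minimal Python sketch of the filtering step. It rests on several assumptions: that the input VCF carries an `AF` entry in its INFO column, that only biallelic SNPs are wanted, and that a sites-only VCF (first 8 columns, no FORMAT or sample columns) would be acceptable to STAR. I have not confirmed the last point, and the function name `consensus_sites` is just a hypothetical helper, not anything provided by STAR.

```python
def consensus_sites(vcf_lines, min_af=0.5):
    """Yield sites-only VCF lines for biallelic SNPs whose ALT AF exceeds min_af.

    Assumption: the allele frequency is stored as AF=<float> in the INFO column.
    """
    bases = {"A", "C", "G", "T"}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Pass meta-information lines through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Drop FORMAT and per-sample columns from the column header.
            yield "\t".join(fields[:8])
            continue
        if len(fields) < 8:
            continue  # skip malformed records
        ref, alt, info = fields[3], fields[4], fields[7]
        # Keep single-base REF/ALT only; this skips indels and multi-allelic sites.
        if ref not in bases or alt not in bases:
            continue
        af = None
        for entry in info.split(";"):
            if entry.startswith("AF="):
                try:
                    af = float(entry[3:])
                except ValueError:
                    af = None
                break
        if af is not None and af > min_af:
            # Keep CHROM..INFO only; no genotype of any single sample is retained.
            yield "\t".join(fields[:8])
```

With this approach no sample would need to be picked at random, because the genotype columns are dropped entirely. Whether that is what STAR expects is exactly what I am unsure about.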
A follow-on to this: how does using --genomeConsensusFile interact with --varVCFfile, which holds variants for a specific sample, if at all? Is it unnecessary to use --genomeConsensusFile if using --varVCFfile?
Finally, which one of these (maybe both? maybe neither?) affects behavior when setting --waspOutputMode?
14.24 WASP parameters
--waspOutputMode
    default: None
    string: WASP allele-specific output type. This is re-implementation of the original WASP mappability filtering by Bryce van de Geijn, Graham McVicker, Yoav Gilad & Jonathan K Pritchard. Please cite the original WASP paper: Nature Methods 12, 1061–1063 (2015), https://www.nature.com/articles/nmeth.3582 .
    SAMtag    add WASP tags to the alignments that pass WASP filtering