Can anyone advise which parameters should be considered when filtering RNA-seq variants, and why? I haven't found any published material on this, as nothing seems to be standardised. I am working on a non-model organism (water buffalo) for which there is no truth set (such as dbSNP, although dbSNP data is itself questionable). If anyone has come across a document describing RNA-seq variant filtering criteria, I would be grateful if you could share it.
I have plotted the distributions of some parameters, but I am unable to decide on thresholds from the plots alone. The parameters I looked at are QUAL, DP, GQ (genotype quality) and SP (Phred-scaled strand bias P-value).
PS: The variants were called using bcftools mpileup and bcftools call. Please let me know if any more information is needed for this question.
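
For reference, here is a minimal sketch of how the distributions of the four parameters above could be summarised as percentiles, which may be easier to compare than plots when looking for candidate cut-offs. It assumes an uncompressed, single-sample VCF in which DP, GQ and SP are present as FORMAT fields (whether they are depends on the annotation options given to bcftools mpileup and bcftools call). The function names `collect_metrics` and `percentiles` are hypothetical helpers, not part of bcftools, and the percentiles it prints are descriptive only, not recommended thresholds.

```python
import statistics
import sys


def collect_metrics(vcf_lines, fields=("DP", "GQ", "SP")):
    """Collect QUAL plus selected FORMAT fields of the first sample.

    Assumption: each record has at least 10 tab-separated columns and the
    requested fields are single numeric values in the FORMAT column.
    """
    metrics = {"QUAL": []}
    for name in fields:
        metrics[name] = []
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue
        if cols[5] != ".":
            metrics["QUAL"].append(float(cols[5]))
        # Pair FORMAT keys with the first sample's values.
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        for name in fields:
            value = sample.get(name, ".")
            if value != ".":
                metrics[name].append(float(value))
    return metrics


def percentiles(values, probs=(5, 25, 50, 75, 95)):
    """Return {percent: value}; empty dict if fewer than two values."""
    if len(values) < 2:
        return {}
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in probs}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarise_vcf.py calls.vcf
    with open(sys.argv[1]) as handle:
        collected = collect_metrics(handle)
    for name, values in collected.items():
        print(name, len(values), percentiles(values))
```

A compressed file would first need to be converted to plain text, for example with `bcftools view`. Comparing the tails of each distribution (for instance the lowest 5% of QUAL or the highest 5% of SP) is one way to look for candidate cut-offs, but the choice still has to be judged against the data.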