I have some DNA-seq data for amplicons. I observed very weird variants at a proportion of roughly 30% that I cannot explain. They are in regions of quite bad sequencing quality, so in theory I would not trust them. But those exact same variants seem to occur in all clones deriving from the same parental clone... I still think it must be a technical bias, but I am worried.
I was advised to use a control sample. The names that came up were "NA12878" and "Promega reference", but I did not have much luck googling them. I did find a PhiX control that seems to be what I need, but I am not 100% sure how it would work. I have read that PhiX can be spiked in to add diversity and help the sequencer, but I do not understand why that is useful either.
Could someone help me get some clarity on this, please?
-Which control samples are available for amplicon data, and which is the best?
-Should I have a control in every lane, to check that the variants do not come from a technical bias specific to one lane?
-And then what should I do with it? My first idea is to find all the variants in the control and declare that if I find those same variants in my amplicon reads at the same position (within the reads), then they are not trustworthy and are due to technical bias. Or are there smarter things to do?
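To make that first idea concrete, here is a minimal Python sketch of what I have in mind. Everything in it is made up for illustration: the `Variant` record, the `split_by_control` function and the 1% threshold are my own assumptions, and it supposes the variants have already been called and reduced to a position within the read plus the base change.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one called variant."""
    read_pos: int   # position within the read (0-based, my own convention)
    ref: str        # reference base(s)
    alt: str        # observed base(s)
    freq: float     # fraction of reads carrying the variant


def split_by_control(sample, control, min_control_freq=0.01):
    """Separate sample variants into (trusted, suspect).

    A sample variant is 'suspect' if the control shows the same base
    change at the same read position with a frequency of at least
    min_control_freq (an arbitrary threshold, to be tuned).
    """
    # Keys of the changes that also show up in the control
    artifact_keys = {
        (v.read_pos, v.ref, v.alt)
        for v in control
        if v.freq >= min_control_freq
    }
    trusted, suspect = [], []
    for v in sample:
        if (v.read_pos, v.ref, v.alt) in artifact_keys:
            suspect.append(v)
        else:
            trusted.append(v)
    return trusted, suspect


# Toy data, not real results
control = [Variant(120, "A", "G", 0.25), Variant(40, "C", "T", 0.001)]
sample = [Variant(120, "A", "G", 0.30), Variant(40, "C", "T", 0.45)]

trusted, suspect = split_by_control(sample, control)
print("trusted:", trusted)
print("suspect:", suspect)
```

This only compares positions and base changes; it does not look at base qualities or strand, which is partly why I wonder whether there is something smarter.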
Thanks a lot for any help!!!