Hi, I have some mouse samples. I would like to determine the sex of each sample from the sequencing data, to check for sample swaps or corruption. I have VCF files called with vanilla freebayes. I found that there is a bcftools plugin (vcf2sex) that determines sex from VCF files.
However, it seems to require ploidy information for the given genome (see, for example, this issue: https://github.com/samtools/bcftools/issues/175). This information should be in the following format:
space/tab-delimited list of CHROM,FROM,TO,SEX,PLOIDY
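To show how I currently read this format, here is a minimal Python sketch. It is only my interpretation, not the plugin's actual code: I assume each line assigns a ploidy to one region for one sex, and that any region not listed is diploid. The names `PloidyRule`, `parse_ploidy` and `ploidy_at` are hypothetical, and the coordinates in `EXAMPLE` are placeholders, not real mouse coordinates.

```python
from typing import List, NamedTuple


class PloidyRule(NamedTuple):
    chrom: str
    start: int
    end: int
    sex: str
    ploidy: int


def parse_ploidy(text: str) -> List[PloidyRule]:
    """Parse space/tab-delimited CHROM FROM TO SEX PLOIDY lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, sex, ploidy = line.split()
        rules.append(PloidyRule(chrom, int(start), int(end), sex, int(ploidy)))
    return rules


def ploidy_at(rules: List[PloidyRule], chrom: str, pos: int, sex: str,
              default: int = 2) -> int:
    """Return the ploidy for a position and sex.

    Assumption: regions without a matching rule are diploid (default=2).
    """
    for rule in rules:
        if rule.chrom == chrom and rule.sex == sex and rule.start <= pos <= rule.end:
            return rule.ploidy
    return default


# Placeholder rows only; these are NOT real GRCm38 coordinates.
EXAMPLE = """\
X 1 1000 M 1
Y 1 1000 M 1
Y 1 1000 F 0
"""

rules = parse_ploidy(EXAMPLE)
```

With these placeholder rows, `ploidy_at(rules, "X", 500, "M")` gives 1 (haploid X in males), while the same position for `"F"` falls back to the assumed default of 2.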
I am using GRCm38.82 (http://ftp.ensembl.org/pub/release-82/gtf/mus_musculus/README) for read mapping, and I cannot find the ploidy information for this genome release.
I would like to ask:
- What is the most reliable way to determine the sex of a sample from either a BAM or a VCF file?
- What exactly does the ploidy information mean? Are these the coordinates of the pseudoautosomal regions? And why is this information required for vcf2sex to work properly?
- Where can I find the ploidy information for a given genome release?