Determine the sex of a sample from a .vcf
3.2 years ago
Gl_14 ▴ 20

Hi everyone, I am doing my thesis internship and I need some help: I have the .vcf files of 110 patients and I would like to determine their sex. I read that this can be done from the depth of coverage of the sex chromosomes, but I have no idea how to do it. Could someone tell me whether there is software that can do this using the VCF files as input? Thank you.

sequencing next-gen genetics NGS vcf • 3.2k views
3.2 years ago
4galaxy77 2.8k

plink --check-sex should be what you need, assuming you have data on the X chromosome. I always prefer to convert the VCF to PLINK binary format first and then run the sex check.

https://www.cog-genomics.org/plink/1.9/basic_stats#check_sex

I'd try something like this:

plink --vcf data.vcf --make-bed --out data
plink --bfile data --check-sex --out data.sex_check
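
To read the result: PLINK 1.9 writes the report to a file named after --out with a .sexcheck suffix (here that would be data.sex_check.sexcheck). The SNPSEX column holds the inferred sex (1 = male, 2 = female, 0 = ambiguous) and F is the X-chromosome inbreeding estimate it is based on; by default, F below 0.2 is called female and F above 0.8 male. Since a VCF carries no pedigree sex, PEDSEX will be 0 and STATUS will read PROBLEM for every sample, which is expected here. A minimal sketch for tallying the calls (summarise_sexcheck is a hypothetical helper name, not part of PLINK):

```shell
# Tally inferred sexes from a PLINK 1.9 .sexcheck report read on stdin.
# Columns: FID IID PEDSEX SNPSEX STATUS F
# SNPSEX: 1 = male, 2 = female, 0 = ambiguous
summarise_sexcheck() {
  awk 'NR > 1 {                      # skip the header line
         if ($4 == 1) male++
         else if ($4 == 2) female++
         else unknown++
       }
       END { printf "male=%d female=%d ambiguous=%d\n", male, female, unknown }'
}
```

Usage would be something like: summarise_sexcheck < data.sex_check.sexcheck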

Thank you very much! I am not familiar with plink, but I'll try to use it!


Sorry for bothering you again, but I downloaded plink 1.90 beta (build: Stable (beta 6.21, 19 Oct)) and it has neither a "--vcf" nor a "--make-bed" option. How can I add them? I cannot find them in the manual page (typing "man plink"). Thanks.


What happens when you type plink --vcf?


"plink: unknown option "--vcf" "


That means you aren't running plink 1.9 (you can also run "plink --version" to check). Check your system PATH against where you placed the downloaded program, and where any other programs named "plink" may be.
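
One way to check this is to list every executable called plink that your shell can see, in the order it searches for them. On some systems another program is installed under the same name (the PuTTY connection tool is one I know of), and that one would not recognise --vcf. A rough sketch, assuming a POSIX shell; list_on_path is a made-up helper name:

```shell
# List every executable called "$1" on PATH, in lookup order.
# The first path printed is the one a bare command name resolves to.
list_on_path() {
  name=$1
  old_ifs=$IFS
  IFS=:                                # split PATH on colons
  for dir in $PATH; do
    candidate="${dir:-.}/$name"        # an empty PATH entry means the current directory
    [ -f "$candidate" ] && [ -x "$candidate" ] && printf '%s\n' "$candidate"
  done
  IFS=$old_ifs
}
```

Running list_on_path plink should show whether the downloaded binary comes first; if it does not, call it by its full path or put its directory earlier in PATH.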

3.2 years ago
prasundutta87 ▴ 660

You would need the SAM/BAM/CRAM files of the patients. Just check the total number of reads aligned to the X chromosome. Since females have two X chromosomes and males have one, the number of reads aligned to the X chromosome in female samples will be roughly double the number in male samples, once you account for differences in total sequencing depth between samples.

Run Samtools idxstats (http://www.htslib.org/doc/samtools-idxstats.html) on your BAM/SAM/CRAM file. The output is TAB-delimited with each line consisting of reference sequence name, sequence length, # mapped read-segments and # unmapped read-segments. You need the mapped read-segments information of the X-chromosome of all the samples to perform the above described procedure.
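
The procedure above could be sketched like this. It is only an illustration, assuming human contigs named 1-22/X or chr1-chr22/chrX; x_autosome_ratio is a hypothetical helper name. Rather than comparing raw X read counts across samples, it divides X read density by autosomal read density within each sample, so differences in sequencing depth cancel out. A ratio near 1 would suggest two X copies and a ratio near 0.5 one copy.

```shell
# Read `samtools idxstats` output on stdin
# (columns: name, length, mapped reads, unmapped reads) and print the
# ratio of X read density to autosomal read density (reads per base).
x_autosome_ratio() {
  awk -F '\t' '
    { name = $1; sub(/^chr/, "", name) }            # accept "chrX" and "X"
    name == "X" { x_reads += $3; x_len += $2 }
    name ~ /^([1-9]|1[0-9]|2[0-2])$/ { a_reads += $3; a_len += $2 }
    END {
      if (x_len == 0 || a_reads == 0) { print "NA"; exit }
      printf "%.2f\n", (x_reads / x_len) / (a_reads / a_len)
    }'
}
```

With an indexed BAM this would be run as: samtools idxstats sample.bam | x_autosome_ratio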

Of course, this is just one of the many ways you can approach this problem.


Unfortunately, I only have access to the .vcf for each sample, not to the BAM files; however, if I cannot find a way to use the VCFs, I'll ask for access to the BAM files. Thanks a lot!


Dealing with VCFs to get depth of coverage information on sex chromosomes can be a little tricky due to the presence of pseudoautosomal regions (PAR). You can get into that if getting access to alignment files becomes difficult.

This post may be helpful as well: Determine sex from vcf file (or sequencing data)
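
If only VCFs are available, one VCF-only alternative to coverage is to look at heterozygous genotype calls on the X chromosome outside the PARs: males should have almost none there, females a substantial fraction. Below is a rough sketch of that idea, not a validated method. x_het_rate is a hypothetical helper name, and the default PAR boundaries are the GRCh37 ones (PAR1 ends at 2,699,520, PAR2 starts at 154,931,044), which is an assumption about your reference build; for GRCh38 you would pass 2781479 and 155701383 instead.

```shell
# Per-sample heterozygosity on non-PAR chrX from an uncompressed VCF on stdin.
# Arguments (optional): end of PAR1, start of PAR2. Defaults assume GRCh37.
# Output columns: sample, het calls, diploid calls, het fraction.
x_het_rate() {
  par1_end=${1:-2699520}
  par2_start=${2:-154931044}
  awk -F '\t' -v p1="$par1_end" -v p2="$par2_start" '
    /^##/     { next }                                   # meta-information lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) sample[i] = $i; n = NF; next }
    ($1 == "X" || $1 == "chrX") && $2 > p1 && $2 < p2 {
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")                                # GT is the first FORMAT field
        if (split(f[1], a, "[/|]") != 2) continue        # skip haploid calls
        if (a[1] == "." || a[2] == ".") continue         # skip missing genotypes
        called[i]++
        if (a[1] != a[2]) het[i]++
      }
    }
    END {
      for (i = 10; i <= n; i++)
        printf "%s\t%d\t%d\t%.3f\n", sample[i], het[i], called[i],
               (called[i] ? het[i] / called[i] : 0)
    }'
}
```

It would be run as x_het_rate < sample.vcf (or with zcat for a compressed file). Where to draw the line between male and female het fractions depends on the data and the variant caller, so I would plot the values for all 110 samples and look for two clusters rather than trust a fixed cutoff.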


Thanks a lot again!
