Re-joining sample IDs and SNP information from BGEN files for analysis in R
3.5 years ago
nreid ▴ 10

Hey, I have some UKBB data from which I have pulled a set of SNPs that I need to analyse in R. To do that, I need to get the sample IDs reconnected to the genotype information so that I can merge it with the phenotype file later on. Can someone point me in the general direction of doing that? Thank you so much.

3.5 years ago
Sam ★ 4.7k

You can try converting the BGEN file into a VCF or an uncompressed GEN file, extracting the SNPs and samples you need in the same step, using QCTOOL, which was developed by the author of the BGEN format.


I've done this, but there appear to be no sample identifiers in the VCF file, which makes the data essentially useless for my analysis, since I need to merge it with the phenotype data. For example, when I use the snpStats package on a PED file, I usually get the SNP of interest paired with the patient identifier and a 0/1/2 code (null/het/homo) for the genotype at that SNP. For whatever reason, I can't convert to PED files.

3.5 years ago
nreid ▴ 10

EDIT: I solved my issue using the vcfR package and a little more patience in reading things through. Reading the filtered VCF into R with test = read.vcfR("file/path/here") and then extracting the genotypes with tidytest = extract_gt_tidy(x = test) yielded the results I was looking for.
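
For anyone landing here later, this is a rough sketch of how the tidy output might be turned into 0/1/2 codes and joined to a phenotype table. It is not exactly what I ran. It assumes the tidy genotype table has the columns Key, Indiv and gt_GT, as I understand extract_gt_tidy() to return them, and it uses a small made-up data frame in place of real data so that it runs without vcfR installed. The helper gt_to_dosage, the sample IDs, and the phenotype columns eid and case are all hypothetical.

```r
# Sketch only: 'tidy' stands in for the genotype table you would get from
#   vcf  <- vcfR::read.vcfR("file/path/here")
#   tidy <- vcfR::extract_gt_tidy(vcf)
# Assumed columns: Key (variant index), Indiv (sample ID), gt_GT (e.g. "0/1").

# Hypothetical helper: count non-reference alleles in a diploid GT string.
# "0/0" -> 0, "0/1" -> 1, "1/1" -> 2; missing or malformed calls -> NA.
# Note that any non-zero allele counts as "alt", so multi-allelic sites
# would need separate handling.
gt_to_dosage <- function(gt) {
  vapply(strsplit(as.character(gt), "[/|]"), function(alleles) {
    if (length(alleles) != 2 || anyNA(alleles) || any(alleles == ".")) {
      return(NA_integer_)
    }
    sum(alleles != "0")
  }, integer(1))
}

# Made-up stand-in data: 2 variants x 3 samples (IDs are invented).
tidy <- data.frame(
  Key   = c(1, 1, 1, 2, 2, 2),
  Indiv = rep(c("1000011", "1000025", "1000038"), times = 2),
  gt_GT = c("0/0", "0/1", "1/1", "0/1", NA, "0/0"),
  stringsAsFactors = FALSE
)
tidy$dosage <- gt_to_dosage(tidy$gt_GT)

# One row per sample, one dosage column per variant (dosage.1, dosage.2, ...).
wide <- reshape(tidy[, c("Indiv", "Key", "dosage")],
                idvar = "Indiv", timevar = "Key", direction = "wide")

# Hypothetical phenotype table keyed by sample ID.
pheno <- data.frame(
  eid  = c("1000011", "1000025", "1000038"),
  case = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# Join genotypes to phenotypes on the sample identifier.
merged <- merge(pheno, wide, by.x = "eid", by.y = "Indiv")
print(merged)
```

The merge only works if the IDs in Indiv match the IDs in the phenotype file, so it is worth checking that the VCF header actually carries the real sample IDs before relying on the join.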
