Re-joining sample IDs and SNP information from BGEN files for analysis in R
12 months ago
nreid • 0

Hey, I have some UKBB data from which I have pulled a set of SNPs, and I need to analyse them in R. To do that, I need to get the sample IDs reconnected to the genotype information so that I can merge it with the phenotype file later on. Can someone point me in the general direction of doing that? Thank you so much.

snp R SNP • 409 views
12 months ago
Sam ★ 3.8k

You can try converting the BGEN file into a VCF or uncompressed GEN file and extracting the SNPs / samples at the same time using QCTOOL, developed by the BGEN format's author.

I've done this, but there appear to be no sample identifiers in the VCF file, so the data is essentially useless for my analysis, since I need to merge it with phenotype data. For example, when using the snpStats package on a PED file, I usually get the SNP of interest coupled to the patient identifier, with 0, 1, 2 for null/het/homo at that SNP. For whatever reason, I can't convert to PED files.
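
To make the target layout concrete, here is a minimal base-R sketch of the coding described above: one row per sample, a 0/1/2 count per SNP, merged with a phenotype table on the sample ID. Everything in it is made up for illustration (the IDs, the rs names, the trait column, the helper gt_to_dosage), and it assumes biallelic hard calls written VCF-style as "0/1", where the count is the number of alternate alleles.

```r
# Stand-in genotype calls in long format: one row per sample per SNP.
# All IDs and values are hypothetical.
geno_long <- data.frame(
  sample_id = c("id_1", "id_2", "id_3", "id_1", "id_2", "id_3"),
  snp       = c("rs_demo1", "rs_demo1", "rs_demo1",
                "rs_demo2", "rs_demo2", "rs_demo2"),
  gt        = c("0/0", "0/1", "1/1", "0|1", "./.", "0/0"),
  stringsAsFactors = FALSE
)

# Count alternate alleles in a biallelic genotype string ("0/1", "1|1", ...).
# Anything that is not two alleles drawn from {0, 1} (e.g. "./.") becomes NA.
gt_to_dosage <- function(gt) {
  alleles <- strsplit(gt, "[/|]")
  vapply(alleles, function(a) {
    if (length(a) != 2 || any(!a %in% c("0", "1"))) return(NA_integer_)
    sum(a == "1")
  }, integer(1))
}

geno_long$dosage <- gt_to_dosage(geno_long$gt)

# Reshape to one row per sample and one column per SNP.
geno_wide <- reshape(
  geno_long[, c("sample_id", "snp", "dosage")],
  idvar = "sample_id", timevar = "snp", direction = "wide"
)

# Hypothetical phenotype table keyed on the same sample ID.
pheno <- data.frame(
  sample_id = c("id_1", "id_2", "id_3"),
  trait     = c(1.2, 0.7, 2.1),
  stringsAsFactors = FALSE
)

merged <- merge(pheno, geno_wide, by = "sample_id")
print(merged)
```

The merge only works if the genotype table carries the same sample identifiers as the phenotype file, which is exactly what is missing from my VCF.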

12 months ago
nreid • 0

EDIT: I solved my issue using the vcfR package and a little more patience reading things. Reading the filtered VCF into R using test = read.vcfR("file/path/here") and then extracting with tidytest = extract_gt_tidy(x = test) yielded the results I was looking for.
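
For anyone landing here later, a minimal sketch of those two calls on a toy VCF, so it can be run without UKBB access. The VCF contents, sample names and rs IDs are invented. Mapping the Key column back to the SNP ID with getID() is my own addition, on the assumption that Key is the row index of the variant in the VCF.

```r
library(vcfR)

# Build a tiny two-sample, two-SNP VCF in a temporary file (made-up data).
header_cols <- c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
                 "INFO", "FORMAT", "sample_1", "sample_2")
vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">",
  paste(header_cols, collapse = "\t"),
  paste(c("1", "1000", "rs_demo1", "A", "G", ".", "PASS", ".", "GT",
          "0/0", "0/1"), collapse = "\t"),
  paste(c("1", "2000", "rs_demo2", "C", "T", ".", "PASS", ".", "GT",
          "1/1", "0/1"), collapse = "\t")
)
vcf_path <- tempfile(fileext = ".vcf")
writeLines(vcf_lines, vcf_path)

# The two calls from the post, pointed at the toy file.
test <- read.vcfR(vcf_path, verbose = FALSE)
tidytest <- extract_gt_tidy(x = test, verbose = FALSE)

# Key indexes the variant row; look up the SNP ID from the fixed columns.
tidytest$snp <- getID(test)[tidytest$Key]

# One row per sample per SNP, with the sample name in the Indiv column.
print(tidytest[, c("snp", "Indiv", "gt_GT")])
```

The Indiv column holds whatever sample names are in the VCF header, so it can only be merged with the phenotype file if the VCF was written with the real sample IDs.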