Hey, here is how I have done it in the past:
library(data.table)
library(dplyr)
temp_chrom <- vcfR::read.vcfR(vcf_path)
temp_gt <- temp_chrom@gt[,] %>% as.data.table
temp_out <- data.table(CHR = temp_chrom@fix[,"CHROM"], POS = temp_chrom@fix[,"POS"] %>% as.integer(),
REF = temp_chrom@fix[,"REF"],
ALT = temp_chrom@fix[,"ALT"])
Simplified_VCF <- cbind(temp_out, temp_gt)
Here is a snippet of the output:
> Simplified_VCF
CHR POS REF ALT FORMAT Collarless_A_val Collarless_B_val Collarless_C_val
1: CM029350.1 100 T A GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:34,0:34:93:.:.:0,93,1395 0/0:40,0:40:99:.:.:0,114,1710 0/0:39,0:39:58:.:.:0,58,1368
2: CM029350.1 106 A G GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:34,0:34:90:.:.:0,90,1350 0/0:44,0:44:99:.:.:0,120,1800 0/0:43,0:43:48:.:.:0,48,1424
3: CM029350.1 242 A C GT:AD:DP:GQ:PL 0/0:37,0:37:99:0,102,1530 0/0:47,0:47:99:0,101,1709 0/0:49,0:49:99:0,112,1800
4: CM029350.1 1509 G A GT:AD:DP:GQ:PL 1/1:0,57:57:99:2109,170,0 1/1:0,83:83:99:3042,247,0 1/1:0,70:70:99:2452,209,0
5: CM029350.1 1763 G A GT:AD:DP:GQ:PL 0/0:43,0:43:99:0,106,1422 0/0:54,0:54:99:0,106,1516 0/0:62,0:62:99:0,118,1800
---
248278: CM029350.1 27080198 G C GT:AD:DP:GQ:PL 1/1:0,47:47:99:1768,141,0 1/1:0,66:66:99:2439,198,0 1/1:0,64:64:99:2307,191,0
248279: CM029350.1 27080215 A T GT:AD:DP:GQ:PL 1/1:0,49:49:99:1816,146,0 1/1:0,59:59:99:2152,176,0 1/1:0,66:66:99:2498,198,0
248280: CM029350.1 27080272 A G GT:AD:DP:GQ:PGT:PID:PL:PS 1|1:0,44:44:99:1|1:27080272_A_G:1968,132,0:27080272 1|1:0,48:48:99:1|1:27080272_A_G:2141,144,0:27080272 1|1:0,40:40:99:1|1:27080272_A_G:1796,120,0:27080272
248281: CM029350.1 27080280 T C GT:AD:DP:GQ:PGT:PID:PL:PS 1|1:0,42:42:99:1|1:27080272_A_G:1890,126,0:27080272 1|1:0,45:45:99:1|1:27080272_A_G:2025,135,0:27080272 1|1:0,40:40:99:1|1:27080272_A_G:1796,120,0:27080272
248282: CM029350.1 27080293 G A GT:AD:DP:GQ:PGT:PID:PL:PS 1|1:0,34:34:99:1|1:27080272_A_G:1530,102,0:27080272 1|1:0,44:44:99:1|1:27080272_A_G:1980,132,0:27080272 1|1:0,36:36:99:1|1:27080272_A_G:1620,108,0:27080272
The order of rows accessed by @[fix|gt|ann] are preserved, so you can use cbind to bind dataframes. If you do not want to use data.table then replace as.data.table with as.data.frame.
Thank you for the code and idea dthorbur! I think the code to generate temp_output in the last row is missing, which is probably your way of loading the annotations in R. Do you proceed with a chrom object or just with read.csv()?
Oops, I wrote the wrong variable name. I have fixed the line in the post.