22 months ago
wrinklypalms
How do I add sample.id to the INFO column, as in the example below?
Data to reproduce the input and the desired output:
input:
#CHROM POS ID REF ALT INFO sample.id
1 1 1724503 rs33952351 C CA AC=2;AF=1.00;AN=2 in1
2 1 1749412 rs115507312 T G AC=1;AF=0.500;AN=2 in1
3 1 2488153 rs4870 A G AC=1;AF=0.500;AN=2 in1
4 1 24503 . C CA AC=2;AF=1.00;AN=2 in2
5 1 1749412 rs115507312 T G AC=2;AF=0.500;AN=2 in2
6 22 2488153 rsid2 A G AC=1;AF=0.500;AN=2 in2
7 1 1724503 . C T AC=2;AF=1.00;ZZ=T in3
8 1 1749412 rs115507312 T G AC=1;AF=0.500;ZZ=F in3
9 1 2488153 rs4870 A G AC=1;AF=0.500;ZZ=T in3
desired output:
#CHROM POS ID REF ALT INFO
1 1 24503 . C CA AC=2;AF=1.00;AN=2;Identified=in2
2 1 1724503 rs33952351 C CA AC=2;AF=1.00;AN=2;Identified=in1
3 1 1749412 rs115507312 T G AC=1;AF=0.500;AN=2;Identified=in1,in2,in3;ZZ=F
4 1 1724503 . C T AC=2;AF=1.00;ZZ=T;Identified=in3
5 1 2488153 rs4870 A G AC=1;AF=0.500;AN=2;Identified=in1,in3;ZZ=T
6 22 2488153 rsid2 A G AC=1;AF=0.500;AN=2;Identified=in2
my code:
uid <- paste(file$`#CHROM`, file$POS, file$REF, file$ALT, sep = "_")
INFO <- strsplit(file$INFO, split = ";")
for (i in 1:6) {
  for (j in 1:9) {
    samples <- unique(file$sample.id[which(file$uid == uid[i])])
    if (length(samples) > 1) {
      INFO[[j]] <- c(INFO[[j]], paste0('Identified=', paste(samples, collapse = ",")))
    } else if (length(samples) == 1) {
      INFO[[j]] <- c(INFO[[j]], paste0('Identified=', samples))
    } else {
      break
    }
    INFO[[j]] <- sort(INFO[[j]])
    INFO[[j]] <- paste(INFO[[j]], collapse = ";")
    file$INFO[j] <- INFO[[j]]
  }
}
I can't figure out the logic, and everything I've tried has produced output like this:
[7] "AC=2;AF=1.00;Identified=in1;ZZ=T;Identified=in1,in2,in3;Identified=in1,in3;Identified=in2;Identified=in2;Identified=in3"
[8] "AC=1;AF=0.500;Identified=in1;ZZ=F;Identified=in1,in2,in3;Identified=in1,in3;Identified=in2;Identified=in2;Identified=in3"
[9] "AC=1;AF=0.500;Identified=in1;ZZ=T;Identified=in1,in2,in3;Identified=in1,in3;Identified=in2;Identified=in2;Identified=in3"
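For reference, the repeated `Identified=` tags seem to come from the nested loop: for every `i`, the inner loop appends a tag to every row `j`, and each `INFO[[j]]` has already been collapsed into a single string by the time the next tag is added. One possible way around this is to group the rows by variant key first and build each INFO string exactly once. The following is only a sketch, under a few assumptions that the question does not state: when the same INFO key appears in several samples, the first value seen is kept (this matches `AC=1` in the desired output); the `Identified=` tag is appended last instead of being sorted into place; and `merge_variant` is a hypothetical helper name.

```r
# Toy data copied from the question.
file <- data.frame(
  `#CHROM` = c("1", "1", "1", "1", "1", "22", "1", "1", "1"),
  POS = c(1724503L, 1749412L, 2488153L, 24503L, 1749412L,
          2488153L, 1724503L, 1749412L, 2488153L),
  ID = c("rs33952351", "rs115507312", "rs4870", ".", "rs115507312",
         "rsid2", ".", "rs115507312", "rs4870"),
  REF = c("C", "T", "A", "C", "T", "A", "C", "T", "A"),
  ALT = c("CA", "G", "G", "CA", "G", "G", "T", "G", "G"),
  INFO = c("AC=2;AF=1.00;AN=2", "AC=1;AF=0.500;AN=2", "AC=1;AF=0.500;AN=2",
           "AC=2;AF=1.00;AN=2", "AC=2;AF=0.500;AN=2", "AC=1;AF=0.500;AN=2",
           "AC=2;AF=1.00;ZZ=T", "AC=1;AF=0.500;ZZ=F", "AC=1;AF=0.500;ZZ=T"),
  sample.id = rep(c("in1", "in2", "in3"), each = 3),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One key per variant: chromosome, position, reference and alternate allele.
uid <- paste(file$`#CHROM`, file$POS, file$REF, file$ALT, sep = "_")

# Hypothetical helper: merge the INFO fields of the given rows and append
# an Identified= tag listing the samples that carry the variant.
merge_variant <- function(df, rows) {
  fields <- unlist(strsplit(df$INFO[rows], ";", fixed = TRUE))
  keys <- sub("=.*$", "", fields)
  fields <- fields[!duplicated(keys)]  # assumption: first value per key wins
  samples <- unique(df$sample.id[rows])
  tag <- paste0("Identified=", paste(samples, collapse = ","))
  paste(c(fields, tag), collapse = ";")
}

# Row indices grouped by variant, in order of first appearance.
groups <- split(seq_len(nrow(file)), factor(uid, levels = unique(uid)))
first <- vapply(groups, `[`, integer(1), 1)

# One output row per variant, taken from its first occurrence.
out <- file[first, c("#CHROM", "POS", "ID", "REF", "ALT")]
out$INFO <- unname(vapply(groups, function(r) merge_variant(file, r), character(1)))
rownames(out) <- NULL

print(out)
# out$INFO[2] is "AC=1;AF=0.500;AN=2;ZZ=F;Identified=in1,in2,in3"
```

Rows come out in order of first appearance, not sorted by position as in the desired output; `out[order(out$`#CHROM`, out$POS), ]` would be one way to reorder them if that matters. Wrapping `c(fields, tag)` in `sort()` would place `Identified=` alphabetically among the other fields, at the cost of locale-dependent ordering.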