Copy Sample ID from VCF file to ID column
6.5 years ago
waqasnayab ▴ 250

Hi,

I have a VCF file. If any sample has genotype 0/1 or 1/1, then that sample's ID should be appended to the ID column, like this (I am showing only the ID and sample columns here):

ID sample1 sample2 sample3
rs123456 0/0 0/1 0/1 
rs789101 1/1 0/0 0/1
rs11213145 0/0 1/1 0/0

My desired output would be:

ID sample1 sample2 sample3
rs123456:sample2:sample3 0/0 0/1 0/1 
rs789101:sample1:sample3 1/1 0/0 0/1
rs11213145:sample2 0/0 1/1 0/0

Any awk or sed solution would be helpful. The answers I found show how to copy the contents of one column into another, but not how to copy a value from the header (here the sample names: sample1, etc.).

Thanks,

Waqas.

SNP next-gen genome sequencing • 2.1k views
6.5 years ago

Assuming your VCF is really a VCF file, you can use BioAlcidaeJdk: http://lindenb.github.io/jvarkit/BioAlcidaeJdk.html

 java -jar dist/bioalcidaejdk.jar  -e 'println("ID\t"+header.getSampleNamesInOrder().stream().collect(Collectors.joining("\t")));stream().filter(V->V.hasID()).forEach(V->println(V.getID()+":"+V.getGenotypes().stream().filter(G->G.isHet()||G.isHomVar()).map(G->G.getSampleName()).collect(Collectors.joining(":"))+"\t"+header.getSampleNamesInOrder().stream().map(S->V.getGenotype(S).getAlleles().stream().map(A->String.valueOf(V.getAlleleIndex(A))).collect(Collectors.joining("/"))).collect(Collectors.joining("\t"))));' input.vcf
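
If installing jvarkit is not an option, the same idea can be sketched in plain Python. This is only a rough illustration, not an equivalent of the command above: it rewrites the ID column in place and keeps every other column, instead of printing an ID-plus-genotypes table. It assumes a tab-delimited VCF with a #CHROM header line and GT as the first FORMAT key, and it counts a sample as soon as one called allele is non-reference. The function name annotate_ids and the positions and alleles in the demo rows are made up for illustration; only the IDs and genotypes come from the question.

```python
def annotate_ids(lines):
    """Yield VCF lines with the names of non-reference samples appended to ID."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the 10th column of the header line.
            samples = fields[9:]
            yield line
            continue
        if len(fields) < 10:
            # No genotype columns: nothing to annotate.
            yield line
            continue
        carriers = []
        for name, call in zip(samples, fields[9:]):
            # Assumption: GT is the first colon-separated FORMAT field.
            gt = call.split(":", 1)[0]
            alleles = gt.replace("|", "/").split("/")
            # Keep the sample if any called allele is not the reference (0).
            if any(a not in ("0", ".", "") for a in alleles):
                carriers.append(name)
        if carriers:
            fields[2] = ":".join([fields[2]] + carriers)
        yield "\t".join(fields)


# Demo rows: IDs and genotypes from the question; CHROM, POS, REF and ALT
# are invented placeholders.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\tsample2\tsample3",
    "1\t100\trs123456\tA\tG\t.\t.\t.\tGT\t0/0\t0/1\t0/1",
    "1\t200\trs789101\tC\tT\t.\t.\t.\tGT\t1/1\t0/0\t0/1",
    "1\t300\trs11213145\tG\tA\t.\t.\t.\tGT\t0/0\t1/1\t0/0",
]

for out in annotate_ids(demo):
    print(out)
```

To run it on a real file, pass an open file handle instead of the demo list, for example annotate_ids(open("input.vcf")). Records whose ID is missing (".") are not skipped here, unlike the hasID() filter in the command above.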

Yes, Pierre, it worked for me. I am still cross-checking the output and will confirm once I have verified the command.
