Hi there!
I have a .vcf.gz file that I can only access via the command line (on a Linux cluster). The file covers only chr 2 and is quite big, so I don't know how many columns it has. I want to extract all the column information for one SNP whose ID I know. The output file also needs to contain the same header and all the columns from the original, but with the record for only this SNP (so that I can run other code on it).
So far, the only thing that worked for selecting the SNP (but it doesn't keep the header or the other columns) was:
bcftools query -i 'ID="snp id"' -f'[%SAMPLE\t%DS\t%REF\t%ALT\n]' file.in.vcf.gz > file.out.vcf.gz
I also tried:
bcftools view -i 'ID="snp id"' <file.in.vcf.gz> -o <file.out.vcf.gz>
which returned the error:
-bash: syntax error near unexpected token `newline'
I also tried this one, with the same error:
bcftools query -i 'ID="snp id"' file.in.vcf.gz > file.out.vcf.gz
Hope you can help me figure this out. I am new to bcftools; I read the manual but couldn't find anything on this.
Thanks!
It's a problem with how you're invoking bcftools. There is something in the context that we cannot see from the snippet you provided. UNLESS... are you really typing the literal expression
<file.in.vcf.gz>
?
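If so, that would explain the error: in bash, < and > are redirection operators, so the angle brackets around the file names are parsed as redirections, and the final > has no target before the end of the line, hence "unexpected token `newline'". In manuals, angle brackets just mark a placeholder and are not meant to be typed. Here is a minimal sketch of what I would try instead, assuming your bcftools supports the usual view options (-i to filter records, -Oz for compressed VCF output, -o for the output file). The function name extract_snp and the ID rs123 are made up for illustration.

```shell
# Hypothetical helper (the name is an assumption, not part of bcftools):
#   extract_snp <snp_id> <in.vcf.gz> <out.vcf.gz>
extract_snp() {
    snp_id=$1
    in_vcf=$2
    out_vcf=$3
    # "view" keeps the full header and every sample column;
    # -i keeps only the records whose ID field equals the given ID;
    # -Oz writes bgzip-compressed VCF, -o names the output file.
    # Note: no angle brackets around the file names.
    bcftools view -i "ID=\"${snp_id}\"" -Oz -o "$out_vcf" "$in_vcf"
}
```

You would call it as: extract_snp rs123 file.in.vcf.gz file.out.vcf.gz. I used view rather than query because query prints only the fields named in -f, which is why your first command lost the header and the other columns.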