Question: Extracting Data From Vcf File
7.5 years ago, bioinfo (740) wrote:

Hi, I extracted the homozygous SNPs from my VCF file into a new VCF file, but the new file has no INFO section in the header. How can I keep the INFO section untouched? My command:

more file.vcf | grep "1/1" > homo.vcf

What do I need to add to the command above to preserve the 13-line INFO section at the top of the VCF file?

vcf gatk samtools vcftools • 4.4k views
7.5 years ago, Pierre Lindenbaum (126k; France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Try a regular expression with egrep. The pattern '^#' matches a hash at the start of a line, so the header lines are kept along with the 1/1 records:

egrep '(^#|1/1)' file.vcf > homo.vcf
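To see what the pattern keeps, here is a minimal sketch on a made-up three-line file (the sample records are hypothetical, and file.vcf only stands in for the real input). It uses grep -E, which provides the same extended-regex mode as egrep. Note that 1/1 is matched anywhere on a line, so this assumes the string does not occur outside the genotype column.

```shell
# Hypothetical three-line input; file.vcf stands in for the real VCF.
printf '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">\n' > file.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=10\tGT\t1/1\n' >> file.vcf
printf 'chr1\t200\t.\tC\tT\t50\tPASS\tDP=12\tGT\t0/1\n' >> file.vcf

# grep -E is the same extended-regex mode that egrep provides.
# ^# keeps every header line; 1/1 keeps lines containing that genotype string.
grep -E '(^#|1/1)' file.vcf > homo.vcf

cat homo.vcf   # the ##INFO line and the record at position 100 remain
```

Because the header and the records are written in a single pass, nothing has to be appended afterwards and the header length does not need to be known.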

After that, you can run your previous command, but make sure to append (>>) so that you do not overwrite the file.

more file.vcf | grep "1/1" >> homo.vcf

Also, if you know exactly how many header lines you want to keep (say, 13), you can run:

head -n 13 file.vcf > homo.vcf

to put the header in a file. Run this before the appending command, since > overwrites homo.vcf.
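If the header length is not known in advance, the same two steps can be done without counting lines. This is a minimal sketch, assuming the header consists of exactly the lines that start with #; the sample records are made up, and file.vcf only stands in for the real input.

```shell
# Hypothetical input; file.vcf stands in for the real VCF.
printf '##fileformat=VCFv4.1\n' > file.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n' >> file.vcf
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=10\tGT\t1/1\n' >> file.vcf
printf 'chr1\t200\t.\tC\tT\t50\tPASS\tDP=12\tGT\t0/1\n' >> file.vcf

# Step 1: write every header line (those starting with #), creating homo.vcf.
grep '^#' file.vcf > homo.vcf

# Step 2: append the 1/1 records; >> adds to the file instead of replacing it.
grep -v '^#' file.vcf | grep '1/1' >> homo.vcf
```

Filtering out the header with grep -v before searching for 1/1 avoids writing a header line twice if one happens to contain that string.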

Reply written 7.5 years ago by DG (7.1k)