I have a vcf file with 23 chromsomes and other unwanted contigs. I want to extract a VCF file with chromsome 1 to chromsome 5 in one file. I want to include the header line as well. How can I do this in the most efficient way? Thanks
SITE FILTERING OPTIONS
These options are used to include or exclude certain sites from any analysis being performed by the program.
Includes or excludes sites with indentifiers matching <chromosome>. **These options may be used multiple times to include or exclude more than one chromosome.**
This will preserve the header of course. In addition, the code posted above in the comments will also get the header as it is getting lines with # as well as chr[1-5] (the statement includes an or that will grab lines starting with # or with chr1, chr2, chr3, etc.
Keep in mind that the posted solution only works for single-digit chromosomes, so chr1, chr2, chr3 (...), but not chr10-22 and X. Using chr[1-22] will also not work, as you have to specify to search for double digits.
If you want all regular chromosomes, so 1-22 and X, but discard U, random contigs and stuff from a VCF, use: