Question: Filter VCF with txt
leeandroid wrote (20 months ago):

Hi everyone.

I'm trying to filter a VCF with a text file that contains only an ID column, but I keep getting the following error message:

[E::bcf_sr_regions_init] Could not parse the file list.txt, using the columns 1,2[,-1]
Failed to read the targets: ^list.txt

I've been using the following command:

$ bcftools view -T ^list.txt my.vcf > vcf_filtered

Is it possible to do such an operation, or should my text file have more fields? Keep in mind that my goal is to keep only the SNPs listed in the text file.

Thank you in advance.

Alex Reynolds wrote (20 months ago):

$ grep '^#' snps.vcf > snps.header.vcf
$ grep -w -F -f list.txt snps.vcf > snps.filtered.noHeader.txt
$ cat snps.header.vcf snps.filtered.noHeader.txt > snps.filtered.withHeader.vcf

If you want to be fancypants and not waste time making intermediate files, this would be faster:

$ cat <(grep '^#' snps.vcf) <(grep -w -F -f list.txt snps.vcf) > snps.filtered.withHeader.vcf

If you want to make it yet faster:

$ export LC_ALL=C
$ cat <(grep '^#' snps.vcf) <(grep -w -F -f list.txt snps.vcf) > snps.filtered.withHeader.vcf

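VCF files rarely contain non-ASCII characters, so switching to the C locale, where grep treats the input as plain bytes instead of decoding multibyte characters, can make pattern matching noticeably faster. The variable has to be exported for grep to see it.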
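
One caveat with the grep approach: grep looks for the listed strings anywhere on a line, not only in the ID column, so an ID could also match text in another field. If that matters for your data, a minimal sketch with awk that compares only the third (ID) column might look like the following. The sample records, the IDs, and the output name snps.filtered.vcf are made up for illustration, and the sketch assumes one ID per line in list.txt and a single ID per VCF record (the VCF format allows several IDs separated by semicolons, which this would not handle).

```shell
#!/usr/bin/env bash
# Sketch: keep only records whose ID column (column 3) is listed in list.txt.
# All file contents below are invented sample data.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Tiny example VCF: two header lines and three records.
printf '##fileformat=VCFv4.2\n' > snps.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> snps.vcf
printf '1\t100\trs12\tA\tG\t.\t.\t.\n' >> snps.vcf
printf '1\t200\trs123\tC\tT\t.\t.\t.\n' >> snps.vcf
printf '2\t300\trs9\tG\tA\t.\t.\t.\n' >> snps.vcf

# One ID per line.
printf 'rs12\nrs9\n' > list.txt

# While reading the first file (NR==FNR), remember every listed ID.
# While reading the VCF, pass header lines through and print a record
# only when its third column is exactly one of the remembered IDs.
awk -F'\t' 'NR==FNR { keep[$1]; next } /^#/ || ($3 in keep)' \
    list.txt snps.vcf > snps.filtered.vcf

cat snps.filtered.vcf
```

Here rs123 is dropped even though it contains the listed string rs12, and the header comes along in the same pass, so no intermediate files are needed.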

leeandroid replied (20 months ago):

Thank you, Alex. It worked!
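
For later readers wondering about the original error: the -T option of bcftools expects a tab-delimited file of positions (CHROM and POS columns), which is why a one-column ID list could not be parsed, and the ^ prefix excludes the listed targets rather than keeping them. To my knowledge, bcftools filtering expressions can instead read IDs from a file with ID=@file, so a sketch along these lines may do the job without grep. The function name filter_vcf_by_id and the output name my.filtered.vcf are hypothetical, and the expression syntax is worth checking against the manual for your bcftools version.

```shell
#!/usr/bin/env bash
# Sketch: let bcftools select records by ID.
# Assumes bcftools is installed and the ID file holds one ID per line.
set -euo pipefail

# Hypothetical helper: print the records of VCF $2 whose ID is listed in $1.
filter_vcf_by_id() {
    local ids=$1 vcf=$2
    # -i/--include keeps records matching the expression;
    # ID=@file matches records whose ID appears in the file.
    bcftools view -i "ID=@${ids}" "$vcf"
}

# Run only when the tool and the placeholder input files are present.
if command -v bcftools >/dev/null 2>&1 && [ -f list.txt ] && [ -f my.vcf ]; then
    filter_vcf_by_id list.txt my.vcf > my.filtered.vcf
else
    echo "bcftools, list.txt or my.vcf not found; nothing done" >&2
fi
```

Because bcftools parses the VCF itself, the header is preserved and only the ID column is compared.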