Question: How do I remove files that only contain a header OR get tabix not to output a file when the input does not have SNPs in the specific region
kyle wrote (16 months ago):

Hi, I am new to bioinformatics so I apologise if this is an obvious problem that I have missed. I could not find a similar problem online.

I have over 400 vcf files from different Solanum species and I have used tabix to extract my region of interest out of those files. I made a script to run through all of my files. Here is an example of what it looks like:

for f in ~/Location/*.vcf.gz
do
        echo "Processing $f file..."
        # Quote "$f" so paths containing spaces are handled correctly
        tabix -fh "$f" ch01:1000000-5000000 > "$f.my_gene.vcf"
done

Now I have 400+ new VCF files containing only my gene region. However, a number of these output files contain nothing but the header of the original file, meaning there were no variants in my region for that file, so they are not of interest to me. Firstly, is there a way to get tabix not to write an output file if there are no variants in the region? Alternatively, how can I run through my list of files and delete those that contain only a header?

Thanks, Kyle


I have 400+ new vcf (...)meaning that there were no variants in that file

Are you sure all the files use the same chromosome notation? chr01 != chr1 != 1 != 01

written 16 months ago by Pierre Lindenbaum

Yes, all files have the same notation

written 16 months ago by kyle
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote (16 months ago):

Use grep to test for a non-header line, followed by a logical AND (&&) operator, so that the output file is only written when the region contains at least one variant:

   (...)
  tabix -fh "$f" ch01:1000000-5000000 | grep -m1 -v '^#' > /dev/null && tabix -fh "$f" ch01:1000000-5000000 > "$f.my_gene.vcf"
  (...)
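
The command above runs tabix twice per file: once to test for a variant and once to write the output. A single-pass variant is sketched below. It is only a sketch: `keep_if_variants` is a hypothetical helper name (not part of tabix), and it assumes that every line not starting with `#` is a variant record, which also means a blank line in the input would count as a record.

```shell
# Sketch only: keep_if_variants is a hypothetical helper, not a tabix feature.
# It writes stdin to the file named in $1, then deletes that file again if it
# contains nothing but header lines (lines starting with '#').
keep_if_variants() {
    local out=$1
    cat > "$out"                        # store the piped output once
    if ! grep -q -v '^#' "$out"; then   # no non-header line found
        rm -f "$out"                    # header only: leave no output file
        return 1
    fi
}

# Intended use, with the paths and region from the question:
# for f in ~/Location/*.vcf.gz; do
#     tabix -fh "$f" ch01:1000000-5000000 | keep_if_variants "$f.my_gene.vcf" \
#         || echo "No variants in region for $f"
# done
```

The function returns a non-zero status when no file is kept, so the loop can report which inputs had no variants in the region.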

Thank you sir! Worked perfectly.

written 16 months ago by kyle
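
For the alternative raised in the question, cleaning up output files that already exist and contain only header lines, a minimal sketch might look like the following. `remove_header_only` is a hypothetical helper name, the directory argument is an assumption, and the `*.my_gene.vcf` pattern simply follows the output naming used in the question's loop.

```shell
# Sketch only: remove_header_only is a hypothetical helper.
# It deletes every *.my_gene.vcf file in the given directory that has no
# line other than '#' header lines, i.e. no variant records.
remove_header_only() {
    local dir=$1 f
    for f in "$dir"/*.my_gene.vcf; do
        [ -e "$f" ] || continue            # the glob matched nothing
        if ! grep -q -v '^#' "$f"; then    # only header lines present
            echo "Removing header-only file: $f"
            rm -- "$f"
        fi
    done
}

# Example call, using the directory from the question:
# remove_header_only ~/Location
```

Because this deletes files, it may be worth replacing `rm -- "$f"` with a plain `echo` on a first run to check which files would be removed.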