Question: Running two bedtools commands in series produces error "has non positional records"
Lina F150 wrote:

Hello all,

I have three samples, generated Illumina PE reads for each of them, and then mapped the reads against the reference to produce three alignments. I called SNPs and now I would like to determine the subset of SNPs that are present in sample A and B but not in C. I ran bedtools as follows:

bedtools intersect -a A.vcf -b B.vcf > A_and_B.vcf
bedtools subtract -a A_and_B.vcf -b C.vcf

Unfortunately, I get the following error message:

ERROR: file A_and_B.vcf has non positional records, which are only valid for the groupBy tool.

I checked that all my files are tab-delimited and I also ran sort -k1 on the A_and_B.vcf file but that didn't affect the output.

My questions are:

  1. How do I string these two bedtools commands together?
  2. Is there a better way to do this?

Thanks for any suggestions!

vcf bedtools • 211 views
modified 10 weeks ago • written 10 weeks ago by Lina F150

you can try something like this.

 bedtools intersect -a A.vcf -b B.vcf | bedtools subtract -a stdin -b C.vcf
written 10 weeks ago by nitinnarwade1504200

Thank you for the suggestion! Unfortunately, this didn't work for me either :-(

written 10 weeks ago by Lina F150
finswimmer5.4k wrote:

There is a -header option for bedtools intersect, which includes the header in the output file.

-header Print the header from the A file prior to results.
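
A minimal sketch of how this could look in practice, assuming the file names from the question. The toy VCF records, the temporary working directory, and the output name A_and_B_not_C.vcf are made up for illustration and are not from the thread. The block only runs bedtools if it is found on the PATH.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical toy VCF content: a two-line header plus one or two SNP records.
header='##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
snp1='chr1\t100\t.\tA\tG\t50\tPASS\t.\n'
snp2='chr1\t200\t.\tC\tT\t50\tPASS\t.\n'
printf "$header$snp1$snp2" > A.vcf
printf "$header$snp1$snp2" > B.vcf
printf "$header$snp2" > C.vcf

if command -v bedtools >/dev/null 2>&1; then
    # -header keeps the VCF header from A.vcf in the intermediate file,
    # so the next bedtools call can recognise it as VCF.
    bedtools intersect -header -a A.vcf -b B.vcf > A_and_B.vcf
    bedtools subtract -a A_and_B.vcf -b C.vcf > A_and_B_not_C.vcf

    # The same two steps as a pipe, without the intermediate file.
    bedtools intersect -header -a A.vcf -b B.vcf \
        | bedtools subtract -a stdin -b C.vcf
else
    echo "bedtools not found on PATH; skipping the bedtools steps" >&2
fi
```

With real data, only the two bedtools lines (or the piped variant) are needed; the rest just sets up small inputs to try the commands on.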

fin swimmer

written 10 weeks ago by finswimmer5.4k

Thanks for the tip, this worked and lets me avoid copying things manually!

written 10 weeks ago by Lina F150

You should move that to an answer, because that is exactly the point.

written 10 weeks ago by ATpoint7.5k
Lina F150 wrote:

I found a potential workaround. First, I upgraded bedtools from v2.26.0 to v2.27.1. This changed the error message that was reported and made it easier to interpret:

Error: unable to open file or unable to determine types for file A_and_B.vcf

- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the
  expected columns (e.g., cols 2 and 3 for BED).

I took another look at the output file and noticed that, while it is tab-delimited, it is missing the header (which specifies that the input is in VCF format). I manually copied the header of one of the input files to the intermediate output file. Now my second command, bedtools subtract, works.

This seems like a way forward, with the caveat being that the header I manually copied to the intermediate file is not entirely correct.
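
The manual header copy described above could be scripted roughly as follows. This is only a sketch: the toy record, the temporary directory, and the file name A_and_B.noheader.vcf are hypothetical, and the headerless file is simulated with grep rather than produced by bedtools.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical toy input standing in for A.vcf.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t50\tPASS\t.\n' > A.vcf

# Stand-in for a headerless intermediate file: the records of A.vcf
# with every header line (starting with '#') stripped.
grep -v '^#' A.vcf > A_and_B.noheader.vcf

# Prepend all header lines from A.vcf to the headerless records.
{ grep '^#' A.vcf; cat A_and_B.noheader.vcf; } > A_and_B.vcf
```

The copied header still describes A.vcf only, so the caveat about it not being entirely correct for the combined file applies here too.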

If there is a better way to do this I'd love to hear about it!

written 10 weeks ago by Lina F150