Question: Running two bedtools commands in series produces error "has non positional records"
Lina F (Boston, MA) wrote, 4 months ago:

Hello all,

I have three samples, generated Illumina PE reads for each of them, and then mapped the reads against the reference to produce three alignments. I called SNPs and now I would like to determine the subset of SNPs that are present in sample A and B but not in C. I ran bedtools as follows:

bedtools intersect -a A.vcf -b B.vcf > A_and_B.vcf
bedtools subtract -a A_and_B.vcf -b C.vcf

Unfortunately, I get the following error message:

ERROR: file A_and_B.vcf has non positional records, which are only valid for the groupBy tool.

I checked that all my files are tab-delimited and I also ran sort -k1 on the A_and_B.vcf file but that didn't affect the output.

My questions are:

  1. How do I string these two bedtools commands together?
  2. Is there a better way to do this?

Thanks for any suggestions!


You can try something like this:

 bedtools intersect -a A.vcf -b B.vcf | bedtools subtract -a stdin -b C.vcf
written 4 months ago by Nitin Narwade

Thank you for the suggestion! Unfortunately, this didn't work for me either :-(

written 4 months ago by Lina F
finswimmer (Germany) wrote, 4 months ago:

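There is a -header option for bedtools intersect, which includes the header in the output file. Without it, the intermediate file has no header, so the next bedtools command cannot tell that it is a VCF file.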

-header Print the header from the A file prior to results.
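
Applied to the two commands from the question, this could look like the sketch below. The toy VCF files, the temporary directory, the make_vcf helper, and the output name A_and_B_not_C.vcf are all made up for illustration so that the example runs on its own. It assumes bedtools is on your PATH; your real A.vcf, B.vcf and C.vcf would take the place of the toy inputs.

```shell
# Toy inputs (hypothetical data) so the sketch runs on its own.
cd "$(mktemp -d)"

# Hypothetical helper: write a minimal VCF with one SNP per given position.
make_vcf() {  # usage: make_vcf FILE POS...
    out=$1; shift
    printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$out"
    printf 'chr1\t%s\t.\tA\tG\t50\tPASS\t.\n' "$@" >> "$out"
}
make_vcf A.vcf 100 200 300
make_vcf B.vcf 100 200
make_vcf C.vcf 200

# -header prints the header of the -a file before the results, so the
# intermediate file can still be recognized as VCF by the next command.
bedtools intersect -header -a A.vcf -b B.vcf > A_and_B.vcf
bedtools subtract -a A_and_B.vcf -b C.vcf > A_and_B_not_C.vcf
```

The only change from the original commands is the -header flag on the first step; the second command is untouched.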

fin swimmer


Thanks for the tip, this worked and lets me avoid copying things manually!

written 4 months ago by Lina F

You should move that to an answer, because that is exactly the point.

written 4 months ago by ATpoint
Lina F (Boston, MA) wrote, 4 months ago:

I found a potential workaround. First, I upgraded bedtools from v2.26.0 to v2.27.1. This changed the error message that was reported and made it easier to interpret:

Error: unable to open file or unable to determine types for file A_and_B.vcf

- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the
  expected columns (e.g., cols 2 and 3 for BED).

I took another look at the output file and noticed that, while it is tab-delimited, it is missing the header (which specifies that the input is in VCF format). I manually copied the header of one of the input files to the intermediate output file. Now my second command, bedtools subtract, works.

This seems like a way forward, with the caveat being that the header I manually copied to the intermediate file is not entirely correct.
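
The manual header copy can also be scripted. The following is a minimal sketch under stated assumptions: the toy files stand in for the real A.vcf and the headerless A_and_B.vcf, and A_and_B.fixed.vcf is a made-up output name. As noted above, the borrowed header describes A.vcf rather than the intersection, so it is only approximately correct.

```shell
# Toy stand-ins (hypothetical data) for A.vcf and the headerless intersect output.
cd "$(mktemp -d)"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > A.vcf
printf 'chr1\t%s\t.\tA\tG\t50\tPASS\t.\n' 100 200 300 >> A.vcf
printf 'chr1\t%s\t.\tA\tG\t50\tPASS\t.\n' 100 200 > A_and_B.vcf

# A VCF file should start with a '##fileformat' line; here the first line is a record.
head -n 1 A_and_B.vcf

# Take every header line (starting with '#') from A.vcf, then append the records.
grep '^#' A.vcf > A_and_B.fixed.vcf
cat A_and_B.vcf >> A_and_B.fixed.vcf

# A_and_B.fixed.vcf can now be given to bedtools subtract as in the question:
#   bedtools subtract -a A_and_B.fixed.vcf -b C.vcf
```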

If there is a better way to do this I'd love to hear about it!
