Question: Running two bedtools commands in series produces error "has non positional records"
Lina F120 wrote:

Hello all,

I have three samples. I generated Illumina PE reads for each of them and mapped the reads against the reference to produce three alignments. I called SNPs, and now I would like to determine the subset of SNPs that are present in samples A and B but not in C. I ran bedtools as follows:

bedtools intersect -a A.vcf -b B.vcf > A_and_B.vcf
bedtools subtract -a A_and_B.vcf -b C.vcf

Unfortunately, I get the following error message:

ERROR: file A_and_B.vcf has non positional records, which are only valid for the groupBy tool.

I checked that all my files are tab-delimited and I also ran sort -k1 on the A_and_B.vcf file but that didn't affect the output.

My questions are:

  1. How do I string these two bedtools commands together?
  2. Is there a better way to do this?

Thanks for any suggestions!

Tags: vcf, bedtools
written 4 days ago by Lina F120 (modified 3 days ago)

You can try something like this:

 bedtools intersect -a A.vcf -b B.vcf | bedtools subtract -a stdin -b C.vcf
written 3 days ago by nitinnarwade1504170

Thank you for the suggestion! Unfortunately, this didn't work for me either :-(

written 3 days ago by Lina F120
finswimmer3.6k wrote:

There is a -header option for bedtools intersect, which includes the header in the output file.

-header Print the header from the A file prior to results.
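As a minimal sketch, the two commands from the question could be combined with this option as follows. The function name shared_not_in and the argument order are hypothetical and chosen only for illustration; the bedtools calls are the ones from the question, with -header added to the intersect step.

```shell
# Sketch: report SNPs present in both $1 and $2 but absent from $3.
# shared_not_in is a hypothetical helper name, not part of bedtools.
#   $1, $2: VCF files that must both contain the SNP
#   $3:     VCF file that must not contain it
#   $4:     path for the intermediate file (e.g. A_and_B.vcf)
shared_not_in() {
    # -header copies the header of the -a file into the output, so the
    # intermediate file still starts with the VCF header lines.
    bedtools intersect -header -a "$1" -b "$2" > "$4" &&
        # The second step only runs if the intersect step succeeded.
        bedtools subtract -a "$4" -b "$3"
}

# Usage with the file names from the question:
# shared_not_in A.vcf B.vcf C.vcf A_and_B.vcf > A_and_B_not_C.vcf
```

The final output is written to standard output, so it can be redirected to any file name.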

fin swimmer

written 3 days ago by finswimmer3.6k

Thanks for the tip, this worked and lets me avoid copying things manually!

written 3 days ago by Lina F120

You should move that to an answer, because that is exactly the point.

written 3 days ago by ATpoint5.4k
Lina F120 wrote:

I found a potential workaround. First, I upgraded bedtools from v2.26.0 to v2.27.1. This changed the error message that was reported and made it easier to interpret:

Error: unable to open file or unable to determine types for file A_and_B.vcf

- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the
  expected columns (e.g., cols 2 and 3 for BED).

I took another look at the output file and noticed that, while it is tab-delimited, it is missing the header (which specifies that the input is in VCF format). I manually copied the header of one of the input files to the intermediate output file. Now my second command, bedtools subtract, works.

This seems like a way forward, with the caveat being that the header I manually copied to the intermediate file is not entirely correct.
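For reference, the manual copy described above could be scripted roughly like this. It is only a sketch: the helper names has_vcf_header and prepend_vcf_header are hypothetical, and it assumes that header lines are exactly the lines starting with '#' and that the first line of a VCF file begins with ##fileformat=VCF.

```shell
# Succeeds if the first line of file $1 looks like a VCF header.
# has_vcf_header is a hypothetical helper name.
has_vcf_header() {
    head -n 1 "$1" | grep -q '^##fileformat=VCF'
}

# Print the header lines of $1 followed by the data records of $2.
# prepend_vcf_header is a hypothetical helper name.
#   $1: VCF file to copy the header from (e.g. A.vcf)
#   $2: file with headerless records (e.g. A_and_B.vcf)
prepend_vcf_header() {
    grep '^#' "$1"      # meta-information lines and the #CHROM column line
    grep -v '^#' "$2"   # data records only, in case $2 already has a header
}

# Usage with the file names from the question:
# has_vcf_header A_and_B.vcf || prepend_vcf_header A.vcf A_and_B.vcf > A_and_B.fixed.vcf
```

The same caveat applies as for the manual copy: the header is taken verbatim from one input file, so it may not describe the intermediate file accurately.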

If there is a better way to do this I'd love to hear about it!

written 3 days ago by Lina F120