Question: bedtools closest (output file format)
3.1 years ago by
biostart290
Germany
biostart290 wrote:

Hello,

Is there a way to ask bedtools to return both regions in one line (not in two lines)? See below.

I just ran into a problem with bedtools closest. Here is the command:

bedtools closest -a RNA-seq.bed -b ChIP-seq-peaks.bed -d > output.bed

The RNA-seq file contains about 30 columns, starting with Chromosome, Start, End. The ChIP-seq-peaks file is in standard BED format. Both files are sorted.

The resulting file has two lines for each line of the original RNA-seq.bed file. The intersecting peak is added as a separate line. Is there a way to tell bedtools not to insert a line break?

Thank you!

rna-seq chip-seq bedtools • 1.3k views
3.1 years ago by
Alex Reynolds28k
Seattle, WA USA
Alex Reynolds28k wrote:

Another option:

$ closest-features --closest RNA-seq.bed ChIP-seq-peaks.bed > output.bed

Features are put onto one line.
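
By default, BEDOPS separates the reference element from its nearest element with a pipe character (|) rather than a tab. If downstream tools expect tabs throughout, the conversion could be sketched as below. The sample line and the demo.* file names are made up for illustration and are not real closest-features output; the sketch also assumes that no field contains a literal pipe.

```shell
# Hypothetical one-line sample mimicking the pipe-delimited layout:
# a 4-column reference element, a pipe, then a 4-column nearest element.
printf 'chr1\t100\t200\tgeneA|chr1\t250\t300\tpeak1\n' > demo.closest.txt

# Replace every pipe with a tab so both features become plain columns.
tr '|' '\t' < demo.closest.txt > demo.closest.tsv

# Count the tab-separated columns on each line (4 + 4 here).
awk -F'\t' '{ print NF }' demo.closest.tsv
```

closest-features should also accept a --delim option that changes the delimiter at the source; check closest-features --help for your installed version before relying on it.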

Depending on the state of inputs, it may be worthwhile to sort, e.g.:

$ sort-bed < RNA-seq.unknown-sort.bed > RNA-seq.bed
$ sort-bed < ChIP-seq-peaks.unknown-sort.bed > ChIP-seq-peaks.bed
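
If sort-bed is not at hand, a rough equivalent with plain sort could look like the sketch below, assuming tab-delimited input without header lines. The demo.* files are hypothetical and created only for this example.

```shell
# Hypothetical unsorted three-line BED file for demonstration.
printf 'chr2\t50\t80\nchr1\t300\t400\nchr1\t100\t200\n' > demo.unsorted.bed

# Sort by chromosome name (byte order), then numerically by start and end.
LC_ALL=C sort -k1,1 -k2,2n -k3,3n demo.unsorted.bed > demo.sorted.bed

# sort -c exits non-zero if the file is not in the requested order.
LC_ALL=C sort -c -k1,1 -k2,2n -k3,3n demo.sorted.bed && echo "sorted"
```

Setting LC_ALL=C keeps the chromosome ordering independent of the locale, so both input files end up ordered the same way.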

Also make sure your inputs don't have any weird line endings, e.g.:

$ cat -e RNA-seq.unknown-line-endings.bed | head
$ cat -e ChIP-seq-peaks.unknown-line-endings.bed | head

If cat -e shows anything other than a single dollar sign ($) at the end of each line, such as ^M$ from Windows-style carriage returns, use tools like sed or dos2unix to preprocess or convert the files as needed.
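
A stray carriage return at the end of each RNA-seq line would also explain the peak appearing on a separate line in some viewers. As a minimal sketch, here is one way to check for and strip carriage returns, using tr instead of sed or dos2unix. The demo.* files are hypothetical and created only for this example.

```shell
# Hypothetical BED file with Windows-style (CRLF) line endings.
printf 'chr1\t100\t200\r\nchr1\t300\t400\r\n' > demo.crlf.bed

# cat -e marks each line end with $; a carriage return shows up as ^M before it.
cat -e demo.crlf.bed | head

# Delete every carriage return and write a clean copy.
tr -d '\r' < demo.crlf.bed > demo.unix.bed

# Each line should now end with a plain $.
cat -e demo.unix.bed | head
```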
