Question: Merging bed files
maruthi0 wrote:

Hi,

I have two BED files (file 1 and file 2) that share some SNPs, along with their chromosome numbers and regions. File 2 contains a few SNP IDs (with chromosome number, start, and end sites) that are already in file 1.

I am completely new to this field. I have been told that I can use Bedtools to combine these files by running a command in UNIX, creating a single BED file with no repetitions.

May I please know what commands I should use to combine two bed files but with no common data set in the resulting bed file ?

I will eagerly wait for your kind reply.

Thank you,

Maruthi.

next-gen • 4.7k views
modified 3.8 years ago • written 3.8 years ago by maruthi0
Alex Reynolds28k wrote:

May I please know what commands I should use to combine two bed files but with no common data set in the resulting bed file ?

This seems like a different question. You can use the BEDOPS tool bedops with its --not-element-of option to remove elements from one BED file that overlap those in a second BED file:

$ bedops --not-element-of 1 first.bed second.bed > answer.bed

In this example, the file answer.bed contains only the elements of first.bed that do not overlap any element of second.bed by at least one base.

I recommend using BEDOPS sort-bed to sort BED files to use with BEDOPS tools. It runs faster than GNU sort and has fewer restrictions than other tools:

$ sort-bed unsorted.bed > sorted.bed
modified 3.8 years ago • written 3.8 years ago by Alex Reynolds28k

Thank you Alex. I will try your suggestion as well. Thank you.

written 3.8 years ago by maruthi0
tiago2112871.1k wrote:

As stated in the Bedtools documentation: http://bedtools.readthedocs.org/en/latest/content/tools/merge.html

"bedtools merge requires that you presort your data by chromosome and then by start position (e.g., sort -k1,1 -k2,2n in.bed > in.sorted.bed for BED files)."

Then you can run:

bedtools merge [OPTIONS] -i <BED/GFF/VCF/BAM>

After merging all the files, you will want to remove duplicates.

I found a discussion about this here: https://groups.google.com/forum/#!topic/bedtools-discuss/2o7oUgBwebw

As they suggest there, you must first define what counts as a duplicate:

[1] "Want you to remove entries where _every_ column is identical?" (easy) or

[2] "Want you to remove entries where the coordinates are the same but for example, the names are different?" (not easy)
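To make the two definitions concrete, here is a minimal sketch using sort on a made-up file. The file name dups.bed and its contents are hypothetical, and the sketch assumes tab-separated columns (chromosome, start, end, name).

```shell
# Hypothetical data: rows 1 and 2 are identical in every column;
# row 3 has the same coordinates but a different name.
printf 'chr1\t100\t101\trs1\nchr1\t100\t101\trs1\nchr1\t100\t101\trsX\n' > dups.bed

# Definition [1]: drop a row only when every column matches another row.
sort -u dups.bed > uniq_whole_line.bed

# Definition [2], simplest form: keep a single row per chromosome/start/end,
# whatever the name column says. Deciding WHICH name to keep is the hard part;
# here sort simply keeps one row from each group of equal keys.
sort -k1,1 -k2,2n -k3,3n -u dups.bed > uniq_coords.bed

# Compare how many rows survive under each definition.
wc -l uniq_whole_line.bed uniq_coords.bed
```

With this sample, the whole-line version keeps two rows (rs1 and rsX), while the coordinate-keyed version keeps only one.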

I suggest you read the entire discussion at the link, but it seems that in the bash shell you could use this:

sort -k1,1 -k2,2n -k3,3n -u <BED> > output_sorted_uniq.bed

(The -u option keeps only one line for each unique chromosome/start/end combination, which removes duplicates.)
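Applied to the original question, a minimal sketch might look like the following. The file names and SNP entries are invented for illustration, and it assumes tab-separated BED files in which a repeated SNP has the same chromosome, start, and end in both files.

```shell
# Hypothetical sample inputs: rs2 appears in both files.
printf 'chr1\t100\t101\trs1\nchr1\t200\t201\trs2\nchr2\t50\t51\trs3\n' > file1.bed
printf 'chr1\t200\t201\trs2\nchr2\t75\t76\trs4\n' > file2.bed

# Concatenate both files, then sort and de-duplicate in one pass:
#   -k1,1   sort on column 1 only (chromosome), as text
#   -k2,2n  then on column 2 only (start), numerically
#   -k3,3n  then on column 3 only (end), numerically
#   -u      output one line per unique combination of those keys
cat file1.bed file2.bed | sort -k1,1 -k2,2n -k3,3n -u > combined_uniq.bed

cat combined_uniq.bed
```

For these inputs the result has four rows, with rs2 listed once. Unlike bedtools merge, this does not fuse overlapping intervals; it only drops rows whose coordinates repeat.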

 

 

modified 3.8 years ago • written 3.8 years ago by tiago2112871.1k

Sorry, it seems I clicked in the wrong comment box. Tiago, may I please know what "u", "k1" and "k2,2n" in the command line mean? Thank you.

written 3.8 years ago by maruthi0
maruthi0 wrote:

Thank you, Tiago. I have created a BED file by combining the two, and it contains repeats. Now I have to remove the repeats/duplicates. I will go through the link you provided and see how far I can figure it out. Thank you.

written 3.8 years ago by maruthi0

You're welcome. Just one thing: when posting something that isn't an answer, use the small gray [add comment] button. Save the big green [Add answer] box for answers only.

Good luck.

modified 3.8 years ago • written 3.8 years ago by tiago2112871.1k

Thank you for letting me know about the add comment button. Tiago, may I know what u, k1 and k2 in the command line mean?

written 3.8 years ago by maruthi0