Interpreting "Intersectbed" Output
12.1 years ago
Atom Smasher ▴ 20

Hello,

I have been trying to interpret the output produced by the "intersectBed" tool from the BEDTools suite. I could not find any documentation related to interpreting the output.

I used the basic intersectBed command, i.e.:

intersectBed -a file1.bed -b file2.bed

Here are some lines from the output:

chr8    579281  579420  .       375     .       4.88051 2.88614 2.16931 79
chr1    936133  936483  .       748     .       8.58236 15.39442 12.80621  181

The first 3 columns give the chromosome, the start and the stop coordinates, but I am unable to figure out what the rest of the numbers mean.

Any help on this would be much appreciated :)

Thank you.

AB


Cross-posted here.


What do file1.bed and file2.bed look like?


Hello Aaron, both input BED files just have the chromosome number, start and stop coordinates, i.e.:

chr start stop


Hello Aaron, both input BED files are just tab-delimited files containing the chromosome number, start and stop coordinates.


Try 'intersectBed -a file1.bed -b file2.bed -wa -wb'. It will report the original overlapping entries from both files.
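
To illustrate the layout that '-wa -wb' produces: each output line is the original entry from file A followed by the original entry from file B. Here is a minimal Python sketch, assuming both inputs are plain 3-column BED files as described above. The helper name `split_wa_wb` and the sample line are hypothetical and only for illustration.

```python
def split_wa_wb(line, a_cols=3):
    """Split one '-wa -wb' output line into its A entry and its B entry.

    a_cols is the number of columns in file A (assumed to be 3 here).
    """
    fields = line.rstrip("\n").split("\t")
    return fields[:a_cols], fields[a_cols:]


# Hypothetical output line: a 3-column A entry followed by a 3-column B entry.
a_entry, b_entry = split_wa_wb("chr1\t100\t200\tchr1\t150\t250")
print(a_entry)
print(b_entry)
```

If file A has more than three columns, pass the real column count as `a_cols`.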


Hello Aaron,

I'd like to know more about the score in the 5th column (ranging from 0 to 1000). Is that a score for the overlap between the two BED files?

I intend to rank-order the intersecting regions from the two BED files to find the regions with the most significant intersections.

Can I simply use the score to rank-order the regions? For instance, would all regions with a score of 1000 be more significant than those with a score below 1000?

12.1 years ago
Gjain 5.8k

Hi Atom,

It's just the BED format. The main point is that the start and end coordinates in the output are the common (overlapping) coordinates of the two BED files.
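
A small sketch of how such common coordinates can be computed for one pair of intervals on the same chromosome. It assumes half-open BED-style coordinates (start included, end excluded). The function name `overlap` is made up for illustration, and this is not BEDTools' actual implementation.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the (start, end) shared by two intervals, or None if they do not overlap."""
    start = max(a_start, b_start)  # the overlap begins at the later start
    end = min(a_end, b_end)        # and ends at the earlier end
    return (start, end) if start < end else None


print(overlap(100, 200, 150, 250))  # (150, 200)
print(overlap(100, 200, 200, 300))  # None: the intervals only touch
```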

So basically:

The first three required BED fields are:

  • chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
  • chromStart - The starting position of the feature in the chromosome or scaffold.
  • chromEnd - The ending position of the feature in the chromosome or scaffold.

The 9 additional optional BED fields are:

  • name - Defines the name of the BED line.
  • score - A score between 0 and 1000.
  • strand - Defines the strand - either '+' or '-'.
  • thickStart - The starting position at which the feature is drawn thickly.
  • thickEnd - The ending position at which the feature is drawn thickly.
  • itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0).
  • blockCount - The number of blocks (exons) in the BED line.
  • blockSizes - A comma-separated list of the block sizes.
  • blockStarts - A comma-separated list of block starts.
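
As a rough illustration, the field names above can be attached to one of the output lines from the question. This Python sketch assumes that only the first six columns of that line follow the standard BED layout. The last four values are decimals that do not look like thickStart/thickEnd positions, so they are kept as unlabeled extras, presumably carried over from the input file. The helper name `label_bed_line` is hypothetical.

```python
BED_FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount", "blockSizes", "blockStarts",
]


def label_bed_line(line, n_standard=6):
    """Label the first n_standard columns with BED field names; return the rest as extras."""
    fields = line.split()
    labeled = dict(zip(BED_FIELDS[:n_standard], fields))
    extras = fields[n_standard:]  # assumption: non-standard, file-specific columns
    return labeled, extras


labeled, extras = label_bed_line(
    "chr8    579281  579420  .       375     .       4.88051 2.88614 2.16931 79"
)
print(labeled)
print(extras)
```

Under that assumption, 375 in the fifth column is the score of the input feature rather than a measure of the overlap.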

Example: Here's an example of an annotation track that uses a complete BED definition:

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601

For more details, please look at this link.

You can download the BEDTools manual from this link.

I hope this helps.


Thanks Gjain !! :)
