Question: Intersect counts with fragment regions
0
gravatar for dzisis1986
16 months ago by
dzisis198620
dzisis198620 wrote:

I have a fragments file ( fragments_chr.bed ) like this :

 chr1   0   7506
chr1    7512    10943
chr1    10949   13169
chr1    13175   20064
chr1    20070   28272
chr1    28278   29959
chr1    29965   36598
chr1    36604   37512
chr1    37518   40359
chr1    40365   48795
chr1    48801   50660
chr1    50666   52359
chr1    52365   52850
chr1    52856   54447
chr1    54453   54521
chr1    54527   54816
chr1    54822   64009
chr1    64015   66453
chr1    66459   66724
chr1    66730   66943
chr1    66949   70664
chr1    70670   74343
chr1    74349   77084
chr1    77090   83123
chr1    83129   87901
chr1    87907   90967

and the results of my analysis ( counts.bed ) in counts like this

chr1    7506    7512    346
chr1    37537   37543   458
chr1    77108   77115   844
chr1    83121   83129   1629
chr1    83148   83155   1779

I would like to create a new file in order to collect all read counts into my fragments. By using bedtools intersect i ve noticed that i am losing some counts mainly those which are at the begging fragment of each chromosome. When i am running bedtools intersect like this

bedtools intersect -a fragments_chr.bed -b counts.bed -wa  -wb > intersect1_1_w4cseq_test.txt

I have this result :

chr1    37518   40359   chr1    37537   37543   458
chr1    77090   83123   chr1    77108   77115   844
chr1    77090   83123   chr1    83121   83129   1629
chr1    83129   87901   chr1    83148   83155   1779

In that case the count in chr1 7506 7512 346 is missing.

Any suggestion how to solve it or any other way to do it ?

Thank you

Dimitris

ADD COMMENTlink modified 16 months ago by Alex Reynolds29k • written 16 months ago by dzisis198620
0
gravatar for Yujin Kim
16 months ago by
Yujin Kim10
Nice, FRANCE
Yujin Kim10 wrote:

First of all, I believe you forgot the -wa and -wb options that create your final output. Second the chr1 7506 7512 346 coordonates don't intersect with any positions in your fragments_chr.bed file. So it won't appear in your output.

ADD COMMENTlink written 16 months ago by Yujin Kim10

YEs sorry for the mistake i forgot the -wa -wb i will add it . Ok how can we do it in order to appear ? Because even if this doesnt overlap we have 2 fragments in the fragments_chr.bed file with thoose coordinates.

chr1   0   7506
chr1    7512    10943

and it should be somewhere right ? Otherwise we are loosing counts ..Do you know any other way apart from intersect in order to add my counts in those ranges of chr positions ?

ADD REPLYlink modified 16 months ago • written 16 months ago by dzisis198620

bed files are zero based.@ OP

ADD REPLYlink written 16 months ago by cpad011212k

I see no way how this should work. As Yujin Kim says this two fragments have no overlap with your counts.bed.

Take care that the bed file uses half-open intervals, which means that the start position (column 2) is included but the end position (column 3) is not.

Let's have a look at the first line of fragments_chr.bed:

chr1   0   7506

This means, the fragment includes the position 0 - 7505, but not 7506.

And the first line of counts.bed:

chr1    7506    7512    346

This means the region start at 7506 (and so 1 position after your fragment) and includes all positions up to 7511.

fin swimmer

ADD REPLYlink modified 16 months ago • written 16 months ago by finswimmer12k

So this count doesnt belong anywhere ? it cant go either to chr1 7512 10943 or to chr1 0 7506. In this way we can not see with intersect the numbers of reads per fragment. Or the numbers of reads per fragment are only those which means some of your mapped reads are not included in the fragment.

ADD REPLYlink written 16 months ago by dzisis198620
1

in case of doubt, add -loj option to bedtools function above. It would print all the positions with and without overlap.

ADD REPLYlink written 16 months ago by cpad011212k
0
gravatar for Alex Reynolds
16 months ago by
Alex Reynolds29k
Seattle, WA USA
Alex Reynolds29k wrote:

Another approach via BEDOPS bedmap:

$ bedmap --echo --echo-map-id --delim '\t' fragments_chr.bed counts.bed > answer.bed

In cases where elements of fragments_chr.bed do not overlap counts.bed, there will be an empty string after the fragment interval. Add --skip-unmapped to this command to filter these results.

ADD COMMENTlink modified 16 months ago • written 16 months ago by Alex Reynolds29k
0
gravatar for Alex Reynolds
16 months ago by
Alex Reynolds29k
Seattle, WA USA
Alex Reynolds29k wrote:

If you want to merge the interval space in counts.bed with fragments_chr.bed, then you could do a symmetric difference operation on the two inputs, union the difference result with fragments_chr.bed, and map that set back to counts.bed to be guaranteed coverage:

$ bedops -s fragments_chr.bed counts.bed | bedops -u - fragments_chr.bed | bedmap --echo --echo-map-id --delim '\t' - counts.bed > answer.bed

You're adding disjoint intervals to the fragments_chr.bed file, in this case, but it will "fill in gaps" before mapping, if that is your aim.

It just depends on what you want to do, but Unix streams will help you easily join together a bunch of BEDOPS set operations.

ADD COMMENTlink modified 16 months ago • written 16 months ago by Alex Reynolds29k

Thank you Alex this probably helps a lot . I will check it and i will come back with the result !

ADD REPLYlink written 16 months ago by dzisis198620

What this is doing is to create a new file similar to counts.bed but at the end is pasted the fragments_chr.bed ,maybe i didnt understand exactly what you mean by symmetric difference operation on the 2 inputs.

ADD REPLYlink written 16 months ago by dzisis198620
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 796 users visited in the last hour