Question: Why does BedTools Map operation produce all dots as output?
Davide Chicco (Canada) wrote 4.9 years ago:

I am using the BedTools Map operation to map the DNase I signal of a cell type onto some chromosome regions, by computing the mean of the signal values in the fourth column of the bedGraph file.

The command I use is the following:

$ bedtools map -a inputFile1.bed -b inputFile2.bedgraph -c 4 -o mean 1> outputFile

In the output file, I get real values for most regions on chr1 through chr9, but strangely I find only dots for the regions on the other chromosomes:

chr1    66660   66810   0.849999999999999977796
chr1    87640   87790   0.0500000000000000027756
chr1    96520   96670   0
chr1    115600  115750  115.527272727272702468
chr1    118840  118990  3.10000000000000008882
chr1    125340  125490  0
chr1    136280  136430  .
chr1    136960  137110  .
chr1    235600  235750  39.0559633027522963289
chr1    237020  237170  1.59999999999999986677

....     ....     ....     ....

chr10   134874600       134874750       .
chr10   134876820       134876970       .
chr10   134877940       134878090       .
chr10   134878160       134878310       .
chr10   134879420       134879570       .
chr10   134897500       134897650       .
chr10   134907140       134907290       .
chr10   134915640       134915790       .
chr10   134939120       134939270       .
chr10   134939280       134939430       .
chr10   134940860       134941010       .

....     ....     ....     ....

Do you know why this happens?

Why don't I get values for chr10 through chr19 as well?


Sounds like a sorting problem. What are the outputs of:

cut -f 1 inputFile1.bed | uniq -c
cut -f 1 inputFile2.bedgraph | uniq -c

You may need to sort one of them.

Reply written 4.9 years ago by brentp

Output of the first:

  71485 chr1
  56771 chr2
  46916 chr3
  34197 chr4
  39869 chr5
  38117 chr6
  37795 chr7
  34040 chr8
  28966 chr9
  32310 chr10
  41346 chr11
  24212 chr12
  16072 chr13
   9869 chr14
   9376 chr15
  13554 chr16
  20892 chr17
   5369 chr18
  14898 chr19
  20618 chr20
  10253 chr21
  15227 chr22
  18857 chrX
   1214 chrY


Output of the second:


13545206 chr1
6057074 chr10
6891539 chr11
6386478 chr12
3187543 chr13
3284847 chr14
3873336 chr15
3957177 chr16
4492389 chr17
3169684 chr18
3796102 chr19
10769055 chr2
2781156 chr20
1850148 chr21
2028433 chr22
8502537 chr3
7832694 chr4
8125616 chr5
9465221 chr6
8314745 chr7
6241766 chr8
4687957 chr9
  14655 chrM
4504505 chrX


Any clue?

Reply written 4.9 years ago by Davide Chicco

Yes, a sorting problem. Sort both with `sort -k1,1 -k2,2n $bed`

Reply written 4.9 years ago by brentp

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 4.9 years ago:

See option '-null' http://bedtools.readthedocs.org/en/latest/content/tools/map.html

-null      The value to print if no overlaps are found for an A interval. Default: "."


Thanks Pierre, but that is not the case here. Overlaps are present, at least for some regions. It just puts dots for ALL of them, and I cannot understand why.

Reply written 4.9 years ago by Davide Chicco

Davide Chicco (Canada) wrote 4.9 years ago:

Thanks to Brentp, who suggested the command cut -f 1 fileName1.bed | uniq -c, I noticed that the two chromosome region files were sorted in different orders.

The first file was sorted in natural (numeric) chromosome order: chr1, chr2, chr3, ..., chr9, chr10, chr11, ...

The second file, by contrast, was sorted alphabetically (lexicographically): chr1, chr10, chr11, ..., chr19, chr2, chr20, chr21, ...
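The mismatch described above can be checked with a few lines of shell. This is only a minimal sketch: the two toy files (`a.bed`, `b.bedgraph`) and the variable names are hypothetical stand-ins for the real inputs, and the comparison is restricted to chromosomes present in both files, so that extras such as chrM or chrY do not trigger a false alarm.

```shell
# Hypothetical toy inputs standing in for inputFile1.bed and inputFile2.bedgraph.
tmpdir=$(mktemp -d)
printf 'chr1\t10\t20\nchr2\t10\t20\nchr10\t10\t20\n' > "$tmpdir/a.bed"
printf 'chr1\t0\t50\t1.5\nchr10\t0\t50\t2.0\nchr2\t0\t50\t0.5\n' > "$tmpdir/b.bedgraph"

# Order in which chromosome names appear in each file.
cut -f 1 "$tmpdir/a.bed" | uniq > "$tmpdir/a.chroms"
cut -f 1 "$tmpdir/b.bedgraph" | uniq > "$tmpdir/b.chroms"

# Keep only the chromosomes that occur in both files (exact, whole-line matches).
grep -Fxf "$tmpdir/b.chroms" "$tmpdir/a.chroms" > "$tmpdir/a.shared"
grep -Fxf "$tmpdir/a.chroms" "$tmpdir/b.chroms" > "$tmpdir/b.shared"

# If the shared chromosomes come in a different order, the files need re-sorting.
if cmp -s "$tmpdir/a.shared" "$tmpdir/b.shared"; then
  same_order=yes
else
  same_order=no
fi
echo "same chromosome order: $same_order"
rm -rf "$tmpdir"
```

With these toy files the first one lists chr1, chr2, chr10 and the second chr1, chr10, chr2, so the check reports a mismatch.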

This inconsistency between the two files broke the BedTools Map operation. To solve it, I simply sorted the first file alphabetically, too. I used the Linux command:  sort -k1,1 -k2,2n file > sortedFile
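As a rough sketch of the whole fix, the steps could look like the following. The toy files and variable names are hypothetical, and `LC_ALL=C` is an extra precaution that is not part of the original command: it is assumed here so that both files are sorted by plain byte order regardless of the locale. The bedtools call is skipped if bedtools is not installed.

```shell
# Hypothetical, deliberately unsorted toy inputs.
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr2\t10\t20\nchr10\t10\t20\nchr1\t10\t20\n' > "$tmpdir/a.bed"
printf 'chr10\t0\t50\t2.0\nchr1\t0\t50\t1.5\nchr2\t0\t50\t0.5\n' > "$tmpdir/b.bedgraph"

# Sort BOTH files the same way: chromosome name as text, then start position numerically.
# LC_ALL=C is an assumption added for this sketch (locale-independent byte order).
LC_ALL=C sort -k1,1 -k2,2n "$tmpdir/a.bed" > "$tmpdir/a.sorted.bed"
LC_ALL=C sort -k1,1 -k2,2n "$tmpdir/b.bedgraph" > "$tmpdir/b.sorted.bedgraph"

# Confirm that the chromosome order now agrees between the two files.
sorted_a_order=$(cut -f 1 "$tmpdir/a.sorted.bed" | uniq | tr '\n' ' ')
sorted_b_order=$(cut -f 1 "$tmpdir/b.sorted.bedgraph" | uniq | tr '\n' ' ')
first_start=$(head -n 1 "$tmpdir/a.sorted.bed" | cut -f 2)
echo "A order: $sorted_a_order"
echo "B order: $sorted_b_order"

# Re-run the original map command on the sorted files, if bedtools is available.
if command -v bedtools >/dev/null 2>&1; then
  bedtools map -a "$tmpdir/a.sorted.bed" -b "$tmpdir/b.sorted.bedgraph" -c 4 -o mean \
    || echo "bedtools map failed" >&2
fi
rm -rf "$tmpdir"
```

Which order is used matters less than using the same one for both files; sorting only one of them can leave the mismatch in place.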

Thanks, Pierre!


I am glad you were able to sort this out.  We are working on enhancements that will (in most cases, not all) detect inconsistent sorting rules and throw an error to alert the user. Hoping to have it out in the next release.

Reply written 4.9 years ago by Aaronquinlan