Question: bedtools intersect - something wrong with chromosome numbers >= 10?
1
gravatar for flyamer
5.2 years ago by
flyamer30
Russian Federation
flyamer30 wrote:

Hi!

I have an alignment (.bam) of reads to mm9 genome. I sorted it with samtools sort, so that later I can use -sorted key with bedtools. I also created a .bed-file with regions of interest, in which I want to count number of reads, that mapped to them. I tried this: converted .bam to .bed with bedtools bamtobed, and then intersected them counting number of hits (bedtools intersect -a regions_of_interest.bed -b alignment_sorted.bed -c -sorted  > Neg2H_counts.bedgraph). The problem is, it looks fine for all chromosomes with numbers from 0 to 9 (and X), but all counts for all regions of interest of chromosomes with higher number (chr10, chr11, etc) are 0. There is no biological reason for that, in fact the highest signal should be on chr11. What could be wrong here? I am fairly new to all these tools.


UPDATE
I tried to do the same intersection with bedmap and the result is identical... So there probably is something wrong with my files - what could it be?
I also tried sorting the alignment-derived bed-file in the same way, as I did with the files with regions of interest and it doesn't help.

ADD COMMENTlink modified 5.2 years ago • written 5.2 years ago by flyamer30
4
gravatar for Aaronquinlan
5.2 years ago by
Aaronquinlan11k
United States
Aaronquinlan11k wrote:

By default, the -sorted option assumes that both of your input files have chromosomes sorted lexicographically (chr1, chr10, chr11, etc.). I suspect one of your files is this way and one is not.  Alternatively, if you want to use a different order, you can use the -g option. See the docs for details: http://bedtools.readthedocs.org/en/latest/content/tools/intersect.html#g-define-an-alternate-chromosome-sort-order-via-a-genome-file

ADD COMMENTlink modified 5.2 years ago • written 5.2 years ago by Aaronquinlan11k

I think that it would return an error in that case, wouldn't it? I sorted my files, the alignment with samtools sort before converting to bed, and the file with region of interest with sort.

ADD REPLYlink written 5.2 years ago by flyamer30

In other words, I suggest the following:

sort -k1,1 -k2,2n regions_of_interest.bed > regions_of_interest.sorted.bed
sort -k1,1 -k2,2n alignment_sorted.bed > alignment_sorted.2.bed
bedtools intersect -a regions_of_interest.sorted.bed -b alignment_sorted.2.bed -c -sorted
ADD REPLYlink written 5.2 years ago by Aaronquinlan11k

Thank you! It seems that it doesn't expect the file to be sorted as chr1, chr2, chr3, but rather as chr1, chr10, chr11, etc. I was sorting them to get the first, logical order, but after sorting as you suggested it worked!

ADD REPLYlink written 5.2 years ago by flyamer30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 810 users visited in the last hour