Mysterious Warning From Hotspot V2
11.2 years ago

I'm using Hotspot v2 for the first time. The output includes the same warning repeated several hundred times. Can anyone with more experience of the algorithm comment on the seriousness of this warning: "Warning: only 0 of 50000 are mappable, increasing to 2..."

Here's the output for the first pass on one chromosome:

Hotspot pass 1...
Processing chrom: 1
Compute Hot Spots 
Completing HotSpot Identification
Filter Hot Spots 
Cluster Size
Warning: only 0 of 50000 are mappable, increasing to 2...
(repeat many times)
Warning: only 0 of 50000 are mappable, increasing to 2...
Completing Cluster Size
31 clusters scored using genome-wide density, avg. z = 4.07043; 114484 scored using local density, avg. z = nan
Chrom summary: 114515

Newer distributions of hotspot are available on GitHub.

11.2 years ago
Bob Thurman

This type of warning appears when hotspot encounters tags from your data in regions where it expects none. It arises while hotspot is determining the background expectation for the number of tags in a particular neighborhood around each of your tags. It derives this background expectation from the file defined by the token MAPPABLE10KBFILE in your runall.tokens.txt file. This file is supposed to give the number of uniquely mappable bases in each successive 10kb window of the genome.
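To make the idea of a per-window mappable count concrete, here is a minimal Python sketch. It is not hotspot's own code, and it does not read or write the real MAPPABLE10KBFILE format (which I am not reproducing here). The function name `mappable_bases_per_window` and the input layout (non-overlapping, half-open `(start, end)` intervals of uniquely mappable bases on one chromosome) are assumptions made only for illustration.

```python
def mappable_bases_per_window(intervals, chrom_length, window=10_000):
    """Count uniquely mappable bases in each successive window.

    intervals: iterable of (start, end) pairs, 0-based, half-open,
               assumed non-overlapping (hypothetical input layout).
    chrom_length: length of the chromosome in bases.
    Returns a list with one count per window.
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for start, end in intervals:
        # Clip the interval to the chromosome bounds.
        pos = max(start, 0)
        end = min(end, chrom_length)
        # Split the interval at window boundaries and add each piece
        # to the window it falls in.
        while pos < end:
            idx = pos // window
            chunk_end = min(end, (idx + 1) * window)
            counts[idx] += chunk_end - pos
            pos = chunk_end
    return counts


if __name__ == "__main__":
    # Toy 25 kb chromosome with two mappable stretches.
    print(mappable_bases_per_window([(0, 5000), (9000, 12000)], 25_000))
```

In this toy case the third window has a count of zero, which is the kind of window where no uniquely mapping tags would be expected.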

Depending on the length of your tags (the k-mer size), some regions of the genome are not uniquely represented by a tag of that length, so you would not expect any sequencing tags to come from those regions, and the count for a 10kb window containing any of those regions is going to be less than 10,000. There are certainly some 10kb windows where the number of uniquely mappable bases is zero. If for some reason you actually have tags in any of these regions, you are going to get the message above. This could happen if the tags file you use isn't filtered for uniquely mapping tags, or if you are using a 10kb mappable file that doesn't match your tag length. If there is a big mismatch between the 10kb file and your tags, the hotspot results could be misleading. If the mismatch is not so big, the hotspot results are probably still going to be fine.
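One rough way to gauge how big the mismatch is would be to count how many tags land in windows whose mappable count is zero. The Python sketch below illustrates that check under stated assumptions: the per-window counts are already loaded into a dict keyed by chromosome, tag positions are 0-based, and the name `tags_in_unmappable_windows` is hypothetical. It does not reproduce hotspot's internal neighborhood logic.

```python
from collections import Counter


def tags_in_unmappable_windows(window_counts, tag_positions, window=10_000):
    """Count tags that fall in windows with zero uniquely mappable bases.

    window_counts: dict mapping chromosome name -> list of mappable-base
                   counts, one per successive window (assumed layout).
    tag_positions: iterable of (chrom, pos) pairs, pos 0-based.
    Returns a Counter of offending tags per chromosome.
    """
    hits = Counter()
    for chrom, pos in tag_positions:
        counts = window_counts.get(chrom)
        if counts is None:
            continue  # chromosome absent from the mappability data
        idx = pos // window
        if idx < len(counts) and counts[idx] == 0:
            hits[chrom] += 1
    return hits


if __name__ == "__main__":
    counts = {"chr1": [6000, 0, 10000]}
    tags = [("chr1", 500), ("chr1", 10_500), ("chr1", 19_999), ("chr1", 20_000)]
    print(tags_in_unmappable_windows(counts, tags))
```

A small number of such tags relative to the total suggests a minor mismatch; a large fraction suggests the tags are unfiltered or the mappable file was built for a different tag length.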

The hotspot v2 distribution comes pre-loaded with these 10kb mappable files for a couple of k-mer sizes (36 and 76, maybe?) for the hg19 genome. I can easily generate a 10kb mappable file for a different k-mer size and a different genome. Send me a message if you are interested.

Good luck!


Bob, thanks for the explanation. I hadn't filtered for uniquely mapping reads and our tag length is 42bp. I will send you a message about the 10kb mappable file.

9.9 years ago

Did you solve the issue?

I get the same warning messages too. I am using Hotspot v4. I removed duplicates using Picard-tools/MarkDuplicates. I also created uniquely mappable tags using the script "enumerateUniquelyMappableSpace.pl", which is distributed with hotspot. I still get the warning message for some chromosomes...
