I have a couple of questions related to BED file analysis. After our overlapping-regions analysis, we have unlocalised regions in our BED file.
1) I know I can write a script that eliminates everything except chr1, chr2, chr3, chr4 and so on. Would that be a smart idea? (I am only interested in known regions.) Also, could this be caused by the lack of mappability of hg19? Should I try GRCh38?
2) If it is possible, how can I visualise these regions in IGV?
3) In a couple of the chr3, chr4 and chr5 regions, we have observed that an NNNNNNNNNNNNNN sequence is present. It is probably a sequencing error. I would like to know whether such repeating regions are kept in a published list, so that I can apply a filter that eliminates them and gives me the most confident results.
For example, assume the following regions are found to be repeating or NNNNNNNNNNNN, and there is a list of such intervals that I can remove from my BED file before I find my overlaps:
chr1 100 2500
chr1 5000 6200
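To make the idea concrete, here is a minimal sketch of the filtering step, assuming the exclusion list is an ordinary BED file with half-open coordinates. The function names `parse_bed` and `drop_excluded` and the rows in `my_regions` are hypothetical and only for illustration; the two excluded intervals are the ones from the example above.

```python
from collections import defaultdict


def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples; extra columns are ignored."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, comment, and header lines.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def drop_excluded(regions, excluded):
    """Return the regions that do not overlap any excluded interval."""
    by_chrom = defaultdict(list)
    for chrom, start, end in excluded:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in regions:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < ex_end and ex_start < end
            for ex_start, ex_end in by_chrom.get(chrom, ())
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Exclusion list from the example above.
excluded = parse_bed("chr1 100 2500\nchr1 5000 6200")

# Hypothetical regions, made up for this illustration.
my_regions = [
    ("chr1", 50, 90),
    ("chr1", 2400, 2600),
    ("chr1", 3000, 4000),
    ("chr2", 100, 2500),
]

kept = drop_excluded(my_regions, excluded)
print(kept)
```

This drops a region as soon as it touches an excluded interval by at least one base; trimming only the overlapping part would need a different approach.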
Thank you very much,
A sample from my overlaps:
chrKI270466.1 637 902 89.867126
chrKI270438.1 111985 112321 82.382042
chrKI270733.1 158634 158781 80.990242
chrKI270438.1 109559 109720 77.981293
chrGL000220.1 138579 138817 56.679230
chrGL000225.1 50760 50972 47.351170
chrGL000225.1 67365 67555 35.525284
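The script mentioned in question 1 could look roughly like the sketch below. It assumes UCSC-style names with a `chr` prefix, and that "known regions" means chr1 to chr22 plus chrX and chrY; whether chrX, chrY or chrM belong in that set is an assumption to adjust. The name `keep_canonical` and the two `chr1`/`chr10` rows are hypothetical, added only so the output is not empty.

```python
import re

# Assumption: canonical chromosomes are chr1..chr22, chrX and chrY (chrM is left out).
CANONICAL = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def keep_canonical(lines):
    """Keep only BED rows whose first column is a canonical chromosome name."""
    kept = []
    for line in lines:
        fields = line.split()
        if fields and CANONICAL.match(fields[0]):
            kept.append(line)
    return kept


sample = [
    "chrKI270466.1 637 902 89.867126",
    "chrGL000220.1 138579 138817 56.679230",
    "chr1 100 2500 12.5",    # hypothetical row
    "chr10 5000 6200 3.0",   # hypothetical row
]

filtered = keep_canonical(sample)
print(filtered)
```

Applied to rows like the sample above, this removes every `chrKI...` and `chrGL...` entry and leaves only the rows on numbered or sex chromosomes.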