I am trying to check the positions in one file against the intervals in a second file to find out whether the features overlap. File a looks like this (typical VCF entries, many of them):
chr1    1161692 chr1uGROUPERuDELu0u832  TGCTCTTTCCAGAAACCCTCAACCCTGTACGGTCAGGAGGAAACATGGCACCTCCCCTCTGGG T   63
chr1    249174066 chr1uGROUPERuDELu0u832  TGCTCTTTCCAGAAACCCTCAACCCTGTACGGTCAGGAGGAAACATGGCACCTCCCCTCTGGG T   63
chr1    249175897 chr1uGROUPERuDELu0u832  TGCTCTTTCCAGAAACCCTCAACCCTGTACGGTCAGGAGGAAACATGGCACCTCCCCTCTGGG T   63
I have a file Pt that looks like this (tab-delimited):
chr1    249174065 249174067
chr1    249175897    249175899
I wrote:
    for line in a:
        line = line.strip().split()
        for row in masterlist:
            row = row.strip().split()
            w=[]
            if (line[0] == row[0]):
                f = range(int(row[1]),int(row[2]))
                w.append(line[1])
                for i in w:
                    i = int(i)
                    if i in f:
                        print line
                    else:
                        break
            else:
                break
There is a problem now.
Both of these entries in the Pt file should match, but the script only reports a match for the first entry in the Pt file. If the first entry does not match, there is no output at all. I want the script to output all matches.
Do you just want to extract the subset of the VCF file that falls within certain regions? You could just use vcf-query.
Yes, I do, but the number of these regions is quite high. Can I pass a file with these regions to vcf-query?
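On the Python side, the loop in the question stops early because of the two `break` statements: as soon as one Pt row fails to match, the inner loop is abandoned and the remaining Pt rows are never tested. If `masterlist` is an open file handle, it is also exhausted after the first VCF line, so later lines see no regions at all. Below is a minimal sketch of one way around both problems when there are many regions. It assumes the regions are half-open, `[start, end)`, as in the original `range(int(row[1]), int(row[2]))`, and the names `load_regions` and `filter_vcf` are made up for illustration. The regions are read once, merged per chromosome, and searched with `bisect`, so each VCF line costs a binary search instead of a scan over every region.

```python
import bisect
from collections import defaultdict


def load_regions(lines):
    """Build a per-chromosome index of merged, sorted [start, end) regions.

    Assumption: columns are chrom, start, end (whitespace-delimited) and the
    end coordinate is exclusive, matching range(start, end) in the question.
    """
    by_chrom = defaultdict(list)
    for row in lines:
        cols = row.split()
        if len(cols) < 3:
            continue  # skip blank or malformed rows
        by_chrom[cols[0]].append((int(cols[1]), int(cols[2])))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous region.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def filter_vcf(vcf_lines, index):
    """Yield every VCF line whose position falls inside any region."""
    for line in vcf_lines:
        cols = line.split()
        if line.startswith('#') or len(cols) < 2:
            continue  # skip header and blank lines
        if cols[0] not in index:
            continue
        starts, merged = index[cols[0]]
        pos = int(cols[1])
        # Last region whose start is <= pos; regions do not overlap after
        # merging, so it is the only one that can contain pos.
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            yield line.rstrip('\n')


# Possible usage (the file names 'Pt' and 'a.vcf' are placeholders):
#     with open('Pt') as pt:
#         index = load_regions(pt)
#     with open('a.vcf') as vcf:
#         for hit in filter_vcf(vcf, index):
#             print(hit)
```

With the two Pt rows and three VCF lines shown in the question, this should report the entries at 249174066 and 249175897 and skip the one at 1161692. If the end coordinate in Pt is meant to be inclusive, the comparison `pos < merged[i][1]` would need to become `<=`.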