Dear All,
I have a relatively simple question, but I don't know how to solve it. I want to convert a SAM file to a BED file. The only catch is that I need the BED file to have an extra column telling me how many times the tag mapped to a genomic position. In the example below, the first column is the sequence, and the second column tells how many times the sequence is present in the file. For example, sequence GGGGGGGGG is present 6 times in the file, at different locations.
#ID        locations  chromosome  strand  start      end        count
AAAAAAAAA  1          chr12       +       105579297  105579321  1
AAAAAAAAB  1          chr8        +       95642182   95642206   1
GGGGGGGGG  6          chr13       +       66975161   66975185   1
GGGGGGGGG  6          chr13       -       72592620   72592644   1
GGGGGGGGG  6          chr14       -       46332831   46332855   1
GGGGGGGGG  6          chr19       -       32540873   32540897   1
GGGGGGGGG  6          chr1        -       113777719  113777743  1
GGGGGGGGG  6          chr2        +       70297183   70297207   1
Would you provide the patterns (e.g. 'GGGGGGGGG') as an argument to the program?
Dear Pierre Lindenbaum, yes, you are right. I think converting SAM to BED is a useful first step. After that, I need some kind of counter that counts how many times each #ID is present in the file and adds that number to the file.
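The two steps described here (SAM to BED-like rows, then a per-ID counter) could be sketched in Python roughly as follows. This is only a minimal sketch under several assumptions: the tag sequence is taken from the SAM QNAME field, unmapped reads are skipped, the end coordinate is derived from the CIGAR string, and the function names `sam_to_rows` and `add_location_counts` are made up for illustration. The trailing `count` column from the example table is left out because its meaning is not specified in the question.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_length(cigar):
    """Reference bases covered by an alignment (M, D, N, = and X consume reference)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def sam_to_rows(sam_lines):
    """Turn SAM text lines into (id, chromosome, strand, start, end) tuples.

    Assumption: the tag sequence is stored in QNAME (column 1).
    Coordinates are converted to BED style: 0-based start, exclusive end.
    """
    rows = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 0x4 or rname == "*" or cigar == "*":  # unmapped read
            continue
        strand = "-" if flag & 0x10 else "+"  # 0x10 = reverse strand
        start = pos - 1  # SAM POS is 1-based
        end = start + reference_length(cigar)
        rows.append((qname, rname, strand, start, end))
    return rows


def add_location_counts(rows):
    """Insert, as the second column, how many rows share each ID."""
    counts = Counter(row[0] for row in rows)
    return [(row[0], counts[row[0]]) + tuple(row[1:]) for row in rows]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python sam_to_counted_bed.py < input.sam
    for row in add_location_counts(sam_to_rows(sys.stdin)):
        print("\t".join(str(value) for value in row))
```

All rows are held in memory because the count for an ID is only known after the whole file has been read; for a very large SAM file, a two-pass approach (count first, then print) might be preferable.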