Dear All,
I have a relatively simple question, but I don't know how to solve it. I want to convert a SAM file to a BED file. The only catch is that I need the BED file to have an extra column telling me how many times each tag mapped to the genome. In the example below, the first column is the sequence and the second column gives how many times that sequence is present in the file. For example, sequence GGGGGGGGG is present 6 times in the file, at different locations.
#ID locations chromosome strand start end count
AAAAAAAAA 1 chr12 + 105579297 105579321 1
AAAAAAAAB 1 chr8 + 95642182 95642206 1
GGGGGGGGG 6 chr13 + 66975161 66975185 1
GGGGGGGGG 6 chr13 - 72592620 72592644 1
GGGGGGGGG 6 chr14 - 46332831 46332855 1
GGGGGGGGG 6 chr19 - 32540873 32540897 1
GGGGGGGGG 6 chr1 - 113777719 113777743 1
GGGGGGGGG 6 chr2 + 70297183 70297207 1
Would you provide the patterns (e.g. 'GGGGGGGGG') as an argument to the program?
Dear Pierre Lindenbaum, yes, you are right. I think converting SAM to BED is a useful first step. After that, I need a kind of counter that counts how many times each #ID is present in the file and adds that number to the file.
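For what it's worth, the two steps described here (SAM to BED, then a per-ID counter) could be sketched in Python roughly as follows. This is only a minimal sketch under several assumptions that the thread does not confirm: the #ID is taken from the SAM read name (QNAME), every mapped location of a tag appears as its own SAM line, coordinates follow the usual BED convention (0-based start, end computed from the CIGAR string), and the final "count" column of the example is left out because its meaning is not stated. The function names `reference_length` and `sam_to_counted_bed` are made up for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def reference_length(cigar):
    """Number of reference bases covered by an alignment, from its CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def sam_to_counted_bed(sam_lines):
    """Return tab-separated lines: ID, locations, chromosome, strand, start, end.

    Assumption: the ID is the SAM read name (QNAME), and 'locations' is the
    number of mapped SAM lines that share that read name.
    """
    records = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 4:
            continue  # FLAG bit 0x4: read is unmapped
        strand = "-" if flag & 16 else "+"  # FLAG bit 0x10: reverse strand
        start = pos - 1  # SAM POS is 1-based, BED start is 0-based
        end = start + reference_length(cigar)
        records.append((qname, rname, strand, start, end))

    # Second pass: count how many locations each ID has, then emit the rows.
    locations = Counter(record[0] for record in records)
    return [
        "\t".join(str(value) for value in (qname, locations[qname], chrom, strand, start, end))
        for qname, chrom, strand, start, end in records
    ]


if __name__ == "__main__":
    # Tiny made-up SAM input, only to show the shape of the output.
    example = [
        "@HD\tVN:1.0",
        "AAAAAAAAA\t0\tchr12\t105579298\t255\t24M\t*\t0\t0\t*\t*",
        "GGGGGGGGG\t0\tchr13\t66975162\t255\t24M\t*\t0\t0\t*\t*",
        "GGGGGGGGG\t16\tchr13\t72592621\t255\t24M\t*\t0\t0\t*\t*",
    ]
    for row in sam_to_counted_bed(example):
        print(row)
```

To run it on a real file, the lines can be read from an open file handle instead of a list (for example `with open("reads.sam") as handle: rows = sam_to_counted_bed(handle)`, where `reads.sam` is a placeholder name). The records are kept in memory so the counts can be attached in a second pass; for a very large file it may be better to count in one pass over the file and write the rows in a second pass.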