Hi! I am doing a project where I have mapped the reads from a sample to a reference genome, and now have a BAM file. I want to form clusters in which overlapping and nearby regions are grouped together. I know that bedtools (bedtools merge and bedtools cluster) can do this, but I was also wondering whether CAP3 (the consensus sequence tool) has any relation to this? As a more general question, what is a consensus sequence, and does it group merged regions together? I think the biggest issue is that I want to analyze each cluster independently afterwards, so I plan to separate them into individual BAM files. I don't believe bedtools has an option for this, so would I separate them out with a script I write myself?
Thanks!
If you look at the alignments, this is already how the aligner aligns the data. You could look at samtools depth and then select clusters based on that, perhaps. It is also possible to create a consensus from an aligned data file this way; see the post "Generating consensus sequence from bam file".
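To make the samtools depth idea a bit more concrete, here is a minimal sketch of how clusters could be picked from its output. It assumes the usual three tab-separated columns (reference name, 1-based position, depth) and input sorted by reference and position, which is what you get from a coordinate-sorted BAM. The function names and the min_depth and max_gap parameters are hypothetical, made up for illustration; max_gap plays a role similar to the distance option of bedtools merge, but this is not how bedtools itself is implemented.

```python
from typing import Iterable, List, Tuple

# A cluster is (reference name, start, end), 1-based and inclusive.
Cluster = Tuple[str, int, int]


def clusters_from_depth(
    lines: Iterable[str],
    min_depth: int = 1,   # hypothetical parameter: ignore positions below this depth
    max_gap: int = 0,     # hypothetical parameter: uncovered bases allowed inside a cluster
) -> List[Cluster]:
    """Group covered positions from `samtools depth`-style lines into clusters.

    Assumes each line is "ref<TAB>pos<TAB>depth" and that lines are sorted
    by reference and then by position.
    """
    clusters: List[Cluster] = []
    current = None  # [ref, start, end] of the cluster being built

    for line in lines:
        line = line.strip()
        if not line:
            continue
        ref, pos_text, depth_text = line.split("\t")[:3]
        pos, depth = int(pos_text), int(depth_text)
        if depth < min_depth:
            continue

        # Extend the open cluster if we are on the same reference and the
        # number of skipped bases since its end is at most max_gap.
        if current and current[0] == ref and pos - current[2] <= max_gap + 1:
            current[2] = pos
        else:
            if current:
                clusters.append((current[0], current[1], current[2]))
            current = [ref, pos, pos]

    if current:
        clusters.append((current[0], current[1], current[2]))
    return clusters


def region_strings(clusters: List[Cluster]) -> List[str]:
    """Format clusters as ref:start-end region strings."""
    return [f"{ref}:{start}-{end}" for ref, start, end in clusters]


if __name__ == "__main__":
    example = [
        "chr1\t100\t3",
        "chr1\t101\t4",
        "chr1\t102\t2",
        "chr1\t110\t5",
        "chr1\t111\t5",
        "chr2\t5\t1",
    ]
    for region in region_strings(clusters_from_depth(example, max_gap=10)):
        print(region)
```

Each region string can then be used to pull one cluster into its own file. With an indexed BAM (samtools index), a command of this form writes the reads overlapping a region to a new BAM; the file names here are placeholders:

    samtools view -b sample.bam chr1:100-111 > cluster_1.bam

A short shell or Python loop over the region strings would give one BAM per cluster.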