Hi all, could anyone suggest a tool to count unique fragments in a given BAM file? My data is exome-seq.
Any leads will be appreciated.
Thanks
Does your BAM file contain UMI info or is it pure alignment?
Here are a few general pointers to tools: Picard MarkDuplicates, sambamba markdup and (possibly) samblaster. If the data contains UMI tags, have a look at the UMI_tools package.
The BBtools package also has scripts for this: clumpify.sh, bbduk.sh, ...
These tools mainly make your data "unique" (they mark or remove duplicates); for the counting part you can use samtools or similar.
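As a rough sketch of the counting step, assuming duplicates have already been marked (e.g. by Picard MarkDuplicates or sambamba markdup, so that SAM flag 0x400 is set), the Python below counts one record per fragment from SAM text such as the output of samtools view. The function name is just an illustration, and counting each pair via read 1 is one possible convention, not the only one.

```python
# SAM flag bits, as defined in the SAM specification
PAIRED = 0x1
UNMAPPED = 0x4
READ2 = 0x80
SECONDARY = 0x100
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800


def count_nonduplicate_fragments(sam_lines):
    """Count mapped, primary fragments that are not flagged as duplicates.

    sam_lines: any iterable of SAM text lines (header lines are skipped).
    """
    count = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY | DUPLICATE):
            continue  # not a usable primary, non-duplicate alignment
        if (flag & PAIRED) and (flag & READ2):
            continue  # count each pair once, via read 1
        count += 1
    return count
```

To use it on a real file, you could pipe samtools view output into a small script that passes sys.stdin to this function.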
"a tool to count unique fragments from a given BAM file?"
This can be a bit tricky. You can imagine a situation where you may have two fragments that may have some overlap but they could still be considered unique since they don't have identical sequence.
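To make that concrete, here is a minimal sketch that treats a paired-end fragment as "unique" by its mapped coordinates (chromosome, start, end, strand of read 1) rather than by sequence. It assumes SAM text lines as input and relies on the POS, PNEXT and TLEN fields; the function names are hypothetical, and soft clipping is ignored, so this is a simplification and not how any of the tools above work internally.

```python
def fragment_key(fields):
    """Return (chrom, start, end, strand) for a read-1 record of a mapped pair."""
    flag = int(fields[1])
    chrom, pos, pnext, tlen = fields[2], int(fields[3]), int(fields[7]), int(fields[8])
    if tlen > 0:
        start = pos    # read 1 is the leftmost segment
    else:
        start = pnext  # the mate is the leftmost segment
    end = start + abs(tlen) - 1
    strand = "-" if flag & 0x10 else "+"
    return (chrom, start, end, strand)


def count_unique_fragments_by_position(sam_lines):
    """Count distinct fragment coordinates among paired, primary read-1 records."""
    seen = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if not (flag & 0x1) or not (flag & 0x40):
            continue  # keep only read 1 of paired reads
        if flag & (0x4 | 0x8 | 0x100 | 0x800):
            continue  # skip unmapped read/mate, secondary, supplementary
        if int(fields[8]) == 0:
            continue  # template length not available
        seen.add(fragment_key(fields))
    return len(seen)
```

With this definition, two fragments at chr1:100-249 and chr1:150-299 overlap but are counted separately, while a second pair at exactly chr1:100-249 on the same strand is counted only once.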
dedupebymapping.sh (https://bbmap.org/tools/dedupebymapping) may also be useful for working with your existing BAM file (assuming the data is already mapped).
You could use a tool like clumpify.sh from the BBMap suite (https://bbmap.org/tools/clumpify) to count reads that are perfectly identical (optionally allowing for some mismatches) and compress the file; the read count is added to the FASTQ header. Another potential option is dedupe.sh (https://bbmap.org/tools/dedupe).
comment or validate your previous questions, please.
Ankit:
please also take a minute to look at what Pierre Lindenbaum asked!
(That would be much appreciated ;-), thanks.)
Sure, I will look into it.
Thanks