I have a question regarding my BAM file (single-cell RNA-seq data). The first 6 letters of each read name are the cell barcode and the next 6 are the UMI. The two are separated by an underscore, and a hash then separates them from the rest of the name. For example:
I would like to convert this information into the CB (CB:Z:ACAGTG) and UB (UB:Z:GAGAAG) BAM tags. And then the final read name becomes:
Can you please provide me with a one-liner awk command, or some kind of tool, with which I can do this?
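To make the intended transformation concrete, here is a minimal Python sketch of what I have in mind, working on SAM text lines rather than on the BAM directly. It assumes every read name looks like `BARCODE_UMI#rest`, and it also assumes the final read name should be just the part after the hash. The function name `retag_sam_line` and the example record (including the `read1` suffix) are made up for illustration.

```python
def retag_sam_line(line):
    """Move the barcode and UMI from the read name into CB/UB tags.

    Assumes the read name has the layout BARCODE_UMI#rest.
    """
    if line.startswith("@"):
        return line  # header lines pass through unchanged

    fields = line.rstrip("\n").split("\t")
    prefix, hash_sep, rest = fields[0].partition("#")
    barcode, underscore, umi = prefix.partition("_")

    # Leave the record untouched if the name does not match the assumed layout.
    if not hash_sep or not underscore or not rest:
        return line

    fields[0] = rest                    # assumption: keep only the part after '#'
    fields.append("CB:Z:" + barcode)    # cell barcode tag
    fields.append("UB:Z:" + umi)        # UMI tag
    return "\t".join(fields) + "\n"


# Hypothetical SAM record; only the barcode and UMI come from my data.
example = "ACAGTG_GAGAAG#read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\n"
print(retag_sam_line(example), end="")
```

If something like this were saved as a script that applies `retag_sam_line` to every line of standard input, I imagine it could sit between `samtools view -h in.bam` and `samtools view -b -o out.bam -`, but I have not tested that on a real file.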
Thanks a lot!