I have a question about the read length information obtained from BAM files. I converted the BAM files to BED files and kept the read sequences, so the records look something like this:
Chr 6791 7891 TCGAATATCAGGGTGCCCTCTGGCAAGGGCTTGCCCAGCGTACGTCAC -
Chr 6966 7304 ATTGATGAGGGATGTGGGTGGATGGATGATGATGGAAATATGATATGC +
I always assumed that columns 2 and 3 give the start and end positions of the read alignment, so column 3 - column 2 should equal the read length. However, when I count the characters in the DNA string (column 4) with the nchar() function in R, I get a different value.
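To make the mismatch concrete, here is a minimal sketch of the comparison for the two records pasted above. It is written in Python rather than R (len() plays the role of nchar()), and it assumes the columns are chromosome, start, end, sequence and strand, separated by whitespace. The function and variable names are only for illustration.

```python
# Records as pasted above.
# Assumed column order: chrom, start, end, sequence, strand.
records = [
    "Chr 6791 7891 TCGAATATCAGGGTGCCCTCTGGCAAGGGCTTGCCCAGCGTACGTCAC -",
    "Chr 6966 7304 ATTGATGAGGGATGTGGGTGGATGGATGATGATGGAAATATGATATGC +",
]


def compare_lengths(line):
    """Return (end - start, sequence length) for one whitespace-separated record."""
    chrom, start, end, seq, strand = line.split()
    return int(end) - int(start), len(seq)


# One (span, sequence length) pair per record.
results = [compare_lengths(record) for record in records]

for span, seq_len in results:
    print(f"end - start = {span}, sequence length = {seq_len}, equal = {span == seq_len}")
```

For both records, end - start comes out much larger than the number of characters in the sequence column.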
Can anyone explain what I am missing?