I am working with some BAM files and trying to retrieve a FASTA sequence from them.
I have my workflow sorted out, but I have a question because the file sizes don't seem to match. My issue is as follows:
My original BAM file is 2 GB in size. When I sort it so that I can retrieve a sequence with the "samtools view" tool, like this:
samtools sort inputfile.bam -o inputfile.sort.bam
the resulting file is only 390 KB in size.
Is that normal? I have inspected a region of the genome using "less", and it does contain data. Still, I am unsure whether the sorted file is truncated, since I think it should be about 2 GB like the original file.
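For reference, this is a rough sketch of the checks I think could tell a truncated file from a complete one. It assumes samtools is on the PATH. The function names are my own and not part of samtools. The first helper only tests whether the file ends with the 28-byte empty BGZF block that the BAM specification defines as the end-of-file marker. The second compares the number of records in the two files.

```shell
# Hypothetical helper (not a samtools command): succeeds if the file ends
# with the 28-byte BGZF end-of-file marker defined in the SAM/BAM specification.
bam_has_eof_marker() {
    expected="1f8b08040000000000ff0600424302001b0003000000000000000000"
    # Dump the last 28 bytes as a continuous lowercase hex string.
    actual=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$actual" = "$expected" ]
}

# Hypothetical helper: prints the record count of two BAM files and
# succeeds only if the counts are equal. Requires samtools.
compare_bam_counts() {
    a=$(samtools view -c "$1") || return 2
    b=$(samtools view -c "$2") || return 2
    echo "$1: $a records"
    echo "$2: $b records"
    [ "$a" = "$b" ]
}
```

I would call these as "bam_has_eof_marker inputfile.sort.bam" and "compare_bam_counts inputfile.bam inputfile.sort.bam". As far as I know, "samtools quickcheck inputfile.sort.bam" does a similar end-of-file check. A missing marker or a lower record count in the sorted file would suggest the sort did not finish.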
Does anyone have an idea of what may be happening, or whether my data can be relied on?