Why, when splitting a file in two, do we add 1 to the number of lines before dividing by 2?
2.8 years ago
salamandra ▴ 390

In this tutorial there is a part where they shuffle the lines of a file and then split the file into two. I do not understand why, when splitting the file in two, they add 1 to the number of lines and then divide by two instead of simply dividing by two. Why is that?

That part of the code is:

nlines=$(samtools view merged.bam | wc -l ) # Number of reads in the BAM file
nlines=$(( (nlines + 1) / 2 )) # half that number
samtools view ${tmpDir}/${NAME1}_${NAME2}_merged.bam | shuf - | split -d -l ${nlines} - "${tmpDir}/${EXPT}" # This will shuffle the lines in the file and split it into two SAM files
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}00 | samtools view -bS - > ${outputDir}/${EXPT}00.bam
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}01 | samtools view -bS - > ${outputDir}/${EXPT}01.bam

You can try it yourself: run the relevant part of the tutorial once with nlines=$(( (nlines + 1) / 2 )) and once with nlines=$(( nlines / 2 )), and see whether it makes a difference.

You may also open an issue at the repo with your question. But please first try for yourself and check for differences, if any.
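
A minimal sketch of that comparison on toy input, so no BAM file is needed. It assumes GNU coreutils (seq, split with -d, mktemp); count_pieces is a hypothetical helper written for this illustration, not part of the tutorial. It feeds n numbered lines to split with a chunk size of (n + add) / 2 and reports how many output files appear.

```shell
#!/usr/bin/env bash
# count_pieces N ADD: split N toy lines with chunk size (N + ADD) / 2
# and print how many output files split created.
# (Hypothetical helper for illustration; not from the tutorial.)
count_pieces() {
    local n=$1 add=$2
    local chunk=$(( (n + add) / 2 ))   # integer division, fraction is dropped
    local dir
    dir=$(mktemp -d)                   # scratch directory for the pieces
    seq "$n" | split -d -l "$chunk" - "$dir/part"
    ls "$dir" | wc -l | tr -d ' '      # number of files produced
    rm -rf "$dir"
}

for n in 3 4 5; do
    echo "n=$n  without +1: $(count_pieces "$n" 0) files  with +1: $(count_pieces "$n" 1) files"
done
```

With an odd number of lines the plain division should leave a leftover third file, while the +1 version should give exactly two. With an even number both variants should behave the same.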


Is there an odd number of lines in the BAM?

2.8 years ago
salamandra ▴ 390

I think I know now:

Bash arithmetic keeps only the integer part of a division. Suppose the original file has 3 lines and we want to split it in half: we want one file to have 2 lines and the other 1 line. But 3/2 = 1.5, which bash truncates to 1, so split writes 1 line per file and the input ends up in 3 files. If we add 1 before dividing by 2, (3+1)/2 = 2, so split puts 2 lines into the first file and the remaining 1 line into the second, as we wanted.

If the file has an even number of lines, e.g. 4, then (4+1)/2 = 2.5, which bash truncates to 2, so the input is split into two files of 2 lines each. Adding 1 therefore still does what we want for even numbers.
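
The arithmetic above can be checked directly in the shell. This is only a sketch: half_up is a hypothetical name for the (n + 1) / 2 expression from the tutorial, which amounts to rounding n/2 up using integer arithmetic alone.

```shell
#!/usr/bin/env bash
# half_up N: ceiling of N/2 computed with integer arithmetic only.
# (Hypothetical helper name; the expression is the one from the tutorial.)
half_up() {
    echo $(( ($1 + 1) / 2 ))
}

for n in 3 4 7 8; do
    chunk=$(half_up "$n")
    # chunk lines go to the first file, the rest to the second
    echo "n=$n  n/2=$(( n / 2 ))  (n+1)/2=$chunk  second file gets $(( n - chunk )) lines"
done
```

For even n the two expressions agree. For odd n the +1 version is one larger, so the first file takes the extra line and no third file is needed.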
