Question: Why, when splitting a file in two, do we add 1 to the number of lines before dividing by 2?
salamandra wrote (5 months ago):

In this tutorial there is a step that shuffles the lines of a file and then splits the file into two. I do not understand why, when splitting the file in two, they add 1 to the number of lines and then divide by two instead of simply dividing by two. Why is that?

That part of the code is:

nlines=$(samtools view merged.bam | wc -l ) # Number of reads in the BAM file
nlines=$(( (nlines + 1) / 2 )) # half that number
samtools view ${tmpDir}/${NAME1}_${NAME2}_merged.bam | shuf - | split -d -l ${nlines} - "${tmpDir}/${EXPT}" # This will shuffle the lines in the file and split it into two SAM files
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}00 | samtools view -bS - > ${outputDir}/${EXPT}00.bam
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}01 | samtools view -bS - > ${outputDir}/${EXPT}01.bam

You can try it for yourself: run the relevant part of the tutorial once with nlines=$(( (nlines + 1) / 2 )) and once with nlines=$(( (nlines ) / 2 )), and see if it makes a difference.

You may also open an issue at the repo with your question. But please first try for yourself and check for differences, if any.
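
One way to run that comparison without a BAM file is sketched below. It assumes GNU coreutils (for `split -d`) and uses `seq` output as a stand-in for the reads; `count_chunks` is a hypothetical helper, not part of the tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the tutorial): count how many pieces
# `split` produces for a stream of $1 lines when each piece holds $2 lines.
count_chunks() {
    local total=$1 per_chunk=$2 dir
    dir=$(mktemp -d)
    # seq stands in for `samtools view ... | shuf -`
    seq "$total" | split -d -l "$per_chunk" - "$dir/part"
    ls "$dir" | wc -l | tr -d ' '
    rm -rf "$dir"
}

n=3  # an odd line count, standing in for the number of reads
chunks_floor=$(count_chunks "$n" $(( n / 2 )))        # 1 line per piece
chunks_ceil=$(count_chunks "$n" $(( (n + 1) / 2 )))   # 2 lines per piece
echo "n / 2       -> ${chunks_floor} files"
echo "(n + 1) / 2 -> ${chunks_ceil} files"
```

With an odd count such as 3, the plain division should leave a third file behind, while the +1 version yields exactly two.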

Comment written 5 months ago by h.mon

Is there an odd number of lines in the BAM?

Comment written 5 months ago by jrj.healey
salamandra wrote (5 months ago):

I think I know now:

Bash arithmetic works on integers only, so a division keeps just the integer part of the result. If the original file has, for example, 3 lines and we want to split it in half, we want one file to have 2 lines and the other 1 line. But 3/2=1.5, which bash truncates to 1, so split would put 1 line in each file and produce 3 files. If we add 1 before dividing by 2, (3+1)/2=2, so split puts two lines into the first file and the remaining line into the second, as we wanted.

If the file has an even number of lines, adding 1 changes nothing: for example, if the original file has 4 lines, (4+1)/2=2.5, which bash truncates to 2, so the file is split into two files of 2 lines each. So it still does what we want for even numbers.
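
The arithmetic above can be checked directly in the shell. This is only a sketch: `half_up` is a made-up name for the tutorial's `(nlines + 1) / 2` expression, and the loop values are arbitrary.

```shell
#!/usr/bin/env bash
# half_up is a hypothetical name, not part of the tutorial.
# Integer division inside $(( )) truncates, so adding 1 first rounds n/2 up.
half_up() {
    echo $(( ($1 + 1) / 2 ))
}

for n in 3 4 5 6; do
    first=$(half_up "$n")      # lines that go into the first file
    second=$(( n - first ))    # lines left over for the second file
    echo "n=${n}: truncated half=$(( n / 2 )), first file=${first}, second file=${second}"
done
```

For odd n the first file ends up one line longer than the second; for even n the two are equal.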
