Question: Why do we add 1 to the number of lines before dividing by 2 when splitting a file in two?
salamandra (10 weeks ago) wrote:

In this tutorial there is a part where they shuffle the lines of a file and then split the file in two. I do not understand why, when splitting the file, they add 1 to the number of lines and then divide by two, instead of simply dividing the number of lines by two. Why is that?

That part of the code is:

nlines=$(samtools view merged.bam | wc -l ) # Number of reads in the BAM file
nlines=$(( (nlines + 1) / 2 )) # half that number
# This will shuffle the lines in the file and split it into two SAM files
samtools view ${tmpDir}/${NAME1}_${NAME2}_merged.bam | shuf - | split -d -l ${nlines} - "${tmpDir}/${EXPT}"
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}00 | samtools view -bS - > ${outputDir}/${EXPT}00.bam
cat ${tmpDir}/${EXPT}_header.sam ${tmpDir}/${EXPT}01 | samtools view -bS - > ${outputDir}/${EXPT}01.bam

You can try it for yourself: run the relevant part of the tutorial once with nlines=$(( (nlines + 1) / 2 )) and once with nlines=$(( (nlines ) / 2 )), and see if it makes a difference.

You may also open an issue at the repo with your question. But please first try for yourself and check for differences, if any.
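
A quick way to try this without a BAM file is sketched below. It is only a toy stand-in: `seq` output replaces the SAM records, the file names `reads.txt`, `floor_` and `ceil_` are made up for the illustration, and GNU coreutils `split` (for the `-d` option) is assumed.

```shell
#!/usr/bin/env bash
# Toy stand-in for the SAM records: a file with an odd number of lines.
# All file names here are hypothetical, not taken from the tutorial.
workdir=$(mktemp -d)
seq 1 5 > "${workdir}/reads.txt"

nlines=$(wc -l < "${workdir}/reads.txt")

# Variant A: plain integer division, 5 / 2 gives 2 lines per chunk.
split -d -l $(( nlines / 2 )) "${workdir}/reads.txt" "${workdir}/floor_"

# Variant B: add 1 first, (5 + 1) / 2 gives 3 lines per chunk.
split -d -l $(( (nlines + 1) / 2 )) "${workdir}/reads.txt" "${workdir}/ceil_"

# Count how many chunk files each variant produced.
floor_files=$(ls "${workdir}"/floor_* | wc -l)
ceil_files=$(ls "${workdir}"/ceil_* | wc -l)
echo "without +1: ${floor_files} files"
echo "with +1:    ${ceil_files} files"

rm -r "${workdir}"
```

With 5 input lines, the variant without the +1 should leave three chunk files (2, 2 and 1 lines), while the variant with the +1 leaves two (3 and 2 lines).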

written 10 weeks ago by h.mon

Is there an odd number of lines in the BAM?

written 10 weeks ago by jrj.healey
salamandra (10 weeks ago) wrote:

I think I know now:

Bash arithmetic keeps only the integer part of a division. If the original file has, e.g., 3 lines and we want to split it in half, we want one file to have 2 lines and the other 1 line. But 3/2=1.5, which bash truncates to 1, so split would produce 3 files with one line each. If we add 1 before dividing by 2, we get (3+1)/2=2, so split puts two lines into the first file and the remaining line into the second, as we wanted.

If the file has an even number of lines, e.g. 4, then (4+1)/2=2.5, which bash truncates to 2, so the file is split into two files of 2 lines each. Adding 1 therefore still does what we want for even numbers.
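
The arithmetic above can be checked directly in the shell. This is a minimal sketch; the helper name `half_up` is made up for the illustration and does not appear in the tutorial.

```shell
#!/usr/bin/env bash
# half_up N: integer division of N by 2, rounded up instead of down.
# (Hypothetical helper; the tutorial inlines the same expression.)
half_up() {
  echo $(( ($1 + 1) / 2 ))
}

for n in 3 4 5 6; do
  chunk=$(half_up "${n}")
  # The first file gets ${chunk} lines, the second gets whatever is left.
  echo "n=${n}  n/2=$(( n / 2 ))  (n+1)/2=${chunk}  second file: $(( n - chunk )) lines"
done
```

For even n both expressions agree, and for odd n the +1 version is one larger, so the chunk size is always large enough for two files to hold every line.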
