Bash: updating file by duplicating lines
2.5 years ago

I have a text file that contains some metrics about sequencing data, as output by the FastQC program and as shown in the image (FastQC output, per base sequence quality).

This data represents the quality of calling each base in a group of sequenced reads. The columns are: Base No. | Mean | Median | Lower Quartile | Upper Quartile | 10th Percentile | 90th Percentile

The problem is that after base No. 9, every two bases are represented by a single line, which is inconvenient for how I plan to manipulate this data.

Therefore, I need to update this file from the bash command line so that each line representing two bases is split into two identical lines that differ only in the base number. Example line before any edits:

16-17   36.65222632355253       39.0    36.0    40.0    30.0    41.0

After splitting:

16   36.65222632355253       39.0    36.0    40.0    30.0    41.0
17   36.65222632355253       39.0    36.0    40.0    30.0    41.0

and so on for all the lines representing 2 bases.

I believe this can be done with a for loop; however, I do not know how to write it in bash.

Also, I am not sure how to handle values in the first column that are written as number-number (e.g. 16-17); it seems that I cannot use them in the usual comparisons (=, > and <).

Thank you in advance.

bash linux command line for loops • 477 views

Edit: this is what the data in the file looks like.


Use these directions: How to add images to a Biostars post

2.5 years ago
GenoMax 107k

You should run FastQC again with the following option to get this data for each base:

 --nogroup       Disable grouping of bases for reads >50bp. All reports will
                    show data for every base in the read.

No manipulation needed for the files you have.
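
If re-running FastQC is not possible, the splitting asked about in the question can be sketched with awk instead of a bash for loop. This is only a minimal sketch: the function name `expand_ranges` is made up for illustration, and it assumes every grouped row starts with a first column of the form start-end (e.g. 16-17). The first column is treated as text and split on the hyphen, which avoids comparing "16-17" numerically. Note that the expanded rows simply repeat the values reported for the group, so they are not true per-base statistics the way `--nogroup` output is.

```shell
#!/usr/bin/env bash
# expand_ranges (hypothetical helper): reads rows on stdin and writes them to stdout.
# A row whose first column looks like "16-17" is printed once per base in that range;
# every other row (single bases, headers) is printed unchanged.
expand_ranges() {
  awk '
    match($0, /^[0-9]+-[0-9]+/) {
      range = substr($0, 1, RLENGTH)    # e.g. "16-17"
      rest  = substr($0, RLENGTH + 1)   # remainder of the row, original spacing kept
      split(range, bounds, "-")         # bounds[1] = start, bounds[2] = end
      for (base = bounds[1] + 0; base <= bounds[2] + 0; base++)
        print base rest
      next
    }
    { print }
  '
}
```

Usage would look like `expand_ranges < per_base_quality.txt > per_base_quality_expanded.txt`, where both file names are placeholders. Because the loop runs from the start to the end of the range, wider groups (e.g. 50-54) are expanded the same way as pairs.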
