How can I extract information from output text files?
5 months ago
sata72 • 0

I have several output files and want to extract some information from them.

sample1.out
sample2.out
sample3.out
..
..

Each sample file contains this information:

  3259390 reads; of these:

   3234126 (99.22%) were paired; of these:
   ----
   292091 pairs aligned concordantly 0 times; of these:
   ----
  59571 pairs aligned 0 times concordantly or discordantly; of these:
  98.90% overall alignment rate

I want the information for all samples in the output, like this:

sample1     sample1     sample1            sample2     sample2     sample2
reads       paired      overall.alignment  reads       paired      overall.alignment
3259390     3234126     98.90              3259398     3234136     98.98
R linux • 618 views

You're going to have to write your own code for this, especially since your desired output format is not a straightforward one-row-per-sample table either.

If the line numbers are consistent and you need the first word from, say, the 1st, 3rd, and 8th lines, you can use awk to print the first word of every line whose record number (NR, or FNR when awk reads several files in one call) matches one of those three numbers. Then transpose the output so the entries are separated by tabs instead of newlines, and generate the header manually.
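
A minimal sketch of that line-number approach, assuming the file looks exactly like the abbreviated excerpt in the question. The line numbers 1, 2 and 7 fit only this demo file and are an assumption; check them against a real report before relying on them. The temporary directory and demo file exist just to make the example self-contained.

```shell
# Demo input: the abbreviated excerpt from the question (real reports have more lines).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample1.out" <<'EOF'
3259390 reads; of these:
  3234126 (99.22%) were paired; of these:
    ----
    292091 pairs aligned concordantly 0 times; of these:
    ----
    59571 pairs aligned 0 times concordantly or discordantly; of these:
98.90% overall alignment rate
EOF

# Print the first word of lines 1, 2 and 7 (assumed positions, see above),
# then let paste -s join the three lines into one tab-separated row.
values=$(awk 'FNR==1 || FNR==2 || FNR==7 {print $1}' "$tmpdir/sample1.out" | paste -s -)

# The header is written by hand, as suggested.
printf 'reads\tpaired\toverall.alignment\n%s\n' "$values"
```

Note that the last value keeps its trailing percent sign, because awk prints the first word unchanged.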


Looks like this is bowtie* output? You may be able to run MultiQC on the files to summarize them.


For the sake of future reports, when you have 100 or maybe 1000 samples: flip the columns and rows, and never mix numerical values with labels such as "reads" and "paired" in a column.

Sample names as row names, the specific values as columns.

Sorting by column values is trivial; sorting values in a row, but only from selected columns, is much less so. No need to use a shotgun for a foot self-amputation, methinks.
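
A rough sketch of the one-row-per-sample layout, matching lines by their text rather than by position. The `make_sample` helper and its three-line stand-in files are hypothetical and exist only to make the example runnable; the read counts and overall rates come from the question, while the paired percentage for sample2 is an assumed value.

```shell
tmpdir=$(mktemp -d)

# Hypothetical helper: writes a three-line stand-in for a real alignment report.
# Arguments: file, reads, paired, paired percent, overall alignment rate.
make_sample() {
  printf '%s reads; of these:\n  %s (%s%%) were paired; of these:\n%s%% overall alignment rate\n' \
    "$2" "$3" "$4" "$5" > "$1"
}
make_sample "$tmpdir/sample1.out" 3259390 3234126 99.22 98.90
make_sample "$tmpdir/sample2.out" 3259398 3234136 99.22 98.98

# One row per sample: remember the counts as they appear and print the row
# when the final "overall alignment rate" line of each file is reached.
table=$(cd "$tmpdir" && awk '
  BEGIN { OFS = "\t"; print "sample", "reads", "paired", "overall.alignment" }
  FNR == 1                 { reads = paired = "" }   # reset for each new file
  /reads; of these/        { reads = $1 }
  /\) were paired/         { paired = $1 }
  /overall alignment rate/ {
    name = FILENAME; sub(/\.out$/, "", name)          # sample name without extension
    sub(/%/, "", $1)                                  # keep the column purely numeric
    print name, reads, paired, $1
  }
' sample1.out sample2.out)

printf '%s\n' "$table"
```

With samples as rows, the result can be sorted or filtered on any column with standard tools, and adding more samples only adds rows.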


Good idea! thanks

5 months ago

learn awk.

use something like:

 awk '/reads; of these/ {print FILENAME,"reads",$1} /\) were paired/ {print FILENAME,"paired",$1; print FILENAME,"percent",$2}' sample.txt

to convert the file to a tabular format.


it works, thanks
