Hi all,
I have OTU list with 73 samples (columns) and now I want to extract top 10 for each column including the respective annotation (genus level) provided at the end. This data shows a few only but actually i have hundreds of rows. The data looks like this (subset). Can you please help how to do this?
ID-a    ID-b    ID-c    ID-d    ID-e    ID-f    ID-g    Genus
3709    5664    54  11  5   2   0   Methylocystis
1518    399 3   0   0   0   0   Kineococcus
15215   4016    46  3   2   0   0   Curvibacter
5   0   1   1   1   1   0   Sulfuritalea
13098   4699    47  5   3   13  10  Bradyrhizobium
5565    6451    2101    2774    1513    203 120 Dyella
The output would read like, separate arrangements of OTU annotation for each column (highest on top). And annotation may come next to each column. This can be done manually by sort function in excel but I would appreciate to know a quick smart way of doing this. Thanks
While it may indeed be possible to do this automagically in excel it will require VB scripting or something similar. You will likely get pushback on this for trying to use excel to do bioinformatics. It is not a great idea to do so.
You may want to export the data from excel into a delimited (comma or tab) format. At that point you can use unix
sort(by column) to extract the info you need withcutorawk.