How to pick the top 50 most abundant pathways?
14 months ago

Hello, I have generated a KEGG pathway abundance table (.tsv) that contains about 450 pathways and their abundance data. However, I need to pick the top 50 most abundant pathways to make a heatmap. Here is what my pathway abundance table looks like:

pathway_name sample1 sample2 sample3 sample4 sample5 sample6
A            250     1058    491     52      691     519
B            15      542     947     1165    847     1407


and so on....

Please send me a shell script, Python or R code that can help me select the top 50 most abundant pathways.


Without knowing anything about your system or what you want to achieve, the quickest way would be to calculate the average abundance across the samples for each pathway (e.g. with rowMeans() in R), rank the resulting values and select the top 50.
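
For illustration, here is a minimal Python sketch of that mean-and-rank approach, using only the standard library. It assumes the table is tab-separated, has a header row, keeps the pathway name in the first column and has numeric abundances in every other column. The function name top_pathways and the in-memory sample (the two rows from the question) are hypothetical and only there to make the example self-contained.

```python
import csv
import io
from statistics import mean


def top_pathways(tsv_text, n=50):
    """Return (header, rows) for the n pathways with the highest mean abundance.

    Assumes tab-separated text: a header line, then one pathway per line,
    with the pathway name first and one numeric abundance per sample after it.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    # Rank by the mean across samples (the same idea as rowMeans() in R),
    # highest first.
    ranked = sorted(
        rows,
        key=lambda row: mean(float(value) for value in row[1:]),
        reverse=True,
    )
    return header, ranked[:n]


# Hypothetical sample built from the two rows shown in the question.
sample = (
    "pathway_name\tsample1\tsample2\tsample3\tsample4\tsample5\tsample6\n"
    "A\t250\t1058\t491\t52\t691\t519\n"
    "B\t15\t542\t947\t1165\t847\t1407\n"
)

header, top = top_pathways(sample, n=1)
print("\t".join(header))
for row in top:
    print("\t".join(row))
```

For a real file, you could pass the result of open("your_table.tsv").read() (the file name is a placeholder) to top_pathways with n=50 and write the returned rows back out with csv.writer(..., delimiter="\t"). Ranking by the mean is just the choice suggested above; you could swap mean for sum or another summary in the key function if that suits your data better.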


OK, thanks, that makes sense.