22 months ago
arshad1292 ▴ 90
I have a Kraken 2 output (k2report.txt file) that looks like this:
What I want to do is extract only the number of reads at the genus level (i.e. G), for example for taxids 2316020, 2719313, 207244, 572511 and so on. I am not interested in D, O, R etc.
I have a large file with many hundreds of genera. Does anyone have a shell/Python script that I could use to extract only the genus abundance (number of reads) for sample1, sample2 and sample3?
I would really appreciate your help.
Do you mean you want to subset your matrix where lvl_type == "G"?
If yes, then you can use grep "\tG\t" input-file.
Yes, that's correct: I want a subset of the matrix that contains only "G".
I tried your script but it produced nothing...
Actually, I assumed tab (\t) as the field separator.
If it is the field separator and it is still not working, you should probably add an option to the grep command. Something like this.
Ok this one works. Thanks a lot!
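For anyone who prefers Python over grep, here is a minimal sketch of the same filter. It rests on a few assumptions that are not confirmed by the thread: the file is tab-separated, it has a header line containing a column named lvl_type (possibly prefixed with #), and any lines before that header are comments that can be skipped. The function names genus_rows and write_genus_table, and the output file name in the usage note, are made up for illustration.

```python
def genus_rows(lines, rank="G", rank_column="lvl_type"):
    """Return (header, rows), keeping only rows whose rank column equals `rank`.

    Assumes tab-separated input with a header line that names `rank_column`;
    anything before that header is treated as a comment and skipped.
    """
    header = None
    rank_idx = None
    kept = []
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\r\n").split("\t")]
        if header is None:
            # Header detection: first line that contains the rank column name,
            # ignoring a leading '#' on any field.
            names = [f.lstrip("#").strip() for f in fields]
            if rank_column in names:
                header = names
                rank_idx = names.index(rank_column)
            continue
        # Exact match only, so sub-ranks such as "G1" are not included.
        if len(fields) > rank_idx and fields[rank_idx] == rank:
            kept.append(fields)
    if header is None:
        raise ValueError(f"no header line with a '{rank_column}' column found")
    return header, kept


def write_genus_table(in_path, out_path, rank="G"):
    """Read the report at in_path and write only the `rank` rows to out_path."""
    with open(in_path) as handle:
        header, rows = genus_rows(handle, rank=rank)
    with open(out_path, "w") as out:
        out.write("\t".join(header) + "\n")
        for row in rows:
            out.write("\t".join(row) + "\n")
```

Calling write_genus_table("k2report.txt", "genus_only.tsv") should then give a table with the same columns but only the genus rows. If your rank column has a different name, pass it to genus_rows via rank_column.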