How can I extract the most highly expressed genes from a series matrix?
4 months ago
abhisek061 ▴ 30

I have a gene count series matrix. I calculated the standard deviation for each gene to find which genes are expressed most, but I cannot extract only those genes, out of thousands of others, into a separate CSV file.

For reference, each gene has 7 samples. I want to extract all highly expressed genes along with their expression values for the different samples.

The dataset looks like this:

Geneid  s1   s2  s3  s4  Standard deviation
TEA001  100  45  86  46  50
TEA000  100  45  86  44  49
TEA001  100  47  86  48  49.1

WGCNA R RNA-seq • 512 views

Is the question technical (i.e. how would you go about extracting those genes) or biological (i.e. which are the highly expressed genes)?

For the technical part, have a look at the Linux utility awk (plenty of information is available online).


I have studied the awk command but, sorry, I can't do this with awk. Could you see the standard deviation column? I want to filter the series matrix based on this row. How could this be done?


I can't do this with awk

Unless there's some complex computation involved, you most certainly can.

Could you see the standard deviation column? I want to filter the series matrix based on this row

Do you wish to get a subset of rows (based on a column) or a subset of columns (based on a row)?


Let's say you want all genes, with all their samples, that have an SD value greater than 49 (assuming your file is tab-delimited):

cat yourmatrixfile | awk '{if($6>49) print}' > SD.greater49.txt

$6 represents the 6th column.

awk '{if($6>49) print}' yourmatrixfile > SD.greater49.txt will suffice.

awk '$6>49' yourmatrixfile
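
For anyone finding this later, here is a minimal sketch of the same filter that keeps the header row explicitly and writes comma-separated output, since the question asked for a CSV file. It assumes a tab-delimited input with the SD in the last column. The file names matrix.tsv and high_sd.csv and the variable min_sd are made up for illustration, and the rows are the three from the question.

```shell
# Recreate the example table from the question as a tab-delimited file.
# matrix.tsv is a hypothetical file name used only for this sketch.
printf 'Geneid\ts1\ts2\ts3\ts4\tStandard deviation\n' > matrix.tsv
printf 'TEA001\t100\t45\t86\t46\t50\n' >> matrix.tsv
printf 'TEA000\t100\t45\t86\t44\t49\n' >> matrix.tsv
printf 'TEA001\t100\t47\t86\t48\t49.1\n' >> matrix.tsv

# -F'\t'     : split input fields on tabs
# OFS=','    : join output fields with commas
# NR == 1    : always keep the header line
# $NF + 0    : last column (assumed to be the SD), forced to a number
# $1 = $1    : makes awk rebuild the line using OFS before printing
awk -F'\t' -v OFS=',' -v min_sd=49 \
    'NR == 1 || $NF + 0 > min_sd { $1 = $1; print }' matrix.tsv > high_sd.csv

cat high_sd.csv
```

With the example rows this keeps the header plus the two rows whose SD is above 49 (50 and 49.1). $NF is used instead of a hard-coded $6 because a matrix with 7 samples would presumably have its SD in a later column than the 4-sample example.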

Thank you, amazing people, for helping me. My problem is now solved with LibreOffice Calc.


That's a bad idea. You should be using tools with which you can replicate your analysis. Replication using GUI tools is not easy/straightforward, and automation is near impossible.
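
To make the point about replication concrete, one possible approach is to keep the filter in a small script so the exact column and cutoff are recorded and the step can be rerun on new data. This is only a sketch: the function name filter_by_sd, the file names, and the two demo rows are invented for illustration.

```shell
#!/bin/sh
# filter_by_sd: hypothetical helper, not part of any existing tool.
# Prints the header plus every row whose value in column SD_COLUMN
# is greater than CUTOFF.
# Usage: filter_by_sd INPUT_TSV SD_COLUMN CUTOFF
filter_by_sd() {
    awk -F'\t' -v col="$2" -v cutoff="$3" \
        'NR == 1 || $col + 0 > cutoff' "$1"
}

# Made-up demo input: one sample column and an SD column.
printf 'Geneid\ts1\tSD\nTEA001\t100\t50\nTEA000\t100\t49\n' > demo.tsv

# Same filter as in the answers above, with column and cutoff as arguments.
filter_by_sd demo.tsv 3 49 > demo.filtered.tsv
cat demo.filtered.tsv
```

Saving a script like this next to the data means the cutoff used is documented, which is hard to guarantee after filtering by hand in a spreadsheet.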
