How to perform low read counts removal from excel (count table) manually ?
0
0
Entering edit mode
2.5 years ago
sankadinesh ▴ 20

I want to highlight cells/remove rows, which contain less than 5 counts in all the samples. How to do this ?

https://www.dropbox.com/s/96pw1vldjar2biv/sample.xlsx?dl=0

Regards, Dinesh

sequence sequencing gene • 1.3k views
ADD COMMENT
0
Entering edit mode

In example file, OTU 8 and OTU 9 values are wrong between right and left data. To filter rows with less than 5 in any one of the sample (for the example excel sheet), do following

  1. Create a new column (I named it less_than_five)
  2. In the new column, select the cell corresponding to second row
  3. Type =IF(COUNTIF(B2:F2,"<5")<5,1,0)
  4. Copy this cell, in rest of the cells in newly created column
  5. This would populate 1 or 0 in that column
  6. All 0 rows are the OTUs with at all samples with <5 count
  7. You can either highlight 0 rows with color of your choice or remove them totally.

Example sheet has 5 samples (columns) and 10 OTUs (rows)

excel_output

ADD REPLY
0
Entering edit mode

Hi there, Thanks a lot. Your suggestion worked like a charm. Can you please tell me what does 5,1,0 mean in the formula '=IF(COUNTIF(B2:F2,"<=5")<5,1,0)'

Regards, Dinesh

ADD REPLY
0
Entering edit mode

=IF(COUNTIF(B2:F2,"<5")<5,1,0) is a combination of two formulas.

  1. IF condition works like this: IF (condition, value if condition is true, value if condition is not true). In above case, IF condition validates output from countif. If output from countif less than 5, then value is 1, if not, it not it's 0.

  2. countif counts number of cells with cell value less than 5. In this context, countif outputs number of samples (columns) with less than 5 reads, for that OTU (row).

Logic is for any given OTU, count of samples with less than 5 reads, should not be equal to number of samples (in this case 5 - used in IF condition). If they are equal, all the samples for that OTU, have less than 5 reads.

ADD REPLY
0
Entering edit mode

HI there, One quick question, I want to highlight OTUs that contain less than <5 count across the samples. If one OTU has more than 5 counts in one sample and 0 in all other samples, I want to keep that OTU, . I am attaching the modified excel again with original data. The formula given has worked for the sample dataset but not for large data. Is there any reason ? If so, any alternatives can you please suggest. Thanks for spending your valuable time.

https://www.dropbox.com/s/96pw1vldjar2biv/sample_modified.xlsx?dl=0

Regards, Dinesh

ADD REPLY
0
Entering edit mode

Samples are 42 and example set has 5. You have used the same formula (used for 5 sample data), for a 42 sample data. For 42 samples, formula would be: =IF(COUNTIF(B2:AQ2,"<5")<42,1,0).

I have added formula and conditionally formatted with color here: https://docs.google.com/spreadsheets/d/1EpmEyPopj7T2ndDmH0e-lze6PaPyXcnl/edit?usp=sharing&ouid=116988544834045239536&rtpof=true&sd=true. (download the file, do not open with google sheets)

ADD REPLY

Login before adding your answer.

Traffic: 2457 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6