[ENH] FDR: Calculate FDR using numpy #3625

VesnaT · 2019-02-25T08:44:07Z

Issue

The current implementation of the false discovery rate (FDR) uses Python lists and for loops and is therefore inefficient.

Description of changes

Implement FDR using numpy.

Includes

Code changes
Tests
Documentation

ajdapretnar · 2019-02-25T08:49:24Z

At first I thought this would be a great fix for the Text add-on, but then I realized Text implements its own FDR! 😮 I don't think that is optimal. Would you care to check that implementation as well and perhaps replace it with this one? https://github.com/biolab/orange3-text/blob/master/orangecontrib/text/stats.py

codecov · 2019-02-25T08:53:12Z

Codecov Report

Merging #3625 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3625      +/-   ##
==========================================
+ Coverage   84.22%   84.23%   +0.01%     
==========================================
  Files         370      370              
  Lines       67482    67468      -14     
==========================================
- Hits        56834    56832       -2     
+ Misses      10648    10636      -12

VesnaT · 2019-02-25T09:07:16Z

I think the two implementations were the same, so the Text add-on should probably use the new one in Orange core instead of implementing its own.

janezd · 2019-02-25T09:08:39Z

Orange/statistics/util.py


+    fdrs = (p_values * m / np.arange(1, len(p_values) + 1))[::-1]
+    fdrs = np.array(np.minimum.accumulate(fdrs)[::-1])
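
For context, the two added lines can be read as the Benjamini-Hochberg step-up adjustment: each sorted p-value is scaled by m / rank, and a running minimum taken from the largest rank down keeps the result monotone. Below is a minimal self-contained sketch of that idea. The name `fdr_sketch` and the way it sorts and restores the input order are assumptions made for illustration, not the code merged in this PR, and the `dependent` branch assumes the Benjamini-Yekutieli harmonic-sum scaling.

```python
import numpy as np


def fdr_sketch(p_values, dependent=False, m=None):
    """Hypothetical stand-in for illustration, not the Orange function."""
    # Guard mirrored from the diff in this PR: nothing to correct.
    if p_values is None or len(p_values) == 0 or (m is not None and m <= 0):
        return None

    p = np.asarray(p_values, dtype=float)
    if m is None:
        m = len(p)  # number of tests defaults to the number of p-values
    if dependent:
        # Assumed Benjamini-Yekutieli correction: scale m by 1 + 1/2 + ... + 1/m.
        m = m * np.sum(1.0 / np.arange(1, m + 1))

    order = np.argsort(p)  # indices that sort the p-values ascending
    ranked = p[order] * m / np.arange(1, len(p) + 1)
    # Running minimum from the largest rank down enforces monotonicity.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]

    fdrs = np.empty_like(ranked)
    fdrs[order] = ranked  # restore the original input order
    return fdrs
```

Called with `dependent=True` on the p-values from `test_FDR_dependent`, the sketch agrees with the expected values in that test to five decimal places.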


janezd · 2019-02-25T09:11:13Z

Orange/tests/test_statistics.py

+    def test_FDR_dependent(self):
+        p_values = np.array([0.0002, 0.0004, 0.00001, 0.0003, 0.0001])
+        np.testing.assert_almost_equal(
+            np.array([0.00076, 0.00091, 0.00011, 0.00086, 0.00057]),


Have you gotten these numbers independently, that is, from another source or calculated manually, not from the code itself? (I'm asking because I used to be a big sinner myself.)

VesnaT · 2019-02-25

I calculated those using the old function.
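
One way to obtain the expected values without going through the function under test is a plain-loop calculation. The sketch below assumes the dependent variant is the Benjamini-Yekutieli adjustment (m multiplied by 1 + 1/2 + ... + 1/m). `by_adjusted_by_hand` is a hypothetical helper written for this check, not part of Orange.

```python
def by_adjusted_by_hand(p_values):
    """Plain-loop Benjamini-Yekutieli adjustment, independent of the numpy code."""
    n = len(p_values)
    harmonic = sum(1 / k for k in range(1, n + 1))  # 1 + 1/2 + ... + 1/n
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = float("inf")
    # Walk from the largest p-value down, keeping the running minimum.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n * harmonic / rank)
        adjusted[i] = running_min
    return adjusted


p_values = [0.0002, 0.0004, 0.00001, 0.0003, 0.0001]
expected = [0.00076, 0.00091, 0.00011, 0.00086, 0.00057]
result = by_adjusted_by_hand(p_values)

# Compare at the five-decimal precision of the expected values.
for got, want in zip(result, expected):
    assert abs(got - want) < 1.5e-5
```

With five p-values the harmonic sum is about 2.2833, so the effective m is about 11.4167, and the loop reproduces the five expected values from the test at that precision.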

ajdapretnar · 2019-02-25T09:13:09Z

Agree.

janezd · 2019-02-25T09:20:09Z

Orange/statistics/util.py

-        ordered = is_sorted(p_values)
+    if p_values is None or len(p_values) == 0 or \
+            (m is not None and m <= 0):
+        return None


Cover this line in tests to get approval by codecov (and to make sure we don't forget these cases at any future refactoring).
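
A sketch of what such coverage could look like is shown here. To keep the snippet runnable without Orange installed, it tests `fdr_guard`, a hypothetical stand-in that reproduces only the early return from the diff above. The class and test names are also assumptions. In the real test module the calls would go to the FDR function itself.

```python
import unittest


def fdr_guard(p_values, m=None):
    """Stand-in that reproduces only the early return from the diff."""
    if p_values is None or len(p_values) == 0 or \
            (m is not None and m <= 0):
        return None
    return p_values  # placeholder for the real computation


class TestFDRDegenerateInputs(unittest.TestCase):
    def test_no_values(self):
        # Missing or empty input has nothing to correct.
        self.assertIsNone(fdr_guard(None))
        self.assertIsNone(fdr_guard([]))

    def test_non_positive_m(self):
        # A zero or negative number of tests is rejected.
        self.assertIsNone(fdr_guard([0.1, 0.2], m=0))
        self.assertIsNone(fdr_guard([0.1, 0.2], m=-1))

    def test_valid_input_passes_guard(self):
        self.assertIsNotNone(fdr_guard([0.1, 0.2], m=2))
```

Each branch of the condition gets its own assertion, so a later refactoring that drops one of the cases would fail a specific test.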

ajdapretnar · 2019-02-25T11:17:11Z

FDR in Text add-on replaced with this implementation: biolab/orange3-text#416

janezd reviewed Feb 25, 2019

VesnaT force-pushed the np_fdr branch from 58e2d06 to 7a3e0e9 · February 25, 2019 09:26

FDR: Calculate FDR using numpy

4ad9603

VesnaT force-pushed the np_fdr branch from 7a3e0e9 to 4ad9603 · February 25, 2019 09:36

janezd merged commit 69f65f8 into biolab:master Feb 25, 2019
