Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] ROC analysis: show thresholds #3172

Merged
merged 8 commits into from
Sep 12, 2018

Conversation

matejklemen
Copy link
Contributor

Issue

Implements #3057.

Description of changes

When mouse is put over a point on ROC curve, the threshold that was used for calculation now appears. If threshold value is nan, the value does not get shown. Also, the displaying of thresholds is currently not implemented when showing individual curves.

My questions:

  • currently thresholds are displayed when mouse goes 'perfectly' over a point - should this be changed so that they get displayed if mouse is in some very small radius of the point?
  • what kind of a test do I write for this functionality (i. e. mouse events) - if any?

General comments are also welcome.

Includes
  • Code changes
  • Tests
  • Documentation

Copy link
Contributor

@ales-erjavec ales-erjavec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I enable "Show convex ROC curves" and hover over the graph I get:

-------------------------- AttributeError Exception ---------------------------
Traceback (most recent call last):
  File "/Users/aleserjavec/workspace/orange3/Orange/widgets/evaluate/owrocanalysis.py", line 614, in _on_mouse_moved
    pts = sp.pointsAt(act_pos)
AttributeError: 'PlotCurveItem' object has no attribute 'pointsAt'
-------------------------------------------------------------------------------

@@ -561,6 +565,7 @@ def _setup_plot(self):
graphics = curve.avg_threshold()
self.plot.addItem(graphics.curve_item)
self.plot.addItem(graphics.confint_item)
graphics.curve_item.scene().sigMouseMoved.connect(self._on_mouse_moved)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connect the signal once in the __init__ method:
self.plotview.scene().sigMouseMoved.connect(self._on_mouse_moved)


thresh = curve_pts.thresholds[idx_closest]
if not numpy.isnan(thresh):
item = pg.TextItem(text="{:.3f}".format(thresh), color=(0, 0, 0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not QToolTip.showText?

Copy link
Contributor Author

@matejklemen matejklemen Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly because I am new to QT and was just following a few resources online (I've changed it now).
I've fixed most of the things that you've mentioned and some other bugs.
Thank you for the helpful comments!

@codecov-io
Copy link

codecov-io commented Jul 31, 2018

Codecov Report

Merging #3172 into master will increase coverage by 0.01%.
The diff coverage is 90.47%.

@@            Coverage Diff             @@
##           master    #3172      +/-   ##
==========================================
+ Coverage   82.74%   82.76%   +0.01%     
==========================================
  Files         344      344              
  Lines       59304    59397      +93     
==========================================
+ Hits        49073    49158      +85     
- Misses      10231    10239       +8

def _on_mouse_moved(self, pos):
curves = [crv for crv in self.plot.curves if isinstance(crv, pg.PlotCurveItem)]
for i in range(len(self.selected_classifiers)):
sp = curves[i].childItems()[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes implicitly that the order of the curves in the self.plot.curves corresponds to the self.selected_classifiers. This seems fragile.

You can use:

curves = [(clsf_idx, self.plot_curves(self.target_index, clsf_idx), self.curve_data(self.target_index, clsf_idx))
          for clsf_idx in self.selected_classifiers]  # type: List[Tuple[int, plot_curves, ROCData]]

to retrieve triples of (classifier_index, plot_curves, curve_data). Use plot_curves similarly as curve_data in ROC averaging dispatch below, but use

plot_curves.merge().curve_item
plot_curves.avg_vertical().curve_item
plot_curves.avg_threshold().curve_item

to get the pg.PlotCurveItem instances

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad - should be fixed now. I've also fixed some other bugs:

  • if mouse was hovered over a point on curve for some classifier and then the same classifier's curve would be hidden and mouse hovered over the same point as before, the cache would not reset and therefore an invalid threshold would be shown,
  • if tooltip was not being shown and the cache point was "valid" (i. e. same point as previously shown), a tooltip with message "" would be created, which resulted in hiding a tooltip.

I have also encountered the following exception (just once), which I can not seem to be able to reproduce anymore:

--------------------------- RuntimeError Exception ----------------------------
Traceback (most recent call last):
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 162, in mouseMoveEvent
    self.sendHoverEvents(ev)
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 228, in sendHoverEvents
    items = self.itemsNearEvent(event, hoverable=True)
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 424, in itemsNearEvent
    if hoverable and not hasattr(item, 'hoverEvent'):
RuntimeError: wrapped C/C++ object of type ScatterPlotItem has been deleted

When browsing through old issues, I've found out that there's been some instances of similar issues (#1316), though I can't say I completely understand your fix for this.

@lanzagar lanzagar added this to the 3.16 milestone Aug 3, 2018
# Find closest point on curve and display it
idx_closest = numpy.argmin(
[numpy.linalg.norm(mouse_pt - [curve_pts.fpr[idx], curve_pts.tpr[idx]])
for idx in range(len(curve_pts.thresholds))])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop could be vectorized

roc_points = numpy.column_stack((curve_pts.fpr, curve_pts.tpr))
query_pt = numpy.array([[mouse_pt.x(), mouse_pt.y()]])
diff = roc_points - query_pt
idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1))

(... although a slightly moot point as the sp.pointsAt call above already runs through all the points in a python loop)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - fixed.
I've also made another minor change: now QCursor.pos() is directly used everywhere instead of using it directly in some cases and in the other cases saving it and then using the variable (just a consistency fix).

@ales-erjavec
Copy link
Contributor

currently thresholds are displayed when mouse goes 'perfectly' over a point - should this be changed so that they get displayed if mouse is in some very small radius of the point?

I think the hit radius is sufficient as it is. However it might be better to collect points from all the curves under the mouse and display a list of all thresholds with classifier names. Right now if two or more curves are close to each other it is not clear to which does the shown threshold belong to.

what kind of a test do I write for this functionality (i. e. mouse events) - if any?

I would use QTest.mouseMove targeting the QGraphicsView's viewport (plotview.viewport() in the OWROCAnalysis). Try to map one of the curve's points to scene coordinates (QGraphicsItem.mapToScene) and those to the viewport (QGraphicsWidget.mapFromScene).

@matejklemen
Copy link
Contributor Author

matejklemen commented Aug 10, 2018

I've improved the tooltips so that it is clear, which classifier a threshold belongs to. Also, the tooltip now shows thresholds for all overlapping points.

I'll see what I can do about the tests.

Edit: I'm having a bit of trouble creating the tests - so far this is what I've got:

# setup
data_in = Orange.data.Table("titanic")
res = Orange.evaluation.TestOnTrainingData(
    data=data_in,
    learners=[Orange.classification.KNNLearner(),
              Orange.classification.LogisticRegressionLearner()],
    store_data=True
)

self.send_signal(self.widget.Inputs.evaluation_results, res)
self.widget.roc_averaging = OWROCAnalysis.Merge
self.widget.target_index = 0
self.widget.selected_classifiers = [0, 1]
self.widget._replot()

# curve data for classifier 0
curve = self.widget.plot_curves(self.widget.target_index, 0)
curve_merge = curve.merge()

QTest.mouseMove(self.widget.plotview.viewport(), curve_merge.curve_item.mapToScene(0, 0))
# example assertion
self.assertTrue(QToolTip.isVisible())

The QTest.mouseMove(...) seems to work ok when manually testing - it throws the cursor to (0, 0) (in plot coordinates), but does not trigger the mouse callback in the test.

@ales-erjavec
Copy link
Contributor

The QTest.mouseMove(...) seems to work ok when manually testing - it throws the cursor to (0, 0) (in plot coordinates), but does not trigger the mouse callback in the test.

Maybe QTBUG-5232. I swear I saw this when writing https://github.com/biolab/orange3/pull/3014/files#diff-da80849ceff100f955c9ec05920ba089R138.

If I use that function and modify your test like this:

        ...
        # curve data for classifier 0
        curve = self.widget.plot_curves(self.widget.target_index, 0)
        curve_merge = curve.merge()

        view = self.widget.plotview
        item = curve_merge.curve_item  # type: pg.PlotCurveItem
        xs, ys = item.getData()
        x, y = xs[len(xs) // 2], ys[len(ys) // 2]
        pos = item.mapToScene(x, y)
        pos = view.mapFromScene(pos)
        mouseMove(view.viewport(), pos)
        # example assertion
        self.assertTrue(QToolTip.isVisible())

then the test passes. Can you rebase on current master and use the mouseMove function from test_combobox.py (move it to e.g. Orange/widgets/tests/utils.py and import it from there)

* fix crash when 'Show convex ROC curves' is enabled
* register mouse event only once
* use QToolTip instead of TextItem for displaying thresholds
* fix newly caused pylint warning
* removed dangerous retrieval of ROC curve data
* changed tooltip caching so that it works smoother
* fixed bug where cache value was not correct and would not reset properly
@matejklemen
Copy link
Contributor Author

Thank you, that does fix the issue.

I've stumbled upon another issue though - the points which I'm trying to move the mouse to are not being mapped correctly. For example, if I try to simulate a mouse move to (x=0.0, y=0.0) and (x=1.0, y=1.0), it produces the same threshold/tooltip.

I've noticed that these two points get mapped to very similar positions inside the test (QPoint(56, 1) and QPoint(57, 2) in my case), while during "manual" testing they get mapped far apart and correctly.
My guess is that this is happening because the widget, created inside the test, is tiny, and therefore everything is getting mapped to pretty much the same coordinates - but I'm not sure how to fix it.

@ales-erjavec
Copy link
Contributor

Try adding

vb = self.widget.plot.getViewBox()
vb.childTransform()  # Force pyqtgraph to update transforms

somewhere before mapping the points.

@matejklemen
Copy link
Contributor Author

Thanks once again. 😃
I've moved the mouseMove(...) function and added tests for tooltips - hopefully I didn't mess anything up.

@@ -336,6 +336,7 @@ def __init__(self):
self._plot_curves = {}
self._rocch = None
self._perf_line = None
self._tooltip_cache = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand why the tooltip cache is necessary. Can you add a comment explaining the reasoning for its use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning behind it was that instead of checking all points of a curve to find the closest one, it would first check the tooltip cache, which contains at most 1 threshold (point) per (selected) classifier.

It's hard to say if it's worth the additional 20-ish lines of code though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it already checks all points in pts = sp.pointsAt(act_pos)

Copy link
Contributor Author

@matejklemen matejklemen Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but it also does it again later (although it is vectorized): idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1)).

Considering your point, it does seem that the performance gain from tooltip cache might not be as big as I thought. Should I remove it and make the code clearer that way?

Edit: also, the tooltip cache provides a convenient way to test the contents of tooltips - the first alternative way I can think of would be to test the entire tooltip text, e.g. assertEqual("Tooltips:\n(#1) 0.400", QTooltip.text()).

@ales-erjavec
Copy link
Contributor

Should I remove it and make the code clearer that way?

It does not matter really, you can leave it.

@ales-erjavec ales-erjavec changed the title [RFC] ROC analysis: show thresholds [ENH] ROC analysis: show thresholds Sep 12, 2018
@ales-erjavec ales-erjavec merged commit eb9c1e0 into biolab:master Sep 12, 2018
@matejklemen matejklemen deleted the enh_roc_thresholds branch September 17, 2018 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants