Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Discretize: Simplify interface, add nicer binning #5919

Merged
merged 9 commits into from
Apr 25, 2022

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Mar 27, 2022

Issue

Resolves #5867.

To do:

  • Settings migration
  • Context settings handler
  • Tests
  • Consider renaming to "Binning"
  • Documentation
Description of changes

Screen Shot 2022-03-27 at 18 36 49

Includes
  • Code changes
  • Tests
  • Documentation

@janezd
Copy link
Contributor Author

janezd commented Mar 27, 2022

Hm, or even:

Screen Shot 2022-03-27 at 18 53 40

@markotoplak?

@ajdapretnar
Copy link
Contributor

I don't like the fact that the default setting has a variable type icon, which is somehow renamed to D. I think it is misleading. There's should be no icon, imo.

I like the redesign though. Perhaps the second one is better?

@janezd
Copy link
Contributor Author

janezd commented Mar 30, 2022

@ajdapretnar, I agree that the icon is misleading. But without any icon, the setting looks like a weird label. I think this looks OK?

Screen Shot 2022-03-30 at 18 15 00

@ajdapretnar
Copy link
Contributor

I like the star. I just would avoid visually similar icons to those we already have - so the star works perfectly.

@janezd
Copy link
Contributor Author

janezd commented Mar 30, 2022

Now for the fun part.

  1. I load Iris data. I set the default to something, say MDL. [The context includes no attributes, therefore matches anything.]

  2. I load Heart disease. Should default remain at MDL, because the settings I just had was matching? Or should it go back to default default?

  3. I set some specific discretization for some variable, and I set default to Remove.

  4. I change to Housing. Should default change to MDL, because the setting for Iris matches (on empty)? Or should it remain Remove, because that's what I have? Or should it reset to default default?

  5. I change default to Binning with 2 bins.

  6. I go back to Iris. Should default stay as it is, or change to MDL because that's what I've been doing with Iris?

These were simple questions. On Friday, we'll talk about handling different selections of variables from the same data set.

@janezd janezd added the needs discussion Core developers need to discuss the issue label Mar 30, 2022
@janezd
Copy link
Contributor Author

janezd commented Mar 30, 2022

Does anybody know a decent way to migrate a non-context setting into a context setting?

Default method (which is applied to all variables without specific setting) used to be a non-context setting. Now it is a context setting - as it should be, and also the implementation of the widget is such that this would be difficult to change. Yet, Orange migrates non-context and context settings separately; the method for migrating one of them, doesn't see others.

@markotoplak
Copy link
Member

@janezd, I think migrate settings sees contexts too (check the dict it is getting).

@janezd
Copy link
Contributor Author

janezd commented Mar 31, 2022

@janezd, I think migrate settings sees contexts too (check the dict it is getting).

Huh, thanks! I was looking at a wrong moment (in migration of default settings, which do not include contexts!) and concluded that they are somehow temporarily removed. I was already considering all kinds of hacks. Thanks!

Works like charm.

@janezd janezd force-pushed the discretization-binning branch 2 times, most recently from 2a7bad7 to ccbbe0a Compare March 31, 2022 20:57
@codecov
Copy link

codecov bot commented Mar 31, 2022

Codecov Report

Merging #5919 (4297c5a) into master (66e24de) will increase coverage by 0.02%.
The diff coverage is 97.06%.

@@            Coverage Diff             @@
##           master    #5919      +/-   ##
==========================================
+ Coverage   86.28%   86.31%   +0.02%     
==========================================
  Files         315      315              
  Lines       66899    67067     +168     
==========================================
+ Hits        57725    57890     +165     
- Misses       9174     9177       +3     

@janezd janezd force-pushed the discretization-binning branch 5 times, most recently from c30e823 to a465ca4 Compare April 1, 2022 20:12
@janezd janezd marked this pull request as ready for review April 1, 2022 20:13
@janezd
Copy link
Contributor Author

janezd commented Apr 1, 2022

@ajdapretnar, can I ask you to finish documentation? I have written the text (committed) and made screenshots (attached to this comment). Screenshots need to be stamped, saved in proper format and properly linked in the text. Thanks!

screenshots.zip

@ajdapretnar
Copy link
Contributor

Will do!

@janezd janezd removed the needs discussion Core developers need to discuss the issue label Apr 7, 2022
texts = set()
for key in varkeys:
dvar = self.discretized_vars.get(key)
fmt = self.data.domain[key[0]].repr_val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are at least two ways this can crash:

  1. self.data is None
    AttributeError: 'NoneType' object has no attribute 'domain'
    Reproduce: Add Discretize widget, open and click CC
  2. key is DefaultKey/None (see L744)
    TypeError: 'NoneType' object is not subscriptable
    Reproduce: Connect some data to Discretize widget, click CC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86b10c0 should now fix it. The button should be disabled when the user selects "Default settings", and Default settings should be selected on "Set data" (which selects it when there is no data).

@janezd
Copy link
Contributor Author

janezd commented Apr 10, 2022

@ajdapretnar, I don't see it here, I'm asking just to be sure: have you by any chance already updated the documentation and I force pushed over it? If so -- I'm really sorry for this negligence. Please write down the hash of your commit, pull my changes and add your commit by cherry picking it. If you haven't, it's OK.

@mw25
Copy link
Contributor

mw25 commented Apr 11, 2022

I would like to add a question or a wish here:

Why is the number of bins for spinboxes limited to 10? With fixed width (100?) and custom (limited by the length of the input text?) the number is much higher.

I think 10 bins is a bit too little. At least in my cases.

@ajdapretnar
Copy link
Contributor

@janezd Here it is: 621cf7a

@janezd
Copy link
Contributor Author

janezd commented Apr 19, 2022

@janezd Here it is: 621cf7a

@ajdapretnar can you backup these files, then pull my branch, cherry pick your commit and push? Again, backup these files first! (Otherwise, if you have those files, you can just recreate the commit instead of cherry picking it.)

@lanzagar, here's a puzzle for you. @ajdapretnar pushed this commit into my branch. I didn't pull it, but did some rebasing on my side and force-pushed, therefore removing her commit. I cannot cherry pick it, because I never got this commit and I don't know where to get it from -- fetching @ajdapretnar's fork doesn't do anything (it's not there, of course), nor can I get it from origin (it is not in any branch). Github, however, shows the commit, it just says it doesn't belong to any branch. Is there a way for me to get such a branchless, remoteless, forkless commit?

@lanzagar
Copy link
Contributor

I added her commit to you branch and pushed to your repo, so I either
a) made a bigger mess
b) helped
Only time will tell...

@janezd
Copy link
Contributor Author

janezd commented Apr 20, 2022

I added her commit to you branch and pushed to your repo, so I either

How did you get her commit?

You fixed it -- thanks.

@lanzagar
Copy link
Contributor

I don't know exactly what the trouble was for you, but I could cherry-pick the commit using the hash (after a standard pull from biolab's orange3 repo) so I just added it to you branch and pushed.
Maybe you never tried it, because you assumed you don't have the commit? The main orange3 repo stores some information from pull requests, so although @ajdapretnar pushed her commit to your repo (where you overwrote that branch) it was probably also floating around somewhere in biolab repository storage?

@ajdapretnar
Copy link
Contributor

ajdapretnar commented Apr 22, 2022

@janezd The macos failed tests seem not to be related to the PR (segfault and user warning not triggered on owfile). But ubuntu reports this, which does seem related:

ERROR: test_report_widgets_data (Orange.widgets.report.tests.test_report.TestReportWidgets)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/home/runner/work/orange3/orange3/.tox/orange-oldest/lib/python3.7/site-packages/Orange/widgets/report/tests/test_report.py", line 99, in test_report_widgets_data
   self._create_report(widgets, rep, data)
 File "/home/runner/work/orange3/orange3/.tox/orange-oldest/lib/python3.7/site-packages/Orange/widgets/report/tests/test_report.py", line 73, in _create_report
   w = self.create_widget(widget)
 File "/home/runner/work/orange3/orange3/.tox/orange-oldest/lib/python3.7/site-packages/orangewidget/tests/base.py", line 317, in create_widget
   widget.__init__()
 File "/home/runner/work/orange3/orange3/.tox/orange-oldest/lib/python3.7/site-packages/Orange/widgets/data/owdiscretize.py", line 525, in __init__
   self._create_buttons(box)
 File "/home/runner/work/orange3/orange3/.tox/orange-oldest/lib/python3.7/site-packages/Orange/widgets/data/owdiscretize.py", line 635, in _create_buttons
   self.button_group.idClicked.connect(self.update_hints)
AttributeError: 'QButtonGroup' object has no attribute 'idClicked'

Perhaps it's a dependency issue as the test says "Oldest dependencies"?

@ajdapretnar ajdapretnar merged commit bc52372 into biolab:master Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Discretization with nicer bins
5 participants