[ENH] Continuize: Specific options for variables #6181

janezd · 2022-10-22T16:18:22Z

Issue

Requires biolab/orange-widget-base#231.

Description of changes

~~This design kind of follows Discretize, but I'm not too happy with it.~~

~~Try selecting a bunch of variables of different kinds: both groups of radio buttons are enabled, yet clicking in one of them changes only categorical (or numeric) variables from the selection.~~

~~Allowing only single selected variable is bad from usability perspective: currently one can use filter to select variables with some common part of the name, select them all and set the option.~~

~~Maybe we should have two list views, one with categorical and one with numeric variables? And in case there are no categorical/numeric variables, the corresponding view and radio buttons are hidden?~~ Yes.

Minor point: I reduced the option for class to a single check box. It turns binary class into 0 and 1, and, in general, any class to 0..(n-1). The only alternative option is to use 0..1, but this can be achieved with normalization in a later widget.

Waiting for design confirmation before completing the below list.

Includes

Code changes
Tests
Report method
Migration of settings
Use get_column ~~and ListViewFilter instead of deprecated equivalents~~; ListViewFilter will be used after ListViewFilter: Add methods source_model orange-widget-base#223 is merged and released
Documentation

codecov · 2022-12-23T20:36:36Z

Codecov Report

Merging #6181 (fd78b8b) into master (1bec1ca) will increase coverage by 0.06%.
The diff coverage is 97.42%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6181      +/-   ##
==========================================
+ Coverage   87.44%   87.51%   +0.06%     
==========================================
  Files         317      317              
  Lines       68331    68548     +217     
==========================================
+ Hits        59754    59989     +235     
+ Misses       8577     8559      -18

markotoplak · 2023-02-09T12:09:13Z

Minor point: I reduced the option for class to a single check box. It turns binary class into 0 and 1, and, in general, any class to 0..(n-1). The only alternative option is to use 0..1, but this can be achieved with normalization in a later widget.

This PR also removes the option that can create one feature per value. Perhaps because no widgets could make use of resulting multiple classes? We can still do the same action by moving the class into features and then using Continuize on that feature and moving it back - so the omission is not a big deal.

Like in Discretize, we can specify all options for metas, but Discretize also allows all options on classes. Why not here too? Preventing users to do stupid things? The separate (single) target option is even more confusing to me.

I also saw that there is a blank line between attributes and metas (which does not exist in Discretize). I like that there is some visual separation, but the first time I saw it I thought it was a bug.

I think we should:

Put class var in the corresponding list of variables and remove the separate option below.
Remove the blank line between attributes and metas. Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

If you disagree, I can also merge this PR as-is.

janezd · 2023-02-09T16:43:29Z

Put class var in the corresponding list of variables and remove the separate option below.

If we do this, the default continuization method will also apply to class. Instead of preventing the users from doing something stupid, we'd do it ourselves, but behind their backs. :)

I think this wouldn't be OK. If you dislike a separate checkbox, I'd prefer removing it altogether. What do you say?

Remove the blank line between attributes and metas.

I haven't noticed it. This will happen always when DomainModel is used in list view, because list views do not properly paint separators. (Even for combos, Qt uses a weird hack - the model has to pass a literal string "separator" as Qt.AccessibleDescriptionRole, and painter will show a line.)

We can fix that in general. Could you, for a start just derive owcontinuize.ListViewSearch.Delegate from orangewidget.utils.combobox._ComboBoxListDelegate: do you like what it looks like?

I can of course also remove the separator from the model.

Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

I did not understand this. Where should we write about metas and classes?

I was thinking, though, that list views and combo boxes that use domain models could have a delegate that would print, in smaller letters "Attributes", "Meta Attributes", "Targets" above separators. Even more, we could allow writing arbitrary text above separators. Would you find this useful?

markotoplak · 2023-02-09T17:46:38Z

Put class var in the corresponding list of variables and remove the separate option below.

If we do this, the default continuization method will also apply to class. Instead of preventing the users from doing something stupid, we'd do it ourselves, but behind their backs. :)

I think this wouldn't be OK. If you dislike a separate checkbox, I'd prefer removing it altogether. What do you say?

I think we need to control whether to Continuize the class var or not, so if we do not mix it into the features, we need to keep the checkbox.

But doesn't Discretize, where the class goes into the features, suffer from a similar problem then? A possible solution would be not to automatically apply default options to the classes. They are special enough so that a user could consider them separately.

Remove the blank line between attributes and metas.

I haven't noticed it. This will happen always when DomainModel is used in list view, because list views do not properly paint separators.

Oh, I understand, it is OK. I compared this widget and Discretize and Discretize did not have it. With Discretize I have a related problem: you can not see which are features, which are metas, and which are classes.

Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

I did not understand this. Where should we write about metas and classes?

I though about just adding a string - either "target" or "meta" - after each variable name in the list view. This PR adds continuization type after the name anyway so it is likely a cheap change.

I was thinking, though, that list views and combo boxes that use domain models could have a delegate that would print, in smaller letters "Attributes", "Meta Attributes", "Targets" above separators. Even more, we could allow writing arbitrary text above separators. Would you find this useful?

That would surely be an improvement of separators in this context. I don't particularly like them; I'd rather see the type for each variable separately (what is there are more metas that fit in the current view?). If space is an issue, perhaps a text "META" or "TARGET" written over the feature type icon, diagonally, so that it does not take any more space? (I understand the proposal is probably ugly and too confusing, so just an idea.)

Oh, long ago we discussed changing feature type icons but then just forgot about it. Now they are so confusing (C for Categorical or Continuous?, N for Numeric or Nominal?) that they do not mean anything. Even colored squares would convey the same meaning. Then we could even easily write META or TARGET onto the squares too. ;)

janezd · 2023-02-09T18:05:42Z

I think we need to control whether to Continuize the class var or not, so if we do not mix it into the features, we need to keep the checkbox.

Then I'd prefer keeping the checkbox. If you find it confusing, we can rephrase it.

But doesn't Discretize, where the class goes into the features, suffer from a similar problem then? A possible solution would be not to automatically apply default options to the classes. They are special enough so that a user could consider them separately.

But then the default would apply to all in the list except for one...

I though about just adding a string - either "target" or "meta" - after each variable name in the list view.

This would indeed be cheap, but it would add a lot of visual noise at little value. Why do you care at all whether a variable is an attribute or a meta - in the context of this widget?

Attributes and metas are grouped (unless we would sort the variables, which we never do - and I think we shouldn't), which is why adding a label before the group is imho better than repeating the same label after every name in the first part, and another in the second part. For icons, it would be the same - one type in the first half, another in the second. Furthermore, most data would have just one single type of icon. Plus, I like this icons and find them useful. (I learned that red is numeric and green is categorical, actually.)

markotoplak · 2023-02-09T18:23:36Z

Plus, I like this icons and find them useful. (I learned that red is numeric and green is categorical, actually.)

Exactly what I was trying to point out about them. I'd absolutely keep the colors but the current lettering (C or N) only makes them confusing if you read them wrongly.

markotoplak · 2023-02-09T18:42:08Z

But then the default would apply to all in the list except for one...

Well, yes, almost. Now that you pointed it out, I do not think applying the default choice to metas makes any sense either.

Why do you care at all whether a variable is an attribute or a meta - in the context of this widget?

In the context of this widget I care a lot. I would not normally want to continuize metas or classes unless I'd explicitly decide for it. And even then, only certain metas would be continuized. So clearly showing what operation goes where would be a priority for me.

Therefore, metas and classes seem approximately equal to me in the context of this widget: none should be attacked by default (or even connected with default) and both could have the same interface. Separating between them in terms of how can I modify them seems very artificial. Showing what I am dealing with, a necessity.

I agree that added strings are additional visual noise, but are clearer. Better separators could be a possible solution with other compromises.

janezd · 2023-02-09T18:54:19Z

Yes, but I still think that using having a long list of A's in almost every list view would look stranger than a mixture of C's and N's. Also because I mostly don't care.

What about this?

Implementation is simple and widgets would mostly just need to pass an additional flag to DomainModel (or specify a different order) to indicate they want this. But I would only use it where relevant, though. I'll be on zoom tomorrow, but you can explain why do you care about distinguishing attributes from metas so much. To me, it's mostly visual clutter.

janezd · 2023-02-09T18:59:38Z

N.B. After seeing the last screenshot, in which I resized the widget, I added stretch below radio buttons, so they don't spread. :)

janezd · 2023-02-09T19:05:36Z

My comment (Yes, but...) did not refer to your last comment ( #6181 (comment)). Now I understand why you care about metas here. If so, we should have separate defaults for attributes and for metas? Doable, though I'd hate doing it. Or, as you said, we can make it clear that default applies only to attributes, by renaming it to "Default for ordinary attributes".

markotoplak · 2023-02-09T19:19:27Z

Now I understand why you care about metas here. If so, we should have separate defaults for attributes and for metas? Doable, though I'd hate doing it.

I agree, that would not work well. I saw firsthand that even this default selection was already confusing to some, imagine if we had add another.

Or, as you said, we can make it clear that default applies only to attributes, by renaming it to "Default for ordinary attributes".

As it is now, any kind of continuization is possible so I am only guessing what I think would be the least confusing. I guess applying operations default operations to metas is confusing. Therefore, the "Default for ordinary attributes" suggestion makes the most sense to me.

And if we decide to go that way, classes could naturally come into attribute lists, which solves my second source of confusion. :)

But these are only my guesses.

markotoplak · 2023-02-09T19:20:52Z

What about this?

Implementation is simple and widgets would mostly just need to pass an additional flag to DomainModel (or specify a different order) to indicate they want this. But I would only use it where relevant, though.

I like it and agree we should only use it where the distinction is informative.

janezd · 2023-02-10T19:44:19Z

@markotoplak, I'm awaiting your comments before fixing and adding tests.

janezd · 2023-02-10T20:04:26Z

You may also try it with and without c10d5e7. With this commit, labels do not flash it the user merely passes over the widget with a mouse.

We may also want to disable tooltips in this case.

markotoplak · 2023-02-17T14:04:36Z

I like the tooltip shown as a better-annotated list. The tooltip delay looks good.

The current version also applies the chosen preset to the metas by default. I like that the Targets are treated differently, and I would prefer the same treatment for metas. Because I know you already had a version doing so I imagine you have a good reason to go back to previous behaviour.

If metas get the preset applied as a default behaviour, we need to migrate settings carefully, because the previous version of the widget just ignored the metas and passed them through as they were. (I tried testing migration just to confirm what happens in master - I fully expected this not to work the same - but then I even got an error when opening the workflow.)

If separate behaviour for attributes and metas is too problematic despite the new tooltip-like descriptions (I think is is fine - if it works for classes...), I would prefer that this widget completely ignores metas rather than continuizing them as set in the preset.

markotoplak · 2023-03-24T14:39:42Z

@janezd, I added two commits, the first fixing some messages and the second fixing migration.

Migration was wrong, and some tests that actually opened old workflows were failing. The problem was that the old settings described some index into a list that did not follow the order from Continuizer.MultinomialTreatment. Therefore I added lists to map old indices to new ones.

Please double-check if what I did makes sense. If you agree, just merge.

janezd · 2023-03-26T18:20:18Z

Thanks, @markotoplak.

I would merge this after we release Slovenian Orange (tomorrow, hopefully); not so much because of possible bugs but because it could requires new translations. And possibly also fixes in tests, like those is #6373 or biolab/orange-translations#60.

janezd · 2023-03-30T08:41:22Z

@markotoplak, I rebased to resolve conflicts, but forgot to fetch your commits before doing that. Can you please reapply them?

Interestingly, tests on old workflows failed because of this(?), but only on PyQt6?!

janezd force-pushed the continuize-per-variable branch 3 times, most recently from fdb3c6a to b29d6b0 Compare October 24, 2022 18:22

janezd self-assigned this Dec 23, 2022

janezd changed the title ~~[RFC] Continuize: Specific options for variables~~ Continuize: Specific options for variables Dec 23, 2022

janezd force-pushed the continuize-per-variable branch from b29d6b0 to 3b267e4 Compare December 23, 2022 20:20

janezd marked this pull request as ready for review December 23, 2022 20:21

janezd removed their assignment Dec 23, 2022

janezd force-pushed the continuize-per-variable branch from 3b267e4 to 1da40e0 Compare December 23, 2022 22:20

janezd assigned markotoplak Jan 6, 2023

janezd force-pushed the continuize-per-variable branch from 1da40e0 to fdaea78 Compare February 9, 2023 18:57

janezd force-pushed the continuize-per-variable branch 3 times, most recently from 3592756 to 3834f8e Compare February 12, 2023 21:45

janezd force-pushed the continuize-per-variable branch from d8c0033 to e66233d Compare February 18, 2023 16:45

janezd force-pushed the continuize-per-variable branch 5 times, most recently from 625b9cf to 27df8e7 Compare March 23, 2023 22:32

janezd added 16 commits March 30, 2023 09:50

Identity, Indicate: Add __eq__ and __hash__

b27265f

Continuize: Add specific options for variables

f28fea6

Continuize: Split views for numeric and categorical

22cb74a

DomainModel: Add stricter type checking

96f7def

Continuize: Refactoring, tooltips etc

6a9c4ca

Continuize: PyLint

3f00f2e

Continuize: Tets and fixes; migration

d736110

Continuize: Update documentation

9259297

Continuize: Send report

a8e6b94

Continuize: Split lists into sections, add class vars

386de84

DomainModel: Support LabelledSeparators

0728382

Continuization: Add Reset button

50e25de

Continuization: Delay before showing default behavior

402e2b8

Fix tests for changes, and changes for tests

b74bdc7

Continuize: Minor fixes

6d757e5

Continuize: Tests and small fixes

55dfd73

janezd force-pushed the continuize-per-variable branch from aea2291 to 55dfd73 Compare March 30, 2023 07:53

markotoplak added 2 commits March 30, 2023 16:10

Fix RemoveMultinomial description

845e1c7

Fix setting migration

fd78b8b

markotoplak mentioned this pull request Mar 30, 2023

Our tests fail to test workflows #6379

Closed

markotoplak changed the title ~~Continuize: Specific options for variables~~ [ENH] Continuize: Specific options for variables Mar 30, 2023

markotoplak merged commit 45a1eec into biolab:master Mar 30, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Continuize: Specific options for variables #6181

[ENH] Continuize: Specific options for variables #6181

janezd commented Oct 22, 2022 •

edited

Loading

codecov bot commented Dec 23, 2022 •

edited

Loading

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023 •

edited

Loading

janezd commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023 •

edited

Loading

markotoplak commented Feb 9, 2023

janezd commented Feb 10, 2023

janezd commented Feb 10, 2023

markotoplak commented Feb 17, 2023 •

edited

Loading

markotoplak commented Mar 24, 2023

janezd commented Mar 26, 2023

janezd commented Mar 30, 2023

[ENH] Continuize: Specific options for variables #6181

[ENH] Continuize: Specific options for variables #6181

Conversation

janezd commented Oct 22, 2022 • edited Loading

Issue

Description of changes

Includes

codecov bot commented Dec 23, 2022 • edited Loading

Codecov Report

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023

markotoplak commented Feb 9, 2023

janezd commented Feb 9, 2023 • edited Loading

janezd commented Feb 9, 2023

janezd commented Feb 9, 2023

markotoplak commented Feb 9, 2023 • edited Loading

markotoplak commented Feb 9, 2023

janezd commented Feb 10, 2023

janezd commented Feb 10, 2023

markotoplak commented Feb 17, 2023 • edited Loading

markotoplak commented Mar 24, 2023

janezd commented Mar 26, 2023

janezd commented Mar 30, 2023

janezd commented Oct 22, 2022 •

edited

Loading

codecov bot commented Dec 23, 2022 •

edited

Loading

janezd commented Feb 9, 2023 •

edited

Loading

markotoplak commented Feb 9, 2023 •

edited

Loading

markotoplak commented Feb 17, 2023 •

edited

Loading