Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Continuize: Specific options for variables #6181

Merged
merged 18 commits into from
Mar 30, 2023

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Oct 22, 2022

Issue

Fixes #6108

Requires biolab/orange-widget-base#231.

Description of changes

This design kind of follows Discretize, but I'm not too happy with it.

Try selecting a bunch of variables of different kinds: both groups of radio buttons are enabled, yet clicking in one of them changes only categorical (or numeric) variables from the selection.

Allowing only single selected variable is bad from usability perspective: currently one can use filter to select variables with some common part of the name, select them all and set the option.

Maybe we should have two list views, one with categorical and one with numeric variables? And in case there are no categorical/numeric variables, the corresponding view and radio buttons are hidden? Yes.

Minor point: I reduced the option for class to a single check box. It turns binary class into 0 and 1, and, in general, any class to 0..(n-1). The only alternative option is to use 0..1, but this can be achieved with normalization in a later widget.

Screen Shot 2022-10-22 at 20 19 12

Waiting for design confirmation before completing the below list.

Includes

@janezd janezd force-pushed the continuize-per-variable branch 3 times, most recently from fdb3c6a to b29d6b0 Compare October 24, 2022 18:22
@janezd janezd self-assigned this Dec 23, 2022
@janezd janezd changed the title [RFC] Continuize: Specific options for variables Continuize: Specific options for variables Dec 23, 2022
@janezd janezd marked this pull request as ready for review December 23, 2022 20:21
@janezd janezd removed their assignment Dec 23, 2022
@codecov
Copy link

codecov bot commented Dec 23, 2022

Codecov Report

Merging #6181 (fd78b8b) into master (1bec1ca) will increase coverage by 0.06%.
The diff coverage is 97.42%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6181      +/-   ##
==========================================
+ Coverage   87.44%   87.51%   +0.06%     
==========================================
  Files         317      317              
  Lines       68331    68548     +217     
==========================================
+ Hits        59754    59989     +235     
+ Misses       8577     8559      -18     

@markotoplak
Copy link
Member

Minor point: I reduced the option for class to a single check box. It turns binary class into 0 and 1, and, in general, any class to 0..(n-1). The only alternative option is to use 0..1, but this can be achieved with normalization in a later widget.

This PR also removes the option that can create one feature per value. Perhaps because no widgets could make use of resulting multiple classes? We can still do the same action by moving the class into features and then using Continuize on that feature and moving it back - so the omission is not a big deal.

Like in Discretize, we can specify all options for metas, but Discretize also allows all options on classes. Why not here too? Preventing users to do stupid things? The separate (single) target option is even more confusing to me.

I also saw that there is a blank line between attributes and metas (which does not exist in Discretize). I like that there is some visual separation, but the first time I saw it I thought it was a bug.

I think we should:

  • Put class var in the corresponding list of variables and remove the separate option below.
  • Remove the blank line between attributes and metas. Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

If you disagree, I can also merge this PR as-is.

@janezd
Copy link
Contributor Author

janezd commented Feb 9, 2023

Put class var in the corresponding list of variables and remove the separate option below.

If we do this, the default continuization method will also apply to class. Instead of preventing the users from doing something stupid, we'd do it ourselves, but behind their backs. :)

I think this wouldn't be OK. If you dislike a separate checkbox, I'd prefer removing it altogether. What do you say?

Remove the blank line between attributes and metas.

I haven't noticed it. This will happen always when DomainModel is used in list view, because list views do not properly paint separators. (Even for combos, Qt uses a weird hack - the model has to pass a literal string "separator" as Qt.AccessibleDescriptionRole, and painter will show a line.)

We can fix that in general. Could you, for a start just derive owcontinuize.ListViewSearch.Delegate from orangewidget.utils.combobox._ComboBoxListDelegate: do you like what it looks like?

I can of course also remove the separator from the model.

Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

I did not understand this. Where should we write about metas and classes?

I was thinking, though, that list views and combo boxes that use domain models could have a delegate that would print, in smaller letters "Attributes", "Meta Attributes", "Targets" above separators. Even more, we could allow writing arbitrary text above separators. Would you find this useful?

@markotoplak
Copy link
Member

Put class var in the corresponding list of variables and remove the separate option below.

If we do this, the default continuization method will also apply to class. Instead of preventing the users from doing something stupid, we'd do it ourselves, but behind their backs. :)

I think this wouldn't be OK. If you dislike a separate checkbox, I'd prefer removing it altogether. What do you say?

I think we need to control whether to Continuize the class var or not, so if we do not mix it into the features, we need to keep the checkbox.

But doesn't Discretize, where the class goes into the features, suffer from a similar problem then? A possible solution would be not to automatically apply default options to the classes. They are special enough so that a user could consider them separately.

Remove the blank line between attributes and metas.

I haven't noticed it. This will happen always when DomainModel is used in list view, because list views do not properly paint separators.

Oh, I understand, it is OK. I compared this widget and Discretize and Discretize did not have it. With Discretize I have a related problem: you can not see which are features, which are metas, and which are classes.

Instead, we should add explicit variable type for metas and classes, perhaps in parentheses, so it would look like type (target): chosen continuization option.

I did not understand this. Where should we write about metas and classes?

I though about just adding a string - either "target" or "meta" - after each variable name in the list view. This PR adds continuization type after the name anyway so it is likely a cheap change.

I was thinking, though, that list views and combo boxes that use domain models could have a delegate that would print, in smaller letters "Attributes", "Meta Attributes", "Targets" above separators. Even more, we could allow writing arbitrary text above separators. Would you find this useful?

That would surely be an improvement of separators in this context. I don't particularly like them; I'd rather see the type for each variable separately (what is there are more metas that fit in the current view?). If space is an issue, perhaps a text "META" or "TARGET" written over the feature type icon, diagonally, so that it does not take any more space? (I understand the proposal is probably ugly and too confusing, so just an idea.)

Oh, long ago we discussed changing feature type icons but then just forgot about it. Now they are so confusing (C for Categorical or Continuous?, N for Numeric or Nominal?) that they do not mean anything. Even colored squares would convey the same meaning. Then we could even easily write META or TARGET onto the squares too. ;)

@janezd
Copy link
Contributor Author

janezd commented Feb 9, 2023

I think we need to control whether to Continuize the class var or not, so if we do not mix it into the features, we need to keep the checkbox.

Then I'd prefer keeping the checkbox. If you find it confusing, we can rephrase it.

But doesn't Discretize, where the class goes into the features, suffer from a similar problem then? A possible solution would be not to automatically apply default options to the classes. They are special enough so that a user could consider them separately.

But then the default would apply to all in the list except for one...

I though about just adding a string - either "target" or "meta" - after each variable name in the list view.

This would indeed be cheap, but it would add a lot of visual noise at little value. Why do you care at all whether a variable is an attribute or a meta - in the context of this widget?

Attributes and metas are grouped (unless we would sort the variables, which we never do - and I think we shouldn't), which is why adding a label before the group is imho better than repeating the same label after every name in the first part, and another in the second part. For icons, it would be the same - one type in the first half, another in the second. Furthermore, most data would have just one single type of icon. Plus, I like this icons and find them useful. (I learned that red is numeric and green is categorical, actually.)

@markotoplak
Copy link
Member

Plus, I like this icons and find them useful. (I learned that red is numeric and green is categorical, actually.)

Exactly what I was trying to point out about them. I'd absolutely keep the colors but the current lettering (C or N) only makes them confusing if you read them wrongly.

@markotoplak
Copy link
Member

But then the default would apply to all in the list except for one...

Well, yes, almost. Now that you pointed it out, I do not think applying the default choice to metas makes any sense either.

Why do you care at all whether a variable is an attribute or a meta - in the context of this widget?

In the context of this widget I care a lot. I would not normally want to continuize metas or classes unless I'd explicitly decide for it. And even then, only certain metas would be continuized. So clearly showing what operation goes where would be a priority for me.

Therefore, metas and classes seem approximately equal to me in the context of this widget: none should be attacked by default (or even connected with default) and both could have the same interface. Separating between them in terms of how can I modify them seems very artificial. Showing what I am dealing with, a necessity.

I agree that added strings are additional visual noise, but are clearer. Better separators could be a possible solution with other compromises.

@janezd
Copy link
Contributor Author

janezd commented Feb 9, 2023

Yes, but I still think that using having a long list of A's in almost every list view would look stranger than a mixture of C's and N's. Also because I mostly don't care.

What about this?

Screenshot 2023-02-09 at 19 44 19

Screenshot 2023-02-09 at 19 52 31

Implementation is simple and widgets would mostly just need to pass an additional flag to DomainModel (or specify a different order) to indicate they want this. But I would only use it where relevant, though. I'll be on zoom tomorrow, but you can explain why do you care about distinguishing attributes from metas so much. To me, it's mostly visual clutter.

@janezd
Copy link
Contributor Author

janezd commented Feb 9, 2023

N.B. After seeing the last screenshot, in which I resized the widget, I added stretch below radio buttons, so they don't spread. :)

@janezd
Copy link
Contributor Author

janezd commented Feb 9, 2023

My comment (Yes, but...) did not refer to your last comment ( #6181 (comment)). Now I understand why you care about metas here. If so, we should have separate defaults for attributes and for metas? Doable, though I'd hate doing it. Or, as you said, we can make it clear that default applies only to attributes, by renaming it to "Default for ordinary attributes".

@markotoplak
Copy link
Member

markotoplak commented Feb 9, 2023

Now I understand why you care about metas here. If so, we should have separate defaults for attributes and for metas? Doable, though I'd hate doing it.

I agree, that would not work well. I saw firsthand that even this default selection was already confusing to some, imagine if we had add another.

Or, as you said, we can make it clear that default applies only to attributes, by renaming it to "Default for ordinary attributes".

As it is now, any kind of continuization is possible so I am only guessing what I think would be the least confusing. I guess applying operations default operations to metas is confusing. Therefore, the "Default for ordinary attributes" suggestion makes the most sense to me.

And if we decide to go that way, classes could naturally come into attribute lists, which solves my second source of confusion. :)

But these are only my guesses.

@markotoplak
Copy link
Member

What about this?

Implementation is simple and widgets would mostly just need to pass an additional flag to DomainModel (or specify a different order) to indicate they want this. But I would only use it where relevant, though.

I like it and agree we should only use it where the distinction is informative.

@janezd
Copy link
Contributor Author

janezd commented Feb 10, 2023

@markotoplak, I'm awaiting your comments before fixing and adding tests.

@janezd
Copy link
Contributor Author

janezd commented Feb 10, 2023

You may also try it with and without c10d5e7. With this commit, labels do not flash it the user merely passes over the widget with a mouse.

We may also want to disable tooltips in this case.

@janezd janezd force-pushed the continuize-per-variable branch 3 times, most recently from 3592756 to 3834f8e Compare February 12, 2023 21:45
@markotoplak
Copy link
Member

markotoplak commented Feb 17, 2023

I like the tooltip shown as a better-annotated list. The tooltip delay looks good.

The current version also applies the chosen preset to the metas by default. I like that the Targets are treated differently, and I would prefer the same treatment for metas. Because I know you already had a version doing so I imagine you have a good reason to go back to previous behaviour.

If metas get the preset applied as a default behaviour, we need to migrate settings carefully, because the previous version of the widget just ignored the metas and passed them through as they were. (I tried testing migration just to confirm what happens in master - I fully expected this not to work the same - but then I even got an error when opening the workflow.)

If separate behaviour for attributes and metas is too problematic despite the new tooltip-like descriptions (I think is is fine - if it works for classes...), I would prefer that this widget completely ignores metas rather than continuizing them as set in the preset.

@janezd janezd force-pushed the continuize-per-variable branch 5 times, most recently from 625b9cf to 27df8e7 Compare March 23, 2023 22:32
@markotoplak
Copy link
Member

@janezd, I added two commits, the first fixing some messages and the second fixing migration.

Migration was wrong, and some tests that actually opened old workflows were failing. The problem was that the old settings described some index into a list that did not follow the order from Continuizer.MultinomialTreatment. Therefore I added lists to map old indices to new ones.

Please double-check if what I did makes sense. If you agree, just merge.

@janezd
Copy link
Contributor Author

janezd commented Mar 26, 2023

Thanks, @markotoplak.

I would merge this after we release Slovenian Orange (tomorrow, hopefully); not so much because of possible bugs but because it could requires new translations. And possibly also fixes in tests, like those is #6373 or biolab/orange-translations#60.

@janezd
Copy link
Contributor Author

janezd commented Mar 30, 2023

@markotoplak, I rebased to resolve conflicts, but forgot to fetch your commits before doing that. Can you please reapply them?

Interestingly, tests on old workflows failed because of this(?), but only on PyQt6?!

@markotoplak markotoplak changed the title Continuize: Specific options for variables [ENH] Continuize: Specific options for variables Mar 30, 2023
@markotoplak markotoplak merged commit 45a1eec into biolab:master Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Continuize widget: allow selection and different treatment for individual variables (just like Discretize)
2 participants