-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Impute: Allow setting a default value for all numeric and time variables #5102
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5102 +/- ##
=======================================
Coverage 84.77% 84.77%
=======================================
Files 286 286
Lines 60074 60130 +56
=======================================
+ Hits 50927 50977 +50
- Misses 9147 9153 +6 |
a129d16
to
d07b43c
Compare
Try entering a value with a decimal comma. It breaks the widget. |
@@ -124,7 +128,7 @@ def __call__(self): | |||
|
|||
|
|||
class Default(BaseImputeMethod): | |||
name = "Value" | |||
name = "Fixed value" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new class was introduced below with name = "Fixed value". Was this one intentionally changed to the same name too (short names remain different)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It's more informative, looks better in the widget.
Orange/widgets/data/owimpute.py
Outdated
hlayout.addWidget(button) | ||
|
||
le = gui.lineEdit( | ||
None, self, "default_numeric", valueType=float, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.default_numeric
is a string by default, but this changes it to a float. This is inconsistent and probably helped with the below bug on line 308 (and others?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially planned to have a float, but then decided for str and forgot to remove this argument.
Orange/widgets/data/owimpute.py
Outdated
@@ -267,6 +303,11 @@ def create_imputer(self, method, *args): | |||
m = AsDefault() | |||
m.method = default | |||
return m | |||
elif method == Method.Default and not args: # global default values | |||
return impute.FixedValueByType( | |||
default_continuous=float(self.default_numeric or np.nan), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.default_numeric
can be 0 which evaluates to False so np.nan is used instead!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Fixed. Thanks.
d07b43c
to
b240aae
Compare
Not imputing anything: it works for me. Which data did you try and how? Could it be that you tried to impute 0 (the bug discovered by @lanzagar) or you used a wrong delimiter (see next point)? Decimal comma: I now changed the widget to use the current locale. Select Rows does the same. But this is inconsistent: I now need to enter Discrete values: they should come from a combo, I guess. Which values would it contain? For, say, heart disease data? Also, this should then be a context setting that would depend upon whether the selected value is present or not. Using a new value for all discrete variables is also problematic. What if this particular value name already exist for some variables? I found this too messy. |
You mean a bug discovered by me and reported by Mr @lanzagar? :D Yes, it was the case of 0. Discrete values: this is why I deleted a comment, because I realized it is too difficult to implement. So, my use case is usually having N discrete variables with values 0, 1. Some values are missing (sometimes there are just 1, no 0). And I want an easy way to impute in Orange, just fill all the missing values with 0. This is why I asked about discrete and this is why I tested with 0. I was told it is difficult to provide a reasonable solution for discrete. |
This data set still breaks impute. Well, I should probably say it breaks Data Table badly and other widgets less badly. Seems like TimeVariable is not formatted properly? |
73e3578
to
52792c2
Compare
52792c2
to
a7abf4f
Compare
a7abf4f
to
8286474
Compare
I believe this to be fixed now. (It worked at some point, but now it was broken for any file. Shame on me - I apparently changed something and haven't tested it again.) It can be merged now, but has to wait for #5117 (fix for xlsx), because it' s based on it. |
I managed to crash something while testing.
But for the life of me I can't reproduce now, so... |
Issue
Fixes #5065.
Includes