Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Feature Constructor: Evaluate categorical variables to strings #5637

Merged
merged 3 commits into from
Nov 19, 2021

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Oct 8, 2021

Issue

Fixes #5510. That is, not the original issue but @markotoplak's comment that Feature Constructor should be modified to be more user-friendly.

Description of changes

Values of categorical features are now evaluated to strings, not indices.

Indices were not very useful, also because they referred to indices in a list whose order was not necessarily fixed, therefore this was unstable and dangerous. For this reason -- and because this behaviour wasn't documented anyway (see https://orangedatamining.com/widget-catalog/data/featureconstructor/) -- the only effort towards backward compatibility in this PR is to remove .value from expressions. Plus, the change is noted in the tooltip and documentation.

Includes
  • Code changes
  • Tests
  • Documentation

@codecov
Copy link

codecov bot commented Oct 8, 2021

Codecov Report

Merging #5637 (cb3ff14) into master (3634e37) will increase coverage by 0.03%.
The diff coverage is 95.31%.

@@            Coverage Diff             @@
##           master    #5637      +/-   ##
==========================================
+ Coverage   86.07%   86.11%   +0.03%     
==========================================
  Files         315      315              
  Lines       66048    66074      +26     
==========================================
+ Hits        56854    56901      +47     
+ Misses       9194     9173      -21     

@janezd janezd force-pushed the feature-constructor-discrete-strings branch 2 times, most recently from b301dff to 39dad0f Compare October 8, 2021 18:39
@janezd janezd added the needs discussion Core developers need to discuss the issue label Oct 22, 2021
@janezd janezd force-pushed the feature-constructor-discrete-strings branch from 39dad0f to 839e8bb Compare November 13, 2021 21:28
@janezd
Copy link
Contributor Author

janezd commented Nov 13, 2021

@lanzagar, I think you'll like this (if it works :).

An "Upgrade Expression" button now removes .values, while those without .values are replaced with a dictionary and the variable's value use as key. It's not fast, but should work.

@@ -330,6 +339,111 @@ def data(self, index, role=Qt.DisplayRole):
return super().data(index, role)


def freevars(exp, env):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has been moved above the class definition: it is called from migrate_context, which is called from meta class, which is sets up the class. Function below the class definition are not yet defined at that moment.

@janezd janezd removed the needs discussion Core developers need to discuss the issue label Nov 19, 2021
@janezd janezd force-pushed the feature-constructor-discrete-strings branch 2 times, most recently from a1e5733 to 7aa7570 Compare November 19, 2021 13:31
if mo.group(3) == ".value": # uses string; remove `.value`
return "".join(mo.group(1, 2, 4))
# Uses ints: get them by indexing
return "{" + mo.group(1) + \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return "{" + mo.group(1) + \
return mo.group(1) + "{" + \

@janezd janezd force-pushed the feature-constructor-discrete-strings branch from 7aa7570 to 1f399a7 Compare November 19, 2021 14:34
@lanzagar lanzagar changed the title Feature Constructor: Evaluate categorical variables to strings [ENH] Feature Constructor: Evaluate categorical variables to strings Nov 19, 2021
@lanzagar lanzagar merged commit dfe1cf0 into biolab:master Nov 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scatter Plot Categorical Misplacement
2 participants