[FIX] Select Rows: Removing Unused Values for Discrete Variables in Sparse Data #2452

Merged
janezd merged 3 commits into biolab:master from the fix-remove-unused-values-sparse branch on Jul 21, 2017

Conversation

nikicc
Contributor

@nikicc nikicc commented Jul 5, 2017

Issue

When "Remove unused features" is checked in Select Rows and a discrete variable comes from sparse data, Orange crashes with the error "AttributeError: ravel not found".

The problem is in the remove_unused_values method (Orange/preprocess/remove.py) which cannot handle sparse matrices.
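The failure mode can be illustrated outside Orange. The following is a minimal sketch, assuming the column is stored as a SciPy sparse matrix; the variable names are hypothetical and this is not the code from Orange/preprocess/remove.py:

```python
import numpy as np
import scipy.sparse as sp

# A single discrete column with one missing value (hypothetical data).
dense_col = np.array([[0.0], [1.0], [np.nan], [1.0]])
sparse_col = sp.csr_matrix(dense_col)

# A dense column can be flattened directly.
flat = dense_col.ravel()

# SciPy sparse matrices do not provide ravel(), so the same call raises.
try:
    sparse_col.ravel()
    sparse_failed = False
except AttributeError:
    sparse_failed = True
```

Any code path that calls ravel() on a column without checking for sparsity first would hit this error.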

Issue: https://sentry.io/biolab/orange3/issues/269413491/

Description of changes
  • Implement the nanunique function, which returns the unique values of a matrix, excluding missing (np.nan) values, and works on both sparse and dense matrices.
  • Fix remove_unused_values to use nanunique and hence support sparse discrete columns.
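
One way such a function could behave is sketched below. This is an illustration under stated assumptions, not the implementation merged in this PR: the name nanunique_sketch is hypothetical, and it assumes that cells not stored in a sparse matrix count as zeros.

```python
import numpy as np
import scipy.sparse as sp


def nanunique_sketch(x):
    """Return sorted unique values of x, excluding NaN (dense or sparse)."""
    if sp.issparse(x):
        # Explicitly stored entries of the sparse matrix.
        values = np.asarray(x.tocsr().data, dtype=float)
        # Cells that are not stored are implicit zeros (assumption);
        # include 0 if at least one cell is not stored.
        n_cells = x.shape[0] * x.shape[1]
        if values.size < n_cells:
            values = np.append(values, 0.0)
    else:
        values = np.asarray(x, dtype=float).ravel()
    # Drop missing values before computing the unique set.
    return np.unique(values[~np.isnan(values)])


dense = np.array([[0.0, 1.0], [np.nan, 1.0]])
sparse = sp.csr_matrix(np.array([[2.0], [0.0], [np.nan], [2.0]]))

dense_result = nanunique_sketch(dense)
sparse_result = nanunique_sketch(sparse)
```

The sparse branch never converts the matrix to a dense array, so memory use stays proportional to the number of stored entries.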
Includes
  • Code changes
  • Tests
  • Documentation

@nikicc nikicc added the DH2017 and bug (A bug confirmed by the core team) labels Jul 5, 2017
@nikicc nikicc force-pushed the fix-remove-unused-values-sparse branch from b68b8c0 to a85dddb Compare July 7, 2017 11:54
@codecov-io

codecov-io commented Jul 7, 2017

Codecov Report

Merging #2452 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2452      +/-   ##
==========================================
- Coverage   74.51%   74.49%   -0.02%     
==========================================
  Files         321      321              
  Lines       56056    56055       -1     
==========================================
- Hits        41769    41760       -9     
- Misses      14287    14295       +8

@nikicc nikicc force-pushed the fix-remove-unused-values-sparse branch 4 times, most recently from 56aee9e to a2b2e92 Compare July 10, 2017 12:35
@nikicc nikicc force-pushed the fix-remove-unused-values-sparse branch from a2b2e92 to 103197d Compare July 18, 2017 14:00
@janezd janezd merged commit ae61acb into biolab:master Jul 21, 2017
@nikicc nikicc deleted the fix-remove-unused-values-sparse branch July 21, 2017 11:18
Labels
bug A bug confirmed by the core team

3 participants