Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Trees: Fix classification from sparse data #2496

Merged
merged 1 commit into from
Jul 26, 2017

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Jul 26, 2017

Issue

Tree prediction for sparse data does not work.

Description of changes

Added a corresponding function in Cython.

Includes
  • Code changes
  • Tests

@nikicc nikicc added this to the 3.4.5 milestone Jul 26, 2017
@ajdapretnar
Copy link
Contributor

ajdapretnar commented Jul 26, 2017

Works. 🌈 🥇 💫

@janezd janezd changed the title Trees: Add classification from sparse data [FIX] Trees: Add classification from sparse data Jul 26, 2017
@janezd janezd changed the title [FIX] Trees: Add classification from sparse data [FIX] Trees: Fix classification from sparse data Jul 26, 2017
Copy link
Contributor

@nikicc nikicc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. @janezd, thanks for a prompt fixup!

@janezd
Copy link
Contributor Author

janezd commented Jul 26, 2017

Oops, this expects csr. @nikicc, it would be trivial to add a function for csc, too. Or do I just call to_csr?

@nikicc
Copy link
Contributor

nikicc commented Jul 26, 2017

@janezd I would just call to_csr. Do you have any idea what is going on with the Mosaic test on AppVeyor?

@codecov-io
Copy link

codecov-io commented Jul 26, 2017

Codecov Report

Merging #2496 into master will increase coverage by 0.05%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2496      +/-   ##
==========================================
+ Coverage   74.53%   74.59%   +0.05%     
==========================================
  Files         321      321              
  Lines       56134    56141       +7     
==========================================
+ Hits        41838    41876      +38     
+ Misses      14296    14265      -31

@janezd
Copy link
Contributor Author

janezd commented Jul 26, 2017

I added a separate function for csr; it's almost the same, just with different indexing.

I've no idea about Mosaic. Let's see if it fails again.

@kernc
Copy link
Contributor

kernc commented Jul 26, 2017

It's tremendous satisfaction working in a team that agrees on what to optimize for.

>>> import scipy.sparse
>>> csr = scipy.sparse.random(1e5, 1e4, format='csr')
>>> %timeit csr.tocsc()
417 ms ± 2.65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@janezd
Copy link
Contributor Author

janezd commented Jul 26, 2017

Half a second. If the code that I added is executed ten (OK, twenty) times (on the data of this size), the time saved by adding the code will roughly equal the time I spent by adding it.

@kernc
Copy link
Contributor

kernc commented Jul 26, 2017

Thanks, that's exactly what I meant. 😃

@nikicc
Copy link
Contributor

nikicc commented Jul 26, 2017

Related to #2370.

@nikicc nikicc merged commit 9137a30 into biolab:master Jul 26, 2017
@janezd janezd deleted the sparse-tree-reg branch April 5, 2019 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants