[FIX] Speed-up slow table_to_frame #5413

Merged
merged 1 commit into from
Apr 30, 2021

Conversation

PrimozGodec
Contributor

Issue

table_to_frame is slow on large datasets. On a dataset with shape (1M, 12), the conversion took ~20 s. The cause of the slowdown is a list comprehension used to check whether a numeric column contains any NaN values.

Description of changes

The list comprehension is now replaced with NumPy functions.
The transformation of the previously mentioned dataset now takes ~0.6 s.
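
For illustration, here is a minimal sketch of the kind of change described above. It is not the actual diff from this PR: the function names `has_nan_listcomp` and `has_nan_numpy` are hypothetical, and the sketch assumes the column values arrive as a 1-D float NumPy array.

```python
import numpy as np


def has_nan_listcomp(vals):
    # Hypothetical "before" pattern: calls np.isnan once per element
    # from Python and builds a list of every NaN found.
    return len([t for t in vals if np.isnan(t)]) > 0


def has_nan_numpy(vals):
    # Hypothetical "after" pattern: a single vectorised np.isnan call
    # over the whole array, reduced with .any().
    return bool(np.isnan(vals).any())


with_nan = np.array([1.0, 2.0, np.nan, 4.0])
without_nan = np.array([1.0, 2.0, 3.0])

print(has_nan_listcomp(with_nan), has_nan_numpy(with_nan))
print(has_nan_listcomp(without_nan), has_nan_numpy(without_nan))
```

Both functions return the same answer. The vectorised version avoids a Python-level function call per element, which is where most of the time goes on a column with a million rows.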

Includes
  • Code changes
  • Tests
  • Documentation

@borondics
Member

We started using bottleneck in Orange Spectroscopy. It is faster than NumPy. Maybe it would also make sense to use it here.
Check out Quasars/orange-spectroscopy#531

@codecov

codecov bot commented Apr 29, 2021

Codecov Report

Merging #5413 (f92b98e) into master (98af325) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #5413   +/-   ##
=======================================
  Coverage   86.37%   86.37%           
=======================================
  Files         303      303           
  Lines       62155    62154    -1     
=======================================
  Hits        53688    53688           
+ Misses       8467     8466    -1     

@janezd janezd self-assigned this Apr 30, 2021
@janezd janezd merged commit 7d2e9ff into biolab:master Apr 30, 2021