Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Box Plot: Add box for missing group values #4292

Merged
merged 2 commits into from
Jan 7, 2020

Conversation

PrimozGodec
Copy link
Contributor

Issue

@lanzagar and I discussed that there should be also a box for missing values in subgropus for countinuous variables as it is done for discrete.

Description of changes

When the selected variable is continuous there is now box with missing values if they are present in the subgroup. It shows the distribution of values that are missing in the subgroups variable.

Includes
  • Code changes
  • Tests
  • Documentation

@codecov
Copy link

codecov bot commented Dec 22, 2019

Codecov Report

Merging #4292 into master will increase coverage by 0.09%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4292      +/-   ##
==========================================
+ Coverage   86.68%   86.77%   +0.09%     
==========================================
  Files         396      396              
  Lines       71510    71549      +39     
==========================================
+ Hits        61990    62090     +100     
+ Misses       9520     9459      -61

@@ -494,12 +494,13 @@ def compute_box_data(self):
self.dist = []
self.conts = contingency.get_contingency(
dataset, attr, self.group_var)
group_var_labels = self.group_var.values + ["Missing value"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking whether we could simplify the user experience by instead printing, say "missing thal" or "missing sepal length". Same for splitting by missing values. This would tell the user more clearly what those missing values are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a great idea. I added the names of the variables to the missing label.

@janezd
Copy link
Contributor

janezd commented Jan 6, 2020

I have one trivial suggestion, which you may feel free to reject or ignore if no other changes are made.

Screenshot 2020-01-06 at 17 05 21

I think writing missing thal in lower case would like nicer here, and even when other values are written in capitals, missing thal in lower case would not stick out.

I would also put the attribute name into (single) quotes.

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented Jan 6, 2020

@janezd I agree with you it looks nicer and also using a single quote more clearly shows that it is not the name of the value. It is modified.

@ajdapretnar ajdapretnar merged commit 08e2edd into biolab:master Jan 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants