Bugs in Metrics Calculation #36
Answered by ramber1836
shun-zheng asked this question in BaiduKDDCup2022
Replies: 3 comments, 4 replies
-
For the first, yes, you're right. Thanks for pointing that out, this will
be fixed soon.
For the second, the negative 'Patv' values are already filtered out by
raw_data['Patv'] < 0, so only the zero 'Patv' values still need to be taken
into account.
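As a rough illustration of the point above (this is not the actual evaluation code, and the toy values are made up), the two masks differ only on rows where 'Patv' is exactly zero:

```python
import pandas as pd

# Hypothetical toy frame; only the 'Patv' column name comes from the thread.
raw_data = pd.DataFrame({"Patv": [-0.3, 0.0, 12.5, 0.0, 101.2]})

# Mask already used for filtering: strictly negative values.
neg_mask = raw_data["Patv"] < 0

# Mask proposed in the question: negative or zero values.
nonpos_mask = raw_data["Patv"] <= 0

# Rows caught by the second mask but not by the first are the zeros.
zero_only = nonpos_mask & ~neg_mask

print(raw_data.assign(neg=neg_mask, nonpos=nonpos_mask, zero_only=zero_only))
```

Whether the zero rows should be excluded is a separate decision; the sketch only shows which rows each comparison selects.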
All The Best,
Xinjiang Lu
On Fri, May 6, 2022 at 7:19 PM Shun Zheng wrote:
It seems that the current evaluation code (commit 2de75e8) still has some bugs.
[image: image]
<https://user-images.githubusercontent.com/5029717/167121556-73018798-7902-4dc1-a0cc-45be1e4098ae.png>
The correct code should be
nan_cond = pd.isna(raw_data).any(axis=1)
raw_data['Patv'] <= 0
-
The current one should be adopted.
All The Best,
Xinjiang Lu
On Fri, May 6, 2022 at 7:59 PM Shun Zheng wrote:
BTW, there are two metrics.py files, which one should we use as the
evaluation standard?
Answer selected by
ramber1836
-
The condition is meant to flag rows with any missing value (not just
missing target values), so I prefer to stick to the current version,
i.e. "nan_cond = pd.isna(raw_data).any(axis=1)".
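To make the difference between the two candidate conditions concrete, here is a minimal sketch on a made-up frame. The 'Wspd' column and all values are assumptions for illustration only; the real raw_data has more columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one row is missing a feature, one is missing the target.
raw_data = pd.DataFrame({
    "Wspd": [5.1, np.nan, 6.3, 4.8],
    "Patv": [310.0, 295.0, np.nan, 280.0],
})

# Current version: a row is flagged if ANY column is missing.
row_nan_cond = pd.isna(raw_data).any(axis=1)

# Alternative raised in the question: flag only rows with a missing target.
target_nan_cond = pd.isna(raw_data["Patv"])

# Rows flagged by the row-wise condition but not by the target-only one.
only_row = row_nan_cond & ~target_nan_cond

print(raw_data.assign(row_nan=row_nan_cond, target_nan=target_nan_cond))
```

In this toy case the row-wise condition flags one extra row (the one with a missing feature but a present target), which is the kind of row the two conditions treat differently.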
The performance reported in the newly updated report is about 42.xx.
BTW, the performance you reported is promising. It would be great if your
model keeps doing well on the upcoming test data.
All The Best,
Xinjiang Lu
On Fri, May 6, 2022 at 8:05 PM Shun Zheng wrote:
Besides, after using the correct code, 'nan_cond =
pd.isna(raw_data).any(axis=1)', the evaluation error can be significantly
smaller than the initial one. In my case, I got about 4.xx on the test
set (47.xx is reported in your specifications).
So I am curious whether this nan_cond is appropriate; maybe
pd.isna(raw_data['Patv']) is a better choice.
Looking forward to your final evaluation metric.
-
It seems that the current evaluation code (commit 2de75e8) still has some bugs.
The correct code should be
nan_cond = pd.isna(raw_data).any(axis=1)
Besides, the previous evaluation code has also been updated, but with different logic.
So, which one should we follow as the final evaluation standard?