
[onert] Apply softmax to CategoricalCrossEntropy automatically #14028

Open
ragmani opened this issue Sep 13, 2024 · 1 comment

Comments


ragmani commented Sep 13, 2024

What

Let's apply softmax (normalization) automatically when the CategoricalCrossEntropy loss is used.

Why

There is no guarantee that a pre-trained model already ends with a softmax when the CategoricalCrossEntropy loss is applied to it.

To do

  • Investigate whether a softmax exists when a circle model is created from PyTorch and TensorFlow.
    • PyTorch : Circle models don't have softmax. PyTorch does not add softmax to models even when training with CategoricalCrossEntropy.
    • TensorFlow : Circle models may have softmax if users add it. TensorFlow trains well with CategoricalCrossEntropy by applying softmax only once per step, even if the model already contains a softmax.
  • Apply softmax to CategoricalCrossEntropy automatically

To apply normalization (softmax) automatically in categorical cross entropy, we need to consider that the sum of the labels may not be 1.
That consideration will be dealt with in another issue later. So, I'm closing this issue since I have completed all other required tasks.
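As a minimal sketch of what "softmax applied automatically inside the loss" could look like, here is an illustrative NumPy version (not onert's actual implementation; the helper names are made up here, and it assumes label rows that sum to 1, which is exactly the case this issue does not yet cover):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def categorical_cross_entropy_with_softmax(logits, labels, eps=1e-7):
    # Normalize the raw model outputs first, then compute the usual
    # -sum(label * log(prob)) loss. Assumes each label row sums to 1;
    # labels that do not sum to 1 are the deferred case mentioned above.
    probs = softmax(logits)
    return -np.sum(labels * np.log(probs + eps), axis=-1)

logits = np.array([[2.0, 1.0, 0.1]])   # raw outputs, no softmax in the model
labels = np.array([[1.0, 0.0, 0.0]])   # one-hot target
loss = categorical_cross_entropy_with_softmax(logits, labels)
```

With this shape of API, the user passes raw logits and the loss takes care of normalization, which is the behavior the issue proposes.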

Originally posted by @ragmani in #13736 (comment)


ragmani commented Sep 25, 2024

Four kinds of circle models pre-trained with CategoricalCrossEntropy can be created:

  1. A model with softmax, trained by executing softmax once for each step (on TensorFlow).
  2. A model without softmax, trained by executing softmax once for each step (on TensorFlow and PyTorch).
  3. A model with softmax, trained by executing softmax twice for each step (on PyTorch).
  4. A model without softmax, trained without executing softmax (on TensorFlow).

TensorFlow does not allow training without softmax when using the CategoricalCrossEntropy loss.
PyTorch only has CrossEntropy instead of CategoricalCrossEntropy. CrossEntropy admits inputs that are not softmax outputs, so PyTorch strictly executes the CrossEntropy loss logic as given, without checking the model for an existing softmax.
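To see why case 3 (a softmax already in the model, plus the loss applying its own) behaves differently, here is a small NumPy illustration of the double-softmax effect (illustrative only, not onert code):

```python
import numpy as np

def softmax(x):
    # Standard numerically-stable softmax.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

logits = np.array([4.0, 1.0, 0.0])
once = softmax(logits)   # the distribution the loss is supposed to see
twice = softmax(once)    # model already ends in softmax, loss applies it again
# The second softmax flattens the distribution toward uniform, so the loss
# and its gradients no longer correspond to the intended probabilities.
```

This is why applying softmax automatically needs some way to guarantee it runs exactly once, for example by detecting an existing softmax at the end of the model.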

I think it's OK to apply softmax automatically when users use the CategoricalCrossEntropy loss in onert, as TensorFlow does.
