[onert] Fix loss value difference #13736
Comments
The loss issue occurs with SGD + CCE as well as Adam + CCE.
That's sad news 😢
SGD + CCE
Epoch 1/5
100/100 [==============================] - 0s 813us/step - loss: 7.7064 - categorical_accuracy: 0.2010
Epoch 2/5
100/100 [==============================] - 0s 728us/step - loss: 8.7290 - categorical_accuracy: 0.2060
Epoch 3/5
100/100 [==============================] - 0s 695us/step - loss: 9.1582 - categorical_accuracy: 0.1720
Epoch 4/5
100/100 [==============================] - 0s 736us/step - loss: 9.2479 - categorical_accuracy: 0.1420
Epoch 5/5
100/100 [==============================] - 0s 727us/step - loss: 9.1316 - categorical_accuracy: 0.1240
==========================
Total time: 0.5966
$ ./Product/x86_64-linux.release/out/bin/onert_train --loss 2 --optimizer 1 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = sgd
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 0.488ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0860
Epoch 2/5 - time: 0.460ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0860
Epoch 3/5 - time: 0.527ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0860
Epoch 4/5 - time: 0.437ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0860
Epoch 5/5 - time: 0.452ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0860
===================================
MODEL_LOAD takes 0.3240 ms
PREPARE takes 2.0710 ms
EXECUTE takes 241.4750 ms
- Epoch 1 takes 48.8490 ms
- Epoch 2 takes 46.0450 ms
- Epoch 3 takes 52.7080 ms
- Epoch 4 takes 43.6880 ms
- Epoch 5 takes 45.1500 ms
===================================
Adam + CCE
Epoch 1/5
100/100 [==============================] - 0s 919us/step - loss: 9.4015 - categorical_accuracy: 0.1940
Epoch 2/5
100/100 [==============================] - 0s 866us/step - loss: 9.6065 - categorical_accuracy: 0.2080
Epoch 3/5
100/100 [==============================] - 0s 880us/step - loss: 9.6064 - categorical_accuracy: 0.2090
Epoch 4/5
100/100 [==============================] - 0s 868us/step - loss: 9.6064 - categorical_accuracy: 0.2090
Epoch 5/5
100/100 [==============================] - 0s 844us/step - loss: 9.6064 - categorical_accuracy: 0.2090
==========================
Total time: 0.6848
$ ./Product/x86_64-linux.release/out/bin/onert_train --loss 2 --optimizer 2 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = adam
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 0.701ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0920
Epoch 2/5 - time: 0.608ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0920
Epoch 3/5 - time: 0.610ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0920
Epoch 4/5 - time: 0.612ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0920
Epoch 5/5 - time: 0.602ms/step - loss: [0] nan - categorical_accuracy: [0] 0.0920
===================================
MODEL_LOAD takes 0.8570 ms
PREPARE takes 4.3150 ms
EXECUTE takes 320.9320 ms
- Epoch 1 takes 70.1430 ms
- Epoch 2 takes 60.7580 ms
- Epoch 3 takes 60.9560 ms
- Epoch 4 takes 61.2370 ms
- Epoch 5 takes 60.1600 ms
===================================
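A plausible explanation for the nan losses above (my own sketch, not the actual onert kernel): categorical cross entropy takes log(prediction), which is only defined for positive values. When the model has no softmax, its raw outputs can be negative, and log of a negative number is nan. The helper name below is mine, for illustration:

```python
import numpy as np

def cce_without_softmax(y_true, y_pred):
    # Categorical cross entropy applied directly to the model output:
    # -sum(y_true * log(y_pred)). This is only valid when y_pred is a
    # probability distribution; raw layer outputs may be negative.
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])

# Probabilities (e.g. after softmax): the loss is finite.
probs = np.array([[0.2, 0.7, 0.1]])
print(cce_without_softmax(y_true, probs))

# Raw outputs straight from the last layer: log(-0.4) is nan,
# so the whole loss becomes nan.
raw = np.array([[1.3, -0.4, 2.1]])
print(cce_without_softmax(y_true, raw))
```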
SGD + CCE
Epoch 1/5
100/100 [==============================] - 0s 781us/step - loss: 2.3559 - categorical_accuracy: 0.2210
Epoch 2/5
100/100 [==============================] - 0s 714us/step - loss: 1.8817 - categorical_accuracy: 0.3970
Epoch 3/5
100/100 [==============================] - 0s 720us/step - loss: 1.6417 - categorical_accuracy: 0.4920
Epoch 4/5
100/100 [==============================] - 0s 652us/step - loss: 1.4734 - categorical_accuracy: 0.5630
Epoch 5/5
100/100 [==============================] - 0s 611us/step - loss: 1.3445 - categorical_accuracy: 0.6000
==========================
Total time: 0.5747
$ ./Product/x86_64-linux.release/out/bin/onert_train --loss 2 --optimizer 1 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = sgd
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 0.480ms/step - loss: [0] 1.5273 - categorical_accuracy: [0] 0.1010
Epoch 2/5 - time: 0.480ms/step - loss: [0] 0.9308 - categorical_accuracy: [0] 0.1010
Epoch 3/5 - time: 0.463ms/step - loss: [0] 0.7696 - categorical_accuracy: [0] 0.1010
Epoch 4/5 - time: 0.436ms/step - loss: [0] 0.6822 - categorical_accuracy: [0] 0.1010
Epoch 5/5 - time: 0.451ms/step - loss: [0] 0.6233 - categorical_accuracy: [0] 0.1010
===================================
MODEL_LOAD takes 0.3980 ms
PREPARE takes 2.2420 ms
EXECUTE takes 235.7700 ms
- Epoch 1 takes 47.9630 ms
- Epoch 2 takes 48.0220 ms
- Epoch 3 takes 46.3070 ms
- Epoch 4 takes 43.6420 ms
- Epoch 5 takes 45.0810 ms
===================================
Adam + CCE
Epoch 1/5
100/100 [==============================] - 0s 901us/step - loss: 1.0923 - categorical_accuracy: 0.5950
Epoch 2/5
100/100 [==============================] - 0s 831us/step - loss: 0.6341 - categorical_accuracy: 0.7810
Epoch 3/5
100/100 [==============================] - 0s 823us/step - loss: 0.5396 - categorical_accuracy: 0.8160
Epoch 4/5
100/100 [==============================] - 0s 812us/step - loss: 0.4943 - categorical_accuracy: 0.8130
Epoch 5/5
100/100 [==============================] - 0s 859us/step - loss: 0.4609 - categorical_accuracy: 0.8230
==========================
Total time: 0.6680
$ ./Product/x86_64-linux.release/out/bin/onert_train --loss 2 --optimizer 2 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = adam
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 0.627ms/step - loss: [0] 1.0886 - categorical_accuracy: [0] 0.0850
Epoch 2/5 - time: 0.612ms/step - loss: [0] 0.6423 - categorical_accuracy: [0] 0.0850
Epoch 3/5 - time: 0.618ms/step - loss: [0] 0.5578 - categorical_accuracy: [0] 0.0850
Epoch 4/5 - time: 0.605ms/step - loss: [0] 0.5004 - categorical_accuracy: [0] 0.0850
Epoch 5/5 - time: 0.601ms/step - loss: [0] 0.4618 - categorical_accuracy: [0] 0.0850
===================================
MODEL_LOAD takes 0.3410 ms
PREPARE takes 2.4010 ms
EXECUTE takes 312.7880 ms
- Epoch 1 takes 62.7390 ms
- Epoch 2 takes 61.1770 ms
- Epoch 3 takes 61.8160 ms
- Epoch 4 takes 60.4690 ms
- Epoch 5 takes 60.0820 ms
===================================
In the draft, the model without softmax trains well once a kernel that runs softmax internally is introduced.
Epoch 1/5
1000/1000 [==============================] - 1s 516us/step - loss: 1.0897 - categorical_accuracy: 0.6000
Epoch 2/5
1000/1000 [==============================] - 1s 503us/step - loss: 0.6761 - categorical_accuracy: 0.7450
Epoch 3/5
1000/1000 [==============================] - 1s 500us/step - loss: 0.5629 - categorical_accuracy: 0.7920
Epoch 4/5
1000/1000 [==============================] - 1s 503us/step - loss: 0.4945 - categorical_accuracy: 0.8270
Epoch 5/5
1000/1000 [==============================] - 1s 504us/step - loss: 0.4447 - categorical_accuracy: 0.8380
==========================
Total time: 2.7670
$ ./Product/x86_64-linux.release/out/bin/onert_train --loss 2 --optimizer 2 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 1 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 1
- loss_info = {loss = categorical crossentropy, reduction = sum over batch size}
- optimizer = adam
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 0.047ms/step - loss: [0] 1.0897 - categorical_accuracy: [0] 0.0950
Epoch 2/5 - time: 0.044ms/step - loss: [0] 0.6761 - categorical_accuracy: [0] 0.0950
Epoch 3/5 - time: 0.044ms/step - loss: [0] 0.5628 - categorical_accuracy: [0] 0.0950
Epoch 4/5 - time: 0.044ms/step - loss: [0] 0.4945 - categorical_accuracy: [0] 0.0950
Epoch 5/5 - time: 0.050ms/step - loss: [0] 0.4447 - categorical_accuracy: [0] 0.0950
===================================
MODEL_LOAD takes 0.1380 ms
PREPARE takes 1.4940 ms
EXECUTE takes 257.7260 ms
- Epoch 1 takes 46.8360 ms
- Epoch 2 takes 43.6010 ms
- Epoch 3 takes 44.4730 ms
- Epoch 4 takes 44.1270 ms
- Epoch 5 takes 49.8270 ms
===================================
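The kernel that runs softmax internally is presumably a fused softmax-plus-cross-entropy operation. A minimal, numerically stable sketch of that idea (my own NumPy reconstruction, not the draft's actual kernel): using log-softmax with a max shift keeps exp() from overflowing and never takes the log of zero or of a negative number, so the loss stays finite even for raw logits.

```python
import numpy as np

def softmax_cross_entropy(y_true, logits):
    # Fused softmax + categorical cross entropy on raw logits.
    # Shift by the row max before exponentiating for numerical stability,
    # then compute log-softmax directly instead of log(softmax(...)).
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    return -np.sum(y_true * log_probs, axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])
logits = np.array([[1.3, -0.4, 2.1]])

# Finite even though one logit is negative.
print(softmax_cross_entropy(y_true, logits))
```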
After applying #13944, the loss values match the TensorFlow results when using MSE.
Epoch 1/5
100/100 [==============================] - 0s 545us/step - loss: 0.2282 - categorical_accuracy: 0.1270
Epoch 2/5
100/100 [==============================] - 0s 598us/step - loss: 0.1866 - categorical_accuracy: 0.1560
Epoch 3/5
100/100 [==============================] - 0s 552us/step - loss: 0.1752 - categorical_accuracy: 0.1720
Epoch 4/5
100/100 [==============================] - 0s 478us/step - loss: 0.1663 - categorical_accuracy: 0.1850
Epoch 5/5
100/100 [==============================] - 0s 483us/step - loss: 0.1590 - categorical_accuracy: 0.1910
==========================
Total time: 0.5044
$ ./Product/x86_64-linux.debug/out/bin/onert_train --loss 1 --optimizer 1 --loss_reduction_type 1 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = sgd
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 1.354ms/step - loss: [0] 0.2282 - categorical_accuracy: [0] 0.1040
Epoch 2/5 - time: 1.242ms/step - loss: [0] 0.1866 - categorical_accuracy: [0] 0.1040
Epoch 3/5 - time: 1.292ms/step - loss: [0] 0.1752 - categorical_accuracy: [0] 0.1040
Epoch 4/5 - time: 1.295ms/step - loss: [0] 0.1663 - categorical_accuracy: [0] 0.1040
Epoch 5/5 - time: 1.290ms/step - loss: [0] 0.1590 - categorical_accuracy: [0] 0.1040
===================================
MODEL_LOAD takes 1.4580 ms
PREPARE takes 18.8870 ms
EXECUTE takes 661.0470 ms
- Epoch 1 takes 135.3970 ms
- Epoch 2 takes 124.2470 ms
- Epoch 3 takes 129.2360 ms
- Epoch 4 takes 129.4980 ms
- Epoch 5 takes 128.9610 ms
===================================
Epoch 1/5
100/100 [==============================] - 0s 653us/step - loss: 1.7125 - categorical_accuracy: 0.3630
Epoch 2/5
100/100 [==============================] - 0s 604us/step - loss: 1.0092 - categorical_accuracy: 0.5570
Epoch 3/5
100/100 [==============================] - 0s 586us/step - loss: 0.8466 - categorical_accuracy: 0.6260
Epoch 4/5
100/100 [==============================] - 0s 579us/step - loss: 0.7470 - categorical_accuracy: 0.6750
Epoch 5/5
100/100 [==============================] - 0s 599us/step - loss: 0.6754 - categorical_accuracy: 0.7040
==========================
Total time: 0.5538
$ ./Product/x86_64-linux.debug/out/bin/onert_train --loss 1 --optimizer 2 --loss_reduction_type 2 --learning_rate 0.001 --batch_size 10 --num_of_trainable_ops -1 --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --metric 0 model.circle
Model Filename model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 10
- loss_info = {loss = mean squared error, reduction = sum}
- optimizer = adam
- num_of_trainable_ops = -1
========================
Epoch 1/5 - time: 1.519ms/step - loss: [0] 1.7125 - categorical_accuracy: [0] 0.0890
Epoch 2/5 - time: 1.500ms/step - loss: [0] 1.0092 - categorical_accuracy: [0] 0.0890
Epoch 3/5 - time: 2.238ms/step - loss: [0] 0.8466 - categorical_accuracy: [0] 0.0890
Epoch 4/5 - time: 1.749ms/step - loss: [0] 0.7470 - categorical_accuracy: [0] 0.0890
Epoch 5/5 - time: 1.540ms/step - loss: [0] 0.6754 - categorical_accuracy: [0] 0.0890
===================================
MODEL_LOAD takes 1.5080 ms
PREPARE takes 15.0090 ms
EXECUTE takes 875.6840 ms
- Epoch 1 takes 151.8760 ms
- Epoch 2 takes 150.0400 ms
- Epoch 3 takes 223.8330 ms
- Epoch 4 takes 174.8910 ms
- Epoch 5 takes 154.0180 ms
===================================
To apply normalization (softmax) automatically in categorical cross entropy, we need to handle the case where the sum of the labels is not 1.
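One concrete reason the label sum matters (a sketch of my own, not code from this issue): the well-known shortcut gradient of fused softmax + categorical cross entropy with respect to the logits, softmax(z) - y, only holds when sum(y) = 1. In general, for L = -sum_i y_i log(softmax(z)_i), the gradient is softmax(z) * sum(y) - y. A numerical check:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def cce(y, z):
    # Categorical cross entropy on logits via softmax.
    return -np.sum(y * np.log(softmax(z)))

z = np.array([1.3, -0.4, 2.1])
y = np.array([0.0, 0.5, 0.0])  # label sum is 0.5, not 1

# Central-difference numeric gradient of the loss w.r.t. each logit.
eps = 1e-6
num_grad = np.array([
    (cce(y, z + eps * np.eye(3)[j]) - cce(y, z - eps * np.eye(3)[j])) / (2 * eps)
    for j in range(3)
])

p = softmax(z)
print(np.allclose(num_grad, p * np.sum(y) - y, atol=1e-5))  # True: general formula
print(np.allclose(num_grad, p - y, atol=1e-5))              # False: shortcut breaks
```

This is why a kernel that silently assumes sum(y) = 1 can produce losses and gradients that diverge from TensorFlow when labels are not one-hot.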
What
Let's fix the loss value differences from TensorFlow in some cases.
Why
We found out that loss values differed from TensorFlow when using combinations such as the Adam optimizer with categorical cross entropy in branching models such as the model below:
Since MobileNet v2 also has branches, its loss values differ as well.
From @jyoungyun
Required tasks
- CategoricalCrossentropy
- batch_size != 1, using the reduction type sum and the loss type mse
- batch_size != 1, using the reduction type sum_over_batch_size and CategoricalCrossentropy
- Draft #13934
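For context on the two reduction types named in the tasks: they differ only by a factor of the batch size. A hypothetical sketch of the distinction (function name mine, not onert's API):

```python
import numpy as np

def mse_per_sample(y_true, y_pred):
    # Mean squared error per sample (mean over the feature axis).
    return np.mean((y_true - y_pred) ** 2, axis=-1)

y_true = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_pred = np.array([[0.9, 0.2], [0.1, 0.7], [0.8, 0.6]])

per_sample = mse_per_sample(y_true, y_pred)
batch_size = y_true.shape[0]

loss_sum = np.sum(per_sample)                # reduction type: sum
loss_sobs = np.sum(per_sample) / batch_size  # reduction type: sum_over_batch_size

# The two results differ exactly by the batch size factor.
print(loss_sum, loss_sobs)
```

This factor matters when comparing against TensorFlow, whose Keras losses default to sum-over-batch-size reduction.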