[Feature] NJT with lengths #1021

Merged: 7 commits, Oct 4, 2024

vmoens (Contributor) commented Oct 2, 2024

[ghstack-poisoned]
@facebook-github-bot added the CLA Signed label Oct 2, 2024

github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}62$. Worsened: $\large\color{#d91a1a}9$.
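
The summary counts are consistent with a 5% threshold on the Change column: in the detailed CPU results, exactly 62 rows are above +5% and 9 are below -5% (the bold entries). The sketch below shows that reading. The threshold and the function names are assumptions inferred from the table, not taken from the benchmark bot's configuration.

```python
# Assumption: a benchmark counts as improved or worsened only when its
# relative throughput change exceeds 5% in magnitude. The threshold is
# inferred from the table, not stated by the bot.
THRESHOLD = 0.05


def relative_change(ops: float, ops_head: float) -> float:
    """Relative change in throughput versus the repo HEAD baseline."""
    return (ops - ops_head) / ops_head


def classify(ops: float, ops_head: float, threshold: float = THRESHOLD) -> str:
    """Label a benchmark as improved, worsened, or unchanged."""
    change = relative_change(ops, ops_head)
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD), both in KOps/s, copied from the CPU table.
rows = [
    ("test_plain_set_nested", 39.9681, 39.4212),    # reported as +1.39%
    ("test_unlock_nested", 1.9280, 2.2779),         # reported as -15.36%
    ("test_setitem_dim[tuple]", 20.8109, 18.8803),  # reported as +10.23%
]
labels = {name: classify(ops, head) for name, ops, head in rows}
```

Under this reading, small positive or negative changes (most of the table) count toward neither total.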

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 53.3800μs 25.0200μs 39.9681 KOps/s 39.4212 KOps/s $\color{#35bf28}+1.39\%$
test_plain_set_stack_nested 66.4550μs 25.2316μs 39.6329 KOps/s 39.0633 KOps/s $\color{#35bf28}+1.46\%$
test_plain_set_nested_inplace 71.5340μs 27.7053μs 36.0941 KOps/s 35.8677 KOps/s $\color{#35bf28}+0.63\%$
test_plain_set_stack_nested_inplace 76.3630μs 27.3828μs 36.5192 KOps/s 35.9755 KOps/s $\color{#35bf28}+1.51\%$
test_items 25.8090μs 4.1907μs 238.6219 KOps/s 242.2509 KOps/s $\color{#d91a1a}-1.50\%$
test_items_nested 0.5462ms 0.3809ms 2.6255 KOps/s 2.6089 KOps/s $\color{#35bf28}+0.63\%$
test_items_nested_locked 0.6870ms 0.3807ms 2.6270 KOps/s 2.6261 KOps/s $\color{#35bf28}+0.04\%$
test_items_nested_leaf 0.1232ms 81.9748μs 12.1989 KOps/s 12.3750 KOps/s $\color{#d91a1a}-1.42\%$
test_items_stack_nested 0.5924ms 0.3865ms 2.5871 KOps/s 2.5848 KOps/s $\color{#35bf28}+0.09\%$
test_items_stack_nested_leaf 0.1189ms 83.1433μs 12.0274 KOps/s 11.9737 KOps/s $\color{#35bf28}+0.45\%$
test_items_stack_nested_locked 0.7364ms 0.3871ms 2.5836 KOps/s 2.6016 KOps/s $\color{#d91a1a}-0.69\%$
test_keys 26.1190μs 3.5762μs 279.6262 KOps/s 288.3073 KOps/s $\color{#d91a1a}-3.01\%$
test_keys_nested 0.2525ms 0.1345ms 7.4346 KOps/s 7.3941 KOps/s $\color{#35bf28}+0.55\%$
test_keys_nested_locked 0.7023ms 0.1395ms 7.1661 KOps/s 7.1053 KOps/s $\color{#35bf28}+0.86\%$
test_keys_nested_leaf 0.2213ms 0.1176ms 8.5000 KOps/s 8.3967 KOps/s $\color{#35bf28}+1.23\%$
test_keys_stack_nested 0.6175ms 0.1364ms 7.3339 KOps/s 7.3978 KOps/s $\color{#d91a1a}-0.86\%$
test_keys_stack_nested_leaf 0.2045ms 0.1174ms 8.5197 KOps/s 8.4471 KOps/s $\color{#35bf28}+0.86\%$
test_keys_stack_nested_locked 0.2699ms 0.1390ms 7.1962 KOps/s 7.0800 KOps/s $\color{#35bf28}+1.64\%$
test_values 5.6806μs 1.0515μs 950.9997 KOps/s 949.3443 KOps/s $\color{#35bf28}+0.17\%$
test_values_nested 0.2954ms 95.4006μs 10.4821 KOps/s 10.7193 KOps/s $\color{#d91a1a}-2.21\%$
test_values_nested_locked 0.1645ms 93.8110μs 10.6597 KOps/s 10.9481 KOps/s $\color{#d91a1a}-2.63\%$
test_values_nested_leaf 0.1451ms 79.4256μs 12.5904 KOps/s 12.1165 KOps/s $\color{#35bf28}+3.91\%$
test_values_stack_nested 0.1716ms 95.5031μs 10.4709 KOps/s 10.7536 KOps/s $\color{#d91a1a}-2.63\%$
test_values_stack_nested_leaf 0.1474ms 80.4712μs 12.4268 KOps/s 12.6380 KOps/s $\color{#d91a1a}-1.67\%$
test_values_stack_nested_locked 0.1838ms 93.5245μs 10.6924 KOps/s 10.8392 KOps/s $\color{#d91a1a}-1.35\%$
test_membership 25.0070μs 0.9702μs 1.0307 MOps/s 1.1572 MOps/s $\textbf{\color{#d91a1a}-10.93\%}$
test_membership_nested 29.3550μs 2.7981μs 357.3891 KOps/s 360.1800 KOps/s $\color{#d91a1a}-0.77\%$
test_membership_nested_leaf 0.1414ms 2.8774μs 347.5415 KOps/s 356.5648 KOps/s $\color{#d91a1a}-2.53\%$
test_membership_stacked_nested 27.7930μs 2.7714μs 360.8296 KOps/s 362.9257 KOps/s $\color{#d91a1a}-0.58\%$
test_membership_stacked_nested_leaf 25.8890μs 2.8242μs 354.0766 KOps/s 359.5121 KOps/s $\color{#d91a1a}-1.51\%$
test_membership_nested_last 28.2030μs 4.1972μs 238.2519 KOps/s 236.1840 KOps/s $\color{#35bf28}+0.88\%$
test_membership_nested_leaf_last 40.8470μs 4.2309μs 236.3575 KOps/s 230.6388 KOps/s $\color{#35bf28}+2.48\%$
test_membership_stacked_nested_last 21.3800μs 4.2352μs 236.1189 KOps/s 238.8099 KOps/s $\color{#d91a1a}-1.13\%$
test_membership_stacked_nested_leaf_last 27.3510μs 4.2675μs 234.3295 KOps/s 236.1292 KOps/s $\color{#d91a1a}-0.76\%$
test_nested_getleaf 0.1437ms 10.6130μs 94.2243 KOps/s 88.5823 KOps/s $\textbf{\color{#35bf28}+6.37\%}$
test_nested_get 34.2440μs 9.9001μs 101.0089 KOps/s 94.5915 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_stacked_getleaf 54.5740μs 10.4278μs 95.8972 KOps/s 90.8843 KOps/s $\textbf{\color{#35bf28}+5.52\%}$
test_stacked_get 50.4050μs 9.9999μs 100.0005 KOps/s 94.8199 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_nested_getitemleaf 58.9600μs 10.9778μs 91.0932 KOps/s 85.9751 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_nested_getitem 52.3080μs 10.2363μs 97.6915 KOps/s 93.2864 KOps/s $\color{#35bf28}+4.72\%$
test_stacked_getitemleaf 43.6820μs 10.8427μs 92.2278 KOps/s 86.8323 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_stacked_getitem 38.0220μs 10.3486μs 96.6315 KOps/s 92.4575 KOps/s $\color{#35bf28}+4.51\%$
test_lock_nested 0.9864ms 0.4998ms 2.0007 KOps/s 1.9279 KOps/s $\color{#35bf28}+3.78\%$
test_lock_stack_nested 0.7085ms 0.4721ms 2.1181 KOps/s 2.0368 KOps/s $\color{#35bf28}+3.99\%$
test_unlock_nested 0.1000s 0.5187ms 1.9280 KOps/s 2.2779 KOps/s $\textbf{\color{#d91a1a}-15.36\%}$
test_unlock_stack_nested 0.7451ms 0.3853ms 2.5953 KOps/s 2.4485 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_flatten_speed 0.1878ms 0.1017ms 9.8354 KOps/s 10.0565 KOps/s $\color{#d91a1a}-2.20\%$
test_unflatten_speed 0.7284ms 0.5068ms 1.9732 KOps/s 1.8968 KOps/s $\color{#35bf28}+4.03\%$
test_common_ops 3.9040ms 1.1559ms 865.1340 Ops/s 849.4647 Ops/s $\color{#35bf28}+1.84\%$
test_creation 18.3650μs 2.0777μs 481.2969 KOps/s 488.6584 KOps/s $\color{#d91a1a}-1.51\%$
test_creation_empty 66.3040μs 20.1563μs 49.6123 KOps/s 51.1045 KOps/s $\color{#d91a1a}-2.92\%$
test_creation_nested_1 83.6350μs 23.2965μs 42.9248 KOps/s 42.9168 KOps/s $\color{#35bf28}+0.02\%$
test_creation_nested_2 1.1630ms 27.6571μs 36.1570 KOps/s 36.8797 KOps/s $\color{#d91a1a}-1.96\%$
test_clone 0.1166ms 16.8798μs 59.2425 KOps/s 55.3604 KOps/s $\textbf{\color{#35bf28}+7.01\%}$
test_getitem[int] 0.8096ms 16.6729μs 59.9777 KOps/s 57.6000 KOps/s $\color{#35bf28}+4.13\%$
test_getitem[slice_int] 0.1920ms 30.0830μs 33.2414 KOps/s 31.4595 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_getitem[range] 0.1705ms 57.7900μs 17.3040 KOps/s 16.5776 KOps/s $\color{#35bf28}+4.38\%$
test_getitem[tuple] 0.1572ms 25.6691μs 38.9573 KOps/s 38.0054 KOps/s $\color{#35bf28}+2.50\%$
test_getitem[list] 0.1741ms 53.0323μs 18.8564 KOps/s 18.0890 KOps/s $\color{#35bf28}+4.24\%$
test_setitem_dim[int] 65.5830μs 31.4647μs 31.7817 KOps/s 30.3288 KOps/s $\color{#35bf28}+4.79\%$
test_setitem_dim[slice_int] 93.9760μs 59.2012μs 16.8916 KOps/s 16.0349 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_setitem_dim[range] 0.1871ms 82.4727μs 12.1252 KOps/s 11.5441 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_setitem_dim[tuple] 90.5500μs 48.0518μs 20.8109 KOps/s 18.8803 KOps/s $\textbf{\color{#35bf28}+10.23\%}$
test_setitem 0.1244ms 30.3768μs 32.9198 KOps/s 30.7936 KOps/s $\textbf{\color{#35bf28}+6.90\%}$
test_set 0.1213ms 29.8320μs 33.5211 KOps/s 31.6533 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_set_shared 1.2555ms 0.2146ms 4.6588 KOps/s 4.4192 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_update 0.1574ms 38.8254μs 25.7563 KOps/s 24.6900 KOps/s $\color{#35bf28}+4.32\%$
test_update_nested 0.4476ms 50.2305μs 19.9082 KOps/s 19.1125 KOps/s $\color{#35bf28}+4.16\%$
test_update__nested 0.1108ms 37.0921μs 26.9599 KOps/s 24.9905 KOps/s $\textbf{\color{#35bf28}+7.88\%}$
test_set_nested 85.0690μs 32.2682μs 30.9902 KOps/s 28.9165 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_set_nested_new 0.1175ms 37.9677μs 26.3382 KOps/s 25.3750 KOps/s $\color{#35bf28}+3.80\%$
test_select 0.1509ms 54.7559μs 18.2629 KOps/s 17.2343 KOps/s $\textbf{\color{#35bf28}+5.97\%}$
test_select_nested 0.1322ms 59.7417μs 16.7387 KOps/s 15.6524 KOps/s $\textbf{\color{#35bf28}+6.94\%}$
test_exclude_nested 0.1510ms 75.6784μs 13.2138 KOps/s 12.5378 KOps/s $\textbf{\color{#35bf28}+5.39\%}$
test_empty[True] 0.6344ms 0.3525ms 2.8366 KOps/s 2.8030 KOps/s $\color{#35bf28}+1.20\%$
test_empty[False] 9.9737μs 1.2265μs 815.2952 KOps/s 709.5573 KOps/s $\textbf{\color{#35bf28}+14.90\%}$
test_unbind_speed 0.4193ms 0.2963ms 3.3751 KOps/s 3.1660 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_unbind_speed_stack0 0.5027ms 0.2965ms 3.3724 KOps/s 3.1883 KOps/s $\textbf{\color{#35bf28}+5.77\%}$
test_unbind_speed_stack1 94.3807ms 0.8324ms 1.2014 KOps/s 1.2650 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_split 95.0244ms 2.1625ms 462.4213 Ops/s 439.0356 Ops/s $\textbf{\color{#35bf28}+5.33\%}$
test_chunk 2.1796ms 1.9507ms 512.6445 Ops/s 435.9829 Ops/s $\textbf{\color{#35bf28}+17.58\%}$
test_creation[device0] 4.2702ms 0.1168ms 8.5639 KOps/s 8.2587 KOps/s $\color{#35bf28}+3.70\%$
test_creation_from_tensor 0.2685ms 0.1146ms 8.7233 KOps/s 8.5394 KOps/s $\color{#35bf28}+2.15\%$
test_add_one[memmap_tensor0] 0.3104ms 6.8681μs 145.6000 KOps/s 132.6507 KOps/s $\textbf{\color{#35bf28}+9.76\%}$
test_contiguous[memmap_tensor0] 21.7910μs 1.9216μs 520.4013 KOps/s 534.6308 KOps/s $\color{#d91a1a}-2.66\%$
test_stack[memmap_tensor0] 51.4060μs 5.3490μs 186.9520 KOps/s 171.4215 KOps/s $\textbf{\color{#35bf28}+9.06\%}$
test_memmaptd_index 0.6385ms 0.4011ms 2.4932 KOps/s 2.3974 KOps/s $\color{#35bf28}+3.99\%$
test_memmaptd_index_astensor 0.8931ms 0.5018ms 1.9927 KOps/s 1.9144 KOps/s $\color{#35bf28}+4.09\%$
test_memmaptd_index_op 1.9263ms 1.0578ms 945.3655 Ops/s 911.1476 Ops/s $\color{#35bf28}+3.76\%$
test_serialize_model 0.2287s 0.1341s 7.4570 Ops/s 8.2624 Ops/s $\textbf{\color{#d91a1a}-9.75\%}$
test_serialize_model_pickle 0.4490s 0.3976s 2.5152 Ops/s 2.5062 Ops/s $\color{#35bf28}+0.36\%$
test_serialize_weights 0.1278s 0.1181s 8.4653 Ops/s 8.7002 Ops/s $\color{#d91a1a}-2.70\%$
test_serialize_weights_returnearly 0.1729s 0.1604s 6.2330 Ops/s 6.3212 Ops/s $\color{#d91a1a}-1.39\%$
test_serialize_weights_pickle 0.5584s 0.4155s 2.4066 Ops/s 1.0879 Ops/s $\textbf{\color{#35bf28}+121.21\%}$
test_serialize_weights_filesystem 0.2345s 0.1565s 6.3890 Ops/s 7.0338 Ops/s $\textbf{\color{#d91a1a}-9.17\%}$
test_serialize_model_filesystem 0.1644s 0.1516s 6.5971 Ops/s 7.0658 Ops/s $\textbf{\color{#d91a1a}-6.63\%}$
test_reshape_pytree 87.2830μs 38.8832μs 25.7181 KOps/s 24.6251 KOps/s $\color{#35bf28}+4.44\%$
test_reshape_td 96.8320μs 45.1612μs 22.1429 KOps/s 20.3950 KOps/s $\textbf{\color{#35bf28}+8.57\%}$
test_view_pytree 92.8740μs 38.7193μs 25.8269 KOps/s 25.0866 KOps/s $\color{#35bf28}+2.95\%$
test_view_td 0.1394ms 50.6637μs 19.7380 KOps/s 18.2163 KOps/s $\textbf{\color{#35bf28}+8.35\%}$
test_unbind_pytree 0.1530ms 35.8123μs 27.9233 KOps/s 27.3047 KOps/s $\color{#35bf28}+2.27\%$
test_unbind_td 0.2914ms 44.5252μs 22.4592 KOps/s 20.7565 KOps/s $\textbf{\color{#35bf28}+8.20\%}$
test_split_pytree 79.0980μs 37.8468μs 26.4223 KOps/s 25.2338 KOps/s $\color{#35bf28}+4.71\%$
test_split_td 0.5052ms 56.9364μs 17.5635 KOps/s 16.5672 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_add_pytree 0.1004ms 43.1381μs 23.1814 KOps/s 20.9568 KOps/s $\textbf{\color{#35bf28}+10.62\%}$
test_add_td 0.2247ms 86.4150μs 11.5721 KOps/s 10.8633 KOps/s $\textbf{\color{#35bf28}+6.52\%}$
test_compile_add_one_nested[tensordict-compile] 0.1205ms 57.9576μs 17.2540 KOps/s 16.7274 KOps/s $\color{#35bf28}+3.15\%$
test_compile_add_one_nested[tensordict-eager] 0.2620ms 0.1919ms 5.2105 KOps/s 4.9869 KOps/s $\color{#35bf28}+4.48\%$
test_compile_add_one_nested[pytree-compile] 0.1147ms 56.9487μs 17.5597 KOps/s 17.5924 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_add_one_nested[pytree-eager] 0.3702ms 0.1383ms 7.2294 KOps/s 6.8930 KOps/s $\color{#35bf28}+4.88\%$
test_compile_copy_nested[tensordict-compile] 56.9870μs 23.8945μs 41.8506 KOps/s 43.7908 KOps/s $\color{#d91a1a}-4.43\%$
test_compile_copy_nested[tensordict-eager] 0.1835ms 73.4641μs 13.6121 KOps/s 12.9596 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_compile_copy_nested[pytree-compile] 0.1324ms 75.3356μs 13.2739 KOps/s 13.0936 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_nested[pytree-eager] 0.1215ms 69.0396μs 14.4844 KOps/s 14.4025 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_flat[tensordict-compile] 0.3690ms 0.1801ms 5.5540 KOps/s 5.4429 KOps/s $\color{#35bf28}+2.04\%$
test_compile_add_one_flat[tensordict-eager] 0.3072ms 0.2339ms 4.2756 KOps/s 4.1708 KOps/s $\color{#35bf28}+2.51\%$
test_compile_add_one_flat[tensorclass-compile] 0.1531ms 46.3904μs 21.5562 KOps/s 20.6102 KOps/s $\color{#35bf28}+4.59\%$
test_compile_add_one_flat[tensorclass-eager] 0.1705ms 74.5722μs 13.4098 KOps/s 12.8939 KOps/s $\color{#35bf28}+4.00\%$
test_compile_add_one_flat[pytree-compile] 0.3636ms 0.1733ms 5.7701 KOps/s 5.6415 KOps/s $\color{#35bf28}+2.28\%$
test_compile_add_one_flat[pytree-eager] 0.3807ms 0.2806ms 3.5643 KOps/s 3.3845 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_compile_add_self_flat[tensordict-eager] 0.5575ms 0.2701ms 3.7024 KOps/s 3.6156 KOps/s $\color{#35bf28}+2.40\%$
test_compile_add_self_flat[tensordict-compile] 0.3656ms 0.1902ms 5.2568 KOps/s 5.3851 KOps/s $\color{#d91a1a}-2.38\%$
test_compile_add_self_flat[tensorclass-eager] 0.1482ms 71.9673μs 13.8952 KOps/s 13.3697 KOps/s $\color{#35bf28}+3.93\%$
test_compile_add_self_flat[tensorclass-compile] 0.1117ms 48.8302μs 20.4791 KOps/s 19.9970 KOps/s $\color{#35bf28}+2.41\%$
test_compile_add_self_flat[pytree-eager] 0.4608ms 0.2329ms 4.2942 KOps/s 4.1754 KOps/s $\color{#35bf28}+2.85\%$
test_compile_add_self_flat[pytree-compile] 0.3997ms 0.1737ms 5.7579 KOps/s 5.6611 KOps/s $\color{#35bf28}+1.71\%$
test_compile_copy_flat[tensordict-compile] 0.2588ms 0.1124ms 8.8932 KOps/s 8.5377 KOps/s $\color{#35bf28}+4.16\%$
test_compile_copy_flat[tensordict-eager] 0.1572ms 77.6944μs 12.8709 KOps/s 12.8352 KOps/s $\color{#35bf28}+0.28\%$
test_compile_copy_flat[pytree-compile] 0.2013ms 77.4166μs 12.9171 KOps/s 12.7981 KOps/s $\color{#35bf28}+0.93\%$
test_compile_copy_flat[pytree-eager] 0.1400ms 70.8953μs 14.1053 KOps/s 14.3171 KOps/s $\color{#d91a1a}-1.48\%$
test_compile_assign_and_add[tensordict-compile] 0.4223ms 0.1928ms 5.1856 KOps/s 5.0580 KOps/s $\color{#35bf28}+2.52\%$
test_compile_assign_and_add[tensordict-eager] 1.9143ms 1.6980ms 588.9114 Ops/s 565.4061 Ops/s $\color{#35bf28}+4.16\%$
test_compile_assign_and_add[pytree-compile] 0.2815ms 0.1919ms 5.2104 KOps/s 5.1506 KOps/s $\color{#35bf28}+1.16\%$
test_compile_assign_and_add[pytree-eager] 2.2536ms 1.0809ms 925.1515 Ops/s 878.1216 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_compile_assign_and_add_stack[compile] 0.7266ms 0.4223ms 2.3678 KOps/s 2.3678 KOps/s $-0.00\%$
test_compile_assign_and_add_stack[eager] 5.9074ms 4.0609ms 246.2482 Ops/s 243.0277 Ops/s $\color{#35bf28}+1.33\%$
test_compile_indexing[tensor-tensordict-compile] 0.1135ms 34.0722μs 29.3495 KOps/s 28.1804 KOps/s $\color{#35bf28}+4.15\%$
test_compile_indexing[tensor-tensordict-eager] 1.4604ms 47.8963μs 20.8785 KOps/s 19.8234 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_compile_indexing[tensor-tensorclass-compile] 79.4790μs 29.7533μs 33.6097 KOps/s 32.4803 KOps/s $\color{#35bf28}+3.48\%$
test_compile_indexing[tensor-tensorclass-eager] 93.8260μs 29.3939μs 34.0206 KOps/s 33.2448 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-pytree-compile] 86.1110μs 29.6962μs 33.6743 KOps/s 32.6041 KOps/s $\color{#35bf28}+3.28\%$
test_compile_indexing[tensor-pytree-eager] 73.7280μs 29.3438μs 34.0788 KOps/s 33.2632 KOps/s $\color{#35bf28}+2.45\%$
test_compile_indexing[slice-tensordict-compile] 0.1576ms 73.9741μs 13.5182 KOps/s 12.8876 KOps/s $\color{#35bf28}+4.89\%$
test_compile_indexing[slice-tensordict-eager] 0.5132ms 26.9670μs 37.0823 KOps/s 34.5049 KOps/s $\textbf{\color{#35bf28}+7.47\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1376ms 68.7079μs 14.5544 KOps/s 14.2884 KOps/s $\color{#35bf28}+1.86\%$
test_compile_indexing[slice-tensorclass-eager] 69.3290μs 23.2875μs 42.9414 KOps/s 41.6290 KOps/s $\color{#35bf28}+3.15\%$
test_compile_indexing[slice-pytree-compile] 0.1379ms 69.2488μs 14.4407 KOps/s 14.3773 KOps/s $\color{#35bf28}+0.44\%$
test_compile_indexing[slice-pytree-eager] 69.3500μs 23.1192μs 43.2541 KOps/s 41.1644 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_indexing[int-tensordict-compile] 0.1512ms 73.8986μs 13.5321 KOps/s 13.4313 KOps/s $\color{#35bf28}+0.75\%$
test_compile_indexing[int-tensordict-eager] 0.8440ms 26.9164μs 37.1521 KOps/s 34.7144 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_compile_indexing[int-tensorclass-compile] 0.1659ms 68.8990μs 14.5140 KOps/s 14.5626 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_indexing[int-tensorclass-eager] 0.2892ms 22.8269μs 43.8080 KOps/s 42.5100 KOps/s $\color{#35bf28}+3.05\%$
test_compile_indexing[int-pytree-compile] 0.1398ms 69.0616μs 14.4798 KOps/s 14.5712 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_indexing[int-pytree-eager] 77.2940μs 22.9215μs 43.6271 KOps/s 41.8979 KOps/s $\color{#35bf28}+4.13\%$
test_mod_add[eager] 0.1152ms 25.4514μs 39.2905 KOps/s 39.0850 KOps/s $\color{#35bf28}+0.53\%$
test_mod_add[compile] 91.6120μs 38.6160μs 25.8960 KOps/s 25.5926 KOps/s $\color{#35bf28}+1.19\%$
test_mod_add[compile-overhead] 98.2140μs 38.8274μs 25.7550 KOps/s 25.4751 KOps/s $\color{#35bf28}+1.10\%$
test_mod_wrap[eager] 0.3474ms 0.2030ms 4.9261 KOps/s 4.6852 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_mod_wrap[compile] 0.4443ms 0.2243ms 4.4584 KOps/s 4.1913 KOps/s $\textbf{\color{#35bf28}+6.37\%}$
test_mod_wrap[compile-overhead] 0.3617ms 0.2224ms 4.4956 KOps/s 4.2126 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_mod_wrap_and_backward[eager] 12.2967ms 10.8835ms 91.8824 Ops/s 88.5985 Ops/s $\color{#35bf28}+3.71\%$
test_mod_wrap_and_backward[compile] 12.3223ms 10.8155ms 92.4599 Ops/s 83.9952 Ops/s $\textbf{\color{#35bf28}+10.08\%}$
test_mod_wrap_and_backward[compile-overhead] 12.4372ms 10.9018ms 91.7280 Ops/s 81.3399 Ops/s $\textbf{\color{#35bf28}+12.77\%}$
test_seq_add[eager] 0.2065ms 90.2388μs 11.0817 KOps/s 10.6327 KOps/s $\color{#35bf28}+4.22\%$
test_seq_add[compile] 0.1345ms 65.0796μs 15.3658 KOps/s 14.9145 KOps/s $\color{#35bf28}+3.03\%$
test_seq_add[compile-overhead] 0.2802ms 63.1141μs 15.8443 KOps/s 15.3937 KOps/s $\color{#35bf28}+2.93\%$
test_seq_wrap[eager] 0.6623ms 0.3837ms 2.6063 KOps/s 2.5095 KOps/s $\color{#35bf28}+3.86\%$
test_seq_wrap[compile] 1.3074ms 0.2632ms 3.7991 KOps/s 3.6272 KOps/s $\color{#35bf28}+4.74\%$
test_seq_wrap[compile-overhead] 1.3440ms 0.2621ms 3.8152 KOps/s 3.6443 KOps/s $\color{#35bf28}+4.69\%$
test_func_call_runtime[False-eager] 0.5995ms 0.4871ms 2.0531 KOps/s 1.8655 KOps/s $\textbf{\color{#35bf28}+10.06\%}$
test_func_call_runtime[False-compile] 0.6339ms 0.4841ms 2.0657 KOps/s 1.9431 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_func_call_runtime[False-compile-overhead] 1.2356ms 0.4913ms 2.0356 KOps/s 1.9485 KOps/s $\color{#35bf28}+4.47\%$
test_func_call_runtime[True-eager] 0.8172ms 0.7070ms 1.4145 KOps/s 1.2986 KOps/s $\textbf{\color{#35bf28}+8.93\%}$
test_func_call_runtime[True-compile] 1.0531ms 0.5011ms 1.9956 KOps/s 1.9153 KOps/s $\color{#35bf28}+4.19\%$
test_func_call_runtime[True-compile-overhead] 0.8414ms 0.5009ms 1.9965 KOps/s 1.9077 KOps/s $\color{#35bf28}+4.65\%$
test_func_call_cm_runtime[False-eager] 0.8321ms 0.4909ms 2.0371 KOps/s 1.8427 KOps/s $\textbf{\color{#35bf28}+10.55\%}$
test_func_call_cm_runtime[False-compile] 0.8921ms 0.4885ms 2.0472 KOps/s 1.9506 KOps/s $\color{#35bf28}+4.95\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9383ms 0.4868ms 2.0541 KOps/s 1.9455 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_func_call_cm_runtime[True-eager] 1.0819ms 0.8616ms 1.1606 KOps/s 1.0967 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_func_call_cm_runtime[True-compile] 0.8381ms 0.7097ms 1.4090 KOps/s 1.3225 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_func_call_cm_runtime[True-compile-overhead] 0.8575ms 0.7079ms 1.4127 KOps/s 1.3218 KOps/s $\textbf{\color{#35bf28}+6.88\%}$
test_vmap_func_call_cm_runtime[eager] 2.4854ms 1.8641ms 536.4592 Ops/s 512.8263 Ops/s $\color{#35bf28}+4.61\%$
test_vmap_func_call_cm_runtime[compile] 2.7762ms 1.9511ms 512.5311 Ops/s 496.1255 Ops/s $\color{#35bf28}+3.31\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.7719ms 1.9243ms 519.6738 Ops/s 493.2836 Ops/s $\textbf{\color{#35bf28}+5.35\%}$
test_distributed 0.2821ms 0.1238ms 8.0781 KOps/s 7.8254 KOps/s $\color{#35bf28}+3.23\%$
test_tdmodule 0.1189ms 18.6799μs 53.5336 KOps/s 53.0995 KOps/s $\color{#35bf28}+0.82\%$
test_tdmodule_dispatch 65.5530μs 36.9647μs 27.0529 KOps/s 26.3639 KOps/s $\color{#35bf28}+2.61\%$
test_tdseq 49.2420μs 21.5577μs 46.3871 KOps/s 45.1439 KOps/s $\color{#35bf28}+2.75\%$
test_tdseq_dispatch 67.4360μs 43.0471μs 23.2304 KOps/s 22.2179 KOps/s $\color{#35bf28}+4.56\%$
test_instantiation_functorch 1.7668ms 1.5212ms 657.3773 Ops/s 627.0977 Ops/s $\color{#35bf28}+4.83\%$
test_instantiation_td 2.0159ms 1.1599ms 862.1517 Ops/s 842.3335 Ops/s $\color{#35bf28}+2.35\%$
test_exec_functorch 0.4105ms 0.1810ms 5.5244 KOps/s 5.3047 KOps/s $\color{#35bf28}+4.14\%$
test_exec_functional_call 0.3366ms 0.1663ms 6.0137 KOps/s 5.6172 KOps/s $\textbf{\color{#35bf28}+7.06\%}$
test_exec_td 0.3830ms 0.1912ms 5.2307 KOps/s 4.7037 KOps/s $\textbf{\color{#35bf28}+11.20\%}$
test_exec_td_decorator 1.2647ms 0.2256ms 4.4319 KOps/s 4.1882 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_vmap_mlp_speed[True-True] 0.9840ms 0.6740ms 1.4838 KOps/s 1.4200 KOps/s $\color{#35bf28}+4.49\%$
test_vmap_mlp_speed[True-False] 1.5970ms 0.7610ms 1.3140 KOps/s 1.4254 KOps/s $\textbf{\color{#d91a1a}-7.81\%}$
test_vmap_mlp_speed[False-True] 8.6105ms 0.5719ms 1.7486 KOps/s 1.7864 KOps/s $\color{#d91a1a}-2.12\%$
test_vmap_mlp_speed[False-False] 0.9102ms 0.5264ms 1.8998 KOps/s 1.8217 KOps/s $\color{#35bf28}+4.29\%$
test_vmap_mlp_speed_decorator[True-True] 0.9924ms 0.6310ms 1.5848 KOps/s 1.5159 KOps/s $\color{#35bf28}+4.54\%$
test_vmap_mlp_speed_decorator[True-False] 1.1695ms 0.6287ms 1.5906 KOps/s 1.5107 KOps/s $\textbf{\color{#35bf28}+5.29\%}$
test_vmap_mlp_speed_decorator[False-True] 0.7843ms 0.5159ms 1.9383 KOps/s 1.8443 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7446ms 0.5164ms 1.9364 KOps/s 1.8470 KOps/s $\color{#35bf28}+4.84\%$
test_to_module_speed[True] 2.3256ms 1.4106ms 708.9163 Ops/s 699.1890 Ops/s $\color{#35bf28}+1.39\%$
test_to_module_speed[False] 1.9047ms 1.3624ms 733.9848 Ops/s 714.3055 Ops/s $\color{#35bf28}+2.76\%$
test_tc_init 0.1325ms 48.1454μs 20.7704 KOps/s 20.4392 KOps/s $\color{#35bf28}+1.62\%$
test_tc_init_nested 0.1728ms 95.6428μs 10.4556 KOps/s 10.3447 KOps/s $\color{#35bf28}+1.07\%$
test_tc_first_layer_tensor 39.7950μs 1.4890μs 671.5709 KOps/s 642.6344 KOps/s $\color{#35bf28}+4.50\%$
test_tc_first_layer_nontensor 37.1990μs 4.6538μs 214.8788 KOps/s 214.0217 KOps/s $\color{#35bf28}+0.40\%$
test_tc_second_layer_tensor 23.2830μs 2.7627μs 361.9708 KOps/s 350.8080 KOps/s $\color{#35bf28}+3.18\%$
test_tc_second_layer_nontensor 39.8750μs 5.8090μs 172.1476 KOps/s 164.5517 KOps/s $\color{#35bf28}+4.62\%$
test_unbind 0.5008s 14.0735ms 71.0556 Ops/s 72.5494 Ops/s $\color{#d91a1a}-2.06\%$
test_full_like 10.0488ms 8.5033ms 117.6015 Ops/s 127.4186 Ops/s $\textbf{\color{#d91a1a}-7.70\%}$
test_zeros_like 3.5584ms 2.8977ms 345.1006 Ops/s 340.6211 Ops/s $\color{#35bf28}+1.32\%$
test_ones_like 4.4911ms 3.5654ms 280.4728 Ops/s 285.2310 Ops/s $\color{#d91a1a}-1.67\%$
test_clone 7.2982ms 5.6991ms 175.4668 Ops/s 191.6563 Ops/s $\textbf{\color{#d91a1a}-8.45\%}$
test_squeeze 86.6720μs 12.1880μs 82.0476 KOps/s 78.7862 KOps/s $\color{#35bf28}+4.14\%$
test_unsqueeze 0.1588ms 89.2467μs 11.2049 KOps/s 10.5569 KOps/s $\textbf{\color{#35bf28}+6.14\%}$
test_split 0.3954ms 0.1909ms 5.2384 KOps/s 4.9394 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_permute 0.3872ms 0.2154ms 4.6429 KOps/s 4.4424 KOps/s $\color{#35bf28}+4.51\%$
test_stack 28.6764ms 26.4943ms 37.7440 Ops/s 39.7300 Ops/s $\color{#d91a1a}-5.00\%$
test_cat 36.7037ms 26.1818ms 38.1945 Ops/s 40.1681 Ops/s $\color{#d91a1a}-4.91\%$
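
For post-processing the rows above: each row appears to follow the layout `name max mean ops unit head-ops unit change`, and the Ops value matches 1 / Mean once the time units are normalized to seconds. The following is a minimal sketch under those assumptions. `parse_time` and `parse_row` are hypothetical helpers, not part of the benchmark tooling.

```python
# Hypothetical helpers for reading one flattened benchmark row.
# Assumption: fields are space-separated and test names contain no spaces.
# Both the Greek mu and the micro sign are accepted for microseconds.
TIME_UNITS = {"ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "s": 1.0}
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_time(text: str) -> float:
    """Convert a string such as '0.3809ms' to seconds."""
    # "s" comes last in TIME_UNITS so that "ms" and the microsecond
    # suffixes are matched before the bare seconds suffix.
    for unit, scale in TIME_UNITS.items():
        if text.endswith(unit):
            return float(text[: -len(unit)]) * scale
    raise ValueError(f"unknown time unit in {text!r}")


def parse_row(line: str) -> dict:
    """Split a row into name, timings in seconds, and throughput in ops/s."""
    name, max_t, mean_t, ops, ops_unit, head, head_unit, _change = line.rsplit(" ", 7)
    return {
        "name": name,
        "max_s": parse_time(max_t),
        "mean_s": parse_time(mean_t),
        "ops": float(ops) * OPS_UNITS[ops_unit],
        "ops_head": float(head) * OPS_UNITS[head_unit],
    }


row = parse_row(
    r"test_plain_set_nested 53.3800μs 25.0200μs 39.9681 KOps/s "
    r"39.4212 KOps/s $\color{#35bf28}+1.39\%$"
)
# Throughput implied by the mean time, for comparison with row["ops"].
ops_from_mean = 1.0 / row["mean_s"]
```

Normalizing units first matters because the table mixes μs, ms, and s in the timing columns and Ops/s, KOps/s, and MOps/s in the throughput columns.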

github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}8$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 94.3450μs 18.1591μs 55.0687 KOps/s 56.5595 KOps/s $\color{#d91a1a}-2.64\%$
test_plain_set_stack_nested 46.4320μs 18.3920μs 54.3716 KOps/s 56.0996 KOps/s $\color{#d91a1a}-3.08\%$
test_plain_set_nested_inplace 59.9230μs 19.4651μs 51.3740 KOps/s 52.5217 KOps/s $\color{#d91a1a}-2.19\%$
test_plain_set_stack_nested_inplace 54.3030μs 19.5858μs 51.0573 KOps/s 52.3912 KOps/s $\color{#d91a1a}-2.55\%$
test_items 25.9510μs 2.8387μs 352.2749 KOps/s 351.5825 KOps/s $\color{#35bf28}+0.20\%$
test_items_nested 0.4176ms 0.3398ms 2.9426 KOps/s 2.9339 KOps/s $\color{#35bf28}+0.30\%$
test_items_nested_locked 0.3750ms 0.3378ms 2.9606 KOps/s 2.9312 KOps/s $\color{#35bf28}+1.00\%$
test_items_nested_leaf 86.0550μs 62.3385μs 16.0415 KOps/s 15.9361 KOps/s $\color{#35bf28}+0.66\%$
test_items_stack_nested 0.4456ms 0.3444ms 2.9033 KOps/s 2.9351 KOps/s $\color{#d91a1a}-1.08\%$
test_items_stack_nested_leaf 98.8750μs 64.4393μs 15.5185 KOps/s 15.4247 KOps/s $\color{#35bf28}+0.61\%$
test_items_stack_nested_locked 0.3696ms 0.3414ms 2.9292 KOps/s 2.9386 KOps/s $\color{#d91a1a}-0.32\%$
test_keys 28.0420μs 3.3976μs 294.3222 KOps/s 292.7498 KOps/s $\color{#35bf28}+0.54\%$
test_keys_nested 95.1550μs 71.3099μs 14.0233 KOps/s 14.0728 KOps/s $\color{#d91a1a}-0.35\%$
test_keys_nested_locked 2.4709ms 76.8099μs 13.0192 KOps/s 12.8970 KOps/s $\color{#35bf28}+0.95\%$
test_keys_nested_leaf 92.7750μs 62.1053μs 16.1017 KOps/s 16.0817 KOps/s $\color{#35bf28}+0.12\%$
test_keys_stack_nested 0.1034ms 71.6412μs 13.9585 KOps/s 14.0114 KOps/s $\color{#d91a1a}-0.38\%$
test_keys_stack_nested_leaf 89.7340μs 63.2744μs 15.8042 KOps/s 15.8235 KOps/s $\color{#d91a1a}-0.12\%$
test_keys_stack_nested_locked 0.1109ms 78.0420μs 12.8136 KOps/s 12.8390 KOps/s $\color{#d91a1a}-0.20\%$
test_values 5.6087μs 0.8608μs 1.1617 MOps/s 1.1919 MOps/s $\color{#d91a1a}-2.53\%$
test_values_nested 76.1340μs 49.0730μs 20.3778 KOps/s 20.5651 KOps/s $\color{#d91a1a}-0.91\%$
test_values_nested_locked 83.7840μs 50.6829μs 19.7305 KOps/s 20.0726 KOps/s $\color{#d91a1a}-1.70\%$
test_values_nested_leaf 74.3840μs 42.3151μs 23.6322 KOps/s 23.4391 KOps/s $\color{#35bf28}+0.82\%$
test_values_stack_nested 95.6450μs 50.1254μs 19.9500 KOps/s 20.2612 KOps/s $\color{#d91a1a}-1.54\%$
test_values_stack_nested_leaf 74.7440μs 44.0773μs 22.6874 KOps/s 23.0130 KOps/s $\color{#d91a1a}-1.41\%$
test_values_stack_nested_locked 80.1040μs 51.4687μs 19.4293 KOps/s 19.5549 KOps/s $\color{#d91a1a}-0.64\%$
test_membership 2.2286μs 0.4984μs 2.0063 MOps/s 1.9756 MOps/s $\color{#35bf28}+1.55\%$
test_membership_nested 40.9820μs 1.9569μs 511.0035 KOps/s 526.3374 KOps/s $\color{#d91a1a}-2.91\%$
test_membership_nested_leaf 17.7210μs 1.8577μs 538.2895 KOps/s 523.1507 KOps/s $\color{#35bf28}+2.89\%$
test_membership_stacked_nested 41.2420μs 1.9477μs 513.4361 KOps/s 508.1804 KOps/s $\color{#35bf28}+1.03\%$
test_membership_stacked_nested_leaf 23.2110μs 1.9548μs 511.5743 KOps/s 509.0857 KOps/s $\color{#35bf28}+0.49\%$
test_membership_nested_last 24.8910μs 2.9795μs 335.6225 KOps/s 337.5107 KOps/s $\color{#d91a1a}-0.56\%$
test_membership_nested_leaf_last 23.2610μs 2.9782μs 335.7756 KOps/s 334.5817 KOps/s $\color{#35bf28}+0.36\%$
test_membership_stacked_nested_last 32.7120μs 3.4994μs 285.7592 KOps/s 327.0336 KOps/s $\textbf{\color{#d91a1a}-12.62\%}$
test_membership_stacked_nested_leaf_last 41.7120μs 3.4773μs 287.5811 KOps/s 332.3641 KOps/s $\textbf{\color{#d91a1a}-13.47\%}$
test_nested_getleaf 28.0210μs 6.1106μs 163.6506 KOps/s 162.9005 KOps/s $\color{#35bf28}+0.46\%$
test_nested_get 29.8920μs 5.7934μs 172.6094 KOps/s 172.1643 KOps/s $\color{#35bf28}+0.26\%$
test_stacked_getleaf 25.9610μs 6.0576μs 165.0821 KOps/s 164.9229 KOps/s $\color{#35bf28}+0.10\%$
test_stacked_get 36.0920μs 5.7910μs 172.6809 KOps/s 174.8111 KOps/s $\color{#d91a1a}-1.22\%$
test_nested_getitemleaf 32.2320μs 6.1508μs 162.5811 KOps/s 162.2485 KOps/s $\color{#35bf28}+0.21\%$
test_nested_getitem 26.2210μs 5.7927μs 172.6304 KOps/s 172.5229 KOps/s $\color{#35bf28}+0.06\%$
test_stacked_getitemleaf 35.6620μs 6.1091μs 163.6907 KOps/s 163.3225 KOps/s $\color{#35bf28}+0.23\%$
test_stacked_getitem 21.7110μs 5.6611μs 176.6445 KOps/s 175.9428 KOps/s $\color{#35bf28}+0.40\%$
test_lock_nested 4.5354ms 0.4401ms 2.2725 KOps/s 2.2632 KOps/s $\color{#35bf28}+0.41\%$
test_lock_stack_nested 0.4460ms 0.3968ms 2.5203 KOps/s 2.5464 KOps/s $\color{#d91a1a}-1.03\%$
test_unlock_nested 0.7782ms 0.3715ms 2.6915 KOps/s 2.6520 KOps/s $\color{#35bf28}+1.49\%$
test_unlock_stack_nested 0.3671ms 0.3328ms 3.0053 KOps/s 3.0223 KOps/s $\color{#d91a1a}-0.56\%$
test_flatten_speed 0.1125ms 76.9215μs 13.0003 KOps/s 13.0492 KOps/s $\color{#d91a1a}-0.37\%$
test_unflatten_speed 0.3834ms 0.3300ms 3.0300 KOps/s 3.0577 KOps/s $\color{#d91a1a}-0.91\%$
test_common_ops 1.6554ms 1.3381ms 747.3095 Ops/s 754.6756 Ops/s $\color{#d91a1a}-0.98\%$
test_creation 21.7910μs 1.4719μs 679.4060 KOps/s 678.8851 KOps/s $\color{#35bf28}+0.08\%$
test_creation_empty 51.4430μs 18.0578μs 55.3776 KOps/s 57.5736 KOps/s $\color{#d91a1a}-3.81\%$
test_creation_nested_1 45.6520μs 20.3602μs 49.1155 KOps/s 51.1718 KOps/s $\color{#d91a1a}-4.02\%$
test_creation_nested_2 83.1340μs 21.9056μs 45.6503 KOps/s 45.4121 KOps/s $\color{#35bf28}+0.52\%$
test_clone 57.8430μs 29.1474μs 34.3083 KOps/s 33.4741 KOps/s $\color{#35bf28}+2.49\%$
test_getitem[int] 1.3189ms 16.0798μs 62.1899 KOps/s 59.5154 KOps/s $\color{#35bf28}+4.49\%$
test_getitem[slice_int] 0.1335ms 28.2006μs 35.4603 KOps/s 34.5089 KOps/s $\color{#35bf28}+2.76\%$
test_getitem[range] 0.1476ms 0.1088ms 9.1894 KOps/s 8.8480 KOps/s $\color{#35bf28}+3.86\%$
test_getitem[tuple] 0.1313ms 24.3680μs 41.0374 KOps/s 41.2852 KOps/s $\color{#d91a1a}-0.60\%$
test_getitem[list] 0.2016ms 0.1000ms 9.9960 KOps/s 9.8716 KOps/s $\color{#35bf28}+1.26\%$
test_setitem_dim[int] 0.1270ms 44.6685μs 22.3871 KOps/s 21.6715 KOps/s $\color{#35bf28}+3.30\%$
test_setitem_dim[slice_int] 91.5440μs 67.8826μs 14.7313 KOps/s 14.5130 KOps/s $\color{#35bf28}+1.50\%$
test_setitem_dim[range] 0.1695ms 0.1282ms 7.7973 KOps/s 7.6476 KOps/s $\color{#35bf28}+1.96\%$
test_setitem_dim[tuple] 86.8240μs 61.2655μs 16.3224 KOps/s 16.1791 KOps/s $\color{#35bf28}+0.89\%$
test_setitem 75.8140μs 44.6410μs 22.4009 KOps/s 22.4391 KOps/s $\color{#d91a1a}-0.17\%$
test_set 0.1095ms 44.0619μs 22.6954 KOps/s 23.0076 KOps/s $\color{#d91a1a}-1.36\%$
test_set_shared 0.3659ms 54.9636μs 18.1939 KOps/s 18.0586 KOps/s $\color{#35bf28}+0.75\%$
test_update 0.1107ms 54.0859μs 18.4891 KOps/s 18.7140 KOps/s $\color{#d91a1a}-1.20\%$
test_update_nested 0.1093ms 60.4428μs 16.5446 KOps/s 16.4544 KOps/s $\color{#35bf28}+0.55\%$
test_update__nested 99.7050μs 60.4492μs 16.5428 KOps/s 15.6716 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_set_nested 86.2940μs 46.4746μs 21.5172 KOps/s 21.8484 KOps/s $\color{#d91a1a}-1.52\%$
test_set_nested_new 88.9850μs 49.5548μs 20.1797 KOps/s 20.1936 KOps/s $\color{#d91a1a}-0.07\%$
test_select 0.5015ms 63.3523μs 15.7847 KOps/s 15.6813 KOps/s $\color{#35bf28}+0.66\%$
test_select_nested 78.7530μs 41.7681μs 23.9417 KOps/s 23.7203 KOps/s $\color{#35bf28}+0.93\%$
test_exclude_nested 99.4050μs 59.4546μs 16.8196 KOps/s 16.8494 KOps/s $\color{#d91a1a}-0.18\%$
test_empty[True] 0.2880ms 0.2580ms 3.8761 KOps/s 3.7804 KOps/s $\color{#35bf28}+2.53\%$
test_empty[False] 3.1731μs 0.7392μs 1.3527 MOps/s 1.3422 MOps/s $\color{#35bf28}+0.79\%$
test_to 54.8830μs 26.4904μs 37.7495 KOps/s 36.9746 KOps/s $\color{#35bf28}+2.10\%$
test_to_nonblocking 60.8430μs 24.9973μs 40.0043 KOps/s 38.5780 KOps/s $\color{#35bf28}+3.70\%$
test_unbind_speed 1.5697ms 0.2899ms 3.4499 KOps/s 3.4533 KOps/s $\color{#d91a1a}-0.10\%$
test_unbind_speed_stack0 0.3464ms 0.2876ms 3.4775 KOps/s 3.5343 KOps/s $\color{#d91a1a}-1.61\%$
test_unbind_speed_stack1 91.9298ms 0.7287ms 1.3723 KOps/s 1.4047 KOps/s $\color{#d91a1a}-2.30\%$
test_split 94.0950ms 2.2010ms 454.3339 Ops/s 434.6510 Ops/s $\color{#35bf28}+4.53\%$
test_chunk 94.3049ms 2.1908ms 456.4471 Ops/s 432.6757 Ops/s $\textbf{\color{#35bf28}+5.49\%}$
test_creation[device0] 0.3298ms 0.1270ms 7.8759 KOps/s 7.7971 KOps/s $\color{#35bf28}+1.01\%$
test_creation_from_tensor 0.3990ms 0.1290ms 7.7521 KOps/s 7.6330 KOps/s $\color{#35bf28}+1.56\%$
test_add_one[memmap_tensor0] 0.2750ms 8.8817μs 112.5910 KOps/s 110.4297 KOps/s $\color{#35bf28}+1.96\%$
test_contiguous[memmap_tensor0] 23.1810μs 2.2307μs 448.2820 KOps/s 448.2500 KOps/s $+0.01\%$
test_stack[memmap_tensor0] 35.6320μs 6.6475μs 150.4324 KOps/s 142.9378 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_memmaptd_index 1.1234ms 0.4281ms 2.3361 KOps/s 2.2466 KOps/s $\color{#35bf28}+3.98\%$
test_memmaptd_index_astensor 0.7708ms 0.5033ms 1.9869 KOps/s 1.9690 KOps/s $\color{#35bf28}+0.91\%$
test_memmaptd_index_op 1.4717ms 1.0747ms 930.5216 Ops/s 921.3406 Ops/s $\color{#35bf28}+1.00\%$
test_serialize_model 0.1312s 0.1309s 7.6410 Ops/s 7.6919 Ops/s $\color{#d91a1a}-0.66\%$
test_serialize_model_pickle 1.3475s 1.2122s 0.8249 Ops/s 0.8206 Ops/s $\color{#35bf28}+0.52\%$
test_serialize_weights 0.2221s 0.1427s 7.0071 Ops/s 7.7429 Ops/s $\textbf{\color{#d91a1a}-9.50\%}$
test_serialize_weights_returnearly 0.2191s 56.0352ms 17.8459 Ops/s 17.5022 Ops/s $\color{#35bf28}+1.96\%$
test_serialize_weights_pickle 1.3730s 1.2167s 0.8219 Ops/s 0.8220 Ops/s $\color{#d91a1a}-0.01\%$
test_reshape_pytree 87.2940μs 35.3559μs 28.2838 KOps/s 27.0274 KOps/s $\color{#35bf28}+4.65\%$
test_reshape_td 90.3240μs 42.2968μs 23.6425 KOps/s 23.6769 KOps/s $\color{#d91a1a}-0.15\%$
test_view_pytree 81.2040μs 36.0725μs 27.7220 KOps/s 27.4229 KOps/s $\color{#35bf28}+1.09\%$
test_view_td 86.5450μs 46.1661μs 21.6609 KOps/s 20.7161 KOps/s $\color{#35bf28}+4.56\%$
test_unbind_pytree 75.9230μs 35.2578μs 28.3625 KOps/s 28.4316 KOps/s $\color{#d91a1a}-0.24\%$
test_unbind_td 0.5239ms 43.9439μs 22.7563 KOps/s 22.8208 KOps/s $\color{#d91a1a}-0.28\%$
test_split_pytree 93.5640μs 46.1924μs 21.6486 KOps/s 22.1953 KOps/s $\color{#d91a1a}-2.46\%$
test_split_td 0.4713ms 57.5467μs 17.3772 KOps/s 17.0933 KOps/s $\color{#35bf28}+1.66\%$
test_add_pytree 0.1454ms 60.0033μs 16.6658 KOps/s 17.7596 KOps/s $\textbf{\color{#d91a1a}-6.16\%}$
test_add_td 0.1440ms 0.1046ms 9.5613 KOps/s 10.5779 KOps/s $\textbf{\color{#d91a1a}-9.61\%}$
test_compile_add_one_nested[tensordict-compile] 0.2328ms 0.1611ms 6.2083 KOps/s 6.1726 KOps/s $\color{#35bf28}+0.58\%$
test_compile_add_one_nested[tensordict-eager] 0.2917ms 0.1654ms 6.0443 KOps/s 6.1328 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_add_one_nested[pytree-compile] 0.2083ms 0.1450ms 6.8973 KOps/s 6.9338 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_add_one_nested[pytree-eager] 0.2381ms 0.1834ms 5.4524 KOps/s 5.2996 KOps/s $\color{#35bf28}+2.88\%$
test_compile_copy_nested[tensordict-compile] 0.1005ms 20.8125μs 48.0480 KOps/s 47.0397 KOps/s $\color{#35bf28}+2.14\%$
test_compile_copy_nested[tensordict-eager] 77.5840μs 49.0104μs 20.4038 KOps/s 19.7530 KOps/s $\color{#35bf28}+3.29\%$
test_compile_copy_nested[pytree-compile] 0.2906ms 64.5426μs 15.4936 KOps/s 15.4745 KOps/s $\color{#35bf28}+0.12\%$
test_compile_copy_nested[pytree-eager] 85.5440μs 49.3214μs 20.2752 KOps/s 20.4643 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_add_one_flat[tensordict-compile] 0.4258ms 0.3234ms 3.0923 KOps/s 3.1197 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_flat[tensordict-eager] 0.3170ms 0.2401ms 4.1647 KOps/s 4.2561 KOps/s $\color{#d91a1a}-2.15\%$
test_compile_add_one_flat[tensorclass-compile] 0.2227ms 0.1286ms 7.7749 KOps/s 7.4455 KOps/s $\color{#35bf28}+4.42\%$
test_compile_add_one_flat[tensorclass-eager] 0.1053ms 66.5697μs 15.0219 KOps/s 14.7062 KOps/s $\color{#35bf28}+2.15\%$
test_compile_add_one_flat[pytree-compile] 0.4711ms 0.3212ms 3.1132 KOps/s 3.0694 KOps/s $\color{#35bf28}+1.43\%$
test_compile_add_one_flat[pytree-eager] 0.7064ms 0.6239ms 1.6029 KOps/s 1.4364 KOps/s $\textbf{\color{#35bf28}+11.59\%}$
test_compile_add_self_flat[tensordict-eager] 0.3684ms 0.2887ms 3.4642 KOps/s 3.5709 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_add_self_flat[tensordict-compile] 0.3789ms 0.3260ms 3.0679 KOps/s 3.1033 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_add_self_flat[tensorclass-eager] 0.1179ms 77.7951μs 12.8543 KOps/s 12.4816 KOps/s $\color{#35bf28}+2.99\%$
test_compile_add_self_flat[tensorclass-compile] 0.1757ms 0.1310ms 7.6320 KOps/s 7.7374 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_add_self_flat[pytree-eager] 0.6653ms 0.5288ms 1.8912 KOps/s 1.6864 KOps/s $\textbf{\color{#35bf28}+12.14\%}$
test_compile_add_self_flat[pytree-compile] 0.3730ms 0.3208ms 3.1173 KOps/s 3.1391 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_copy_flat[tensordict-compile] 57.8230μs 19.3022μs 51.8075 KOps/s 51.3642 KOps/s $\color{#35bf28}+0.86\%$
test_compile_copy_flat[tensordict-eager] 80.5530μs 38.8061μs 25.7691 KOps/s 23.6251 KOps/s $\textbf{\color{#35bf28}+9.08\%}$
test_compile_copy_flat[pytree-compile] 0.1238ms 69.9257μs 14.3009 KOps/s 14.2410 KOps/s $\color{#35bf28}+0.42\%$
test_compile_copy_flat[pytree-eager] 82.6040μs 50.9343μs 19.6332 KOps/s 19.3157 KOps/s $\color{#35bf28}+1.64\%$
test_compile_assign_and_add[tensordict-compile] 2.4274ms 0.8450ms 1.1834 KOps/s 1.1089 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_compile_assign_and_add[tensordict-eager] 3.6479ms 3.2691ms 305.8987 Ops/s 304.1377 Ops/s $\color{#35bf28}+0.58\%$
test_compile_assign_and_add[pytree-compile] 2.3290ms 0.8221ms 1.2165 KOps/s 1.1081 KOps/s $\textbf{\color{#35bf28}+9.78\%}$
test_compile_assign_and_add[pytree-eager] 3.2709ms 3.1567ms 316.7894 Ops/s 303.2997 Ops/s $\color{#35bf28}+4.45\%$
test_compile_indexing[tensor-tensordict-compile] 0.1701ms 0.1092ms 9.1545 KOps/s 9.1488 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[tensor-tensordict-eager] 0.1919ms 60.6901μs 16.4772 KOps/s 15.9317 KOps/s $\color{#35bf28}+3.42\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1504ms 0.1040ms 9.6165 KOps/s 9.6208 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1534ms 43.8246μs 22.8183 KOps/s 22.2996 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-pytree-compile] 0.2033ms 0.1037ms 9.6406 KOps/s 9.5413 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[tensor-pytree-eager] 89.1440μs 43.0613μs 23.2227 KOps/s 22.0491 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_compile_indexing[slice-tensordict-compile] 0.1995ms 0.1381ms 7.2418 KOps/s 7.2553 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_indexing[slice-tensordict-eager] 0.1561ms 24.8283μs 40.2766 KOps/s 37.5601 KOps/s $\textbf{\color{#35bf28}+7.23\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1753ms 0.1311ms 7.6263 KOps/s 7.4995 KOps/s $\color{#35bf28}+1.69\%$
test_compile_indexing[slice-tensorclass-eager] 57.6130μs 20.6637μs 48.3941 KOps/s 46.0420 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_compile_indexing[slice-pytree-compile] 0.1811ms 0.1326ms 7.5394 KOps/s 7.5306 KOps/s $\color{#35bf28}+0.12\%$
test_compile_indexing[slice-pytree-eager] 55.3520μs 20.4458μs 48.9099 KOps/s 46.7199 KOps/s $\color{#35bf28}+4.69\%$
test_compile_indexing[int-tensordict-compile] 0.1916ms 0.1386ms 7.2168 KOps/s 7.1542 KOps/s $\color{#35bf28}+0.88\%$
test_compile_indexing[int-tensordict-eager] 0.5027ms 24.3145μs 41.1277 KOps/s 37.1407 KOps/s $\textbf{\color{#35bf28}+10.73\%}$
test_compile_indexing[int-tensorclass-compile] 0.1854ms 0.1330ms 7.5209 KOps/s 7.5008 KOps/s $\color{#35bf28}+0.27\%$
test_compile_indexing[int-tensorclass-eager] 58.3630μs 28.7004μs 34.8427 KOps/s 46.1948 KOps/s $\textbf{\color{#d91a1a}-24.57\%}$
test_compile_indexing[int-pytree-compile] 0.1746ms 0.1324ms 7.5531 KOps/s 7.5012 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[int-pytree-eager] 57.1820μs 20.5821μs 48.5858 KOps/s 39.1002 KOps/s $\textbf{\color{#35bf28}+24.26\%}$
test_mod_add[eager] 76.6540μs 33.8482μs 29.5436 KOps/s 29.6159 KOps/s $\color{#d91a1a}-0.24\%$
test_mod_add[compile] 0.1055ms 67.7787μs 14.7539 KOps/s 13.8070 KOps/s $\textbf{\color{#35bf28}+6.86\%}$
test_mod_add[compile-overhead] 0.2616ms 0.1361ms 7.3496 KOps/s 6.6274 KOps/s $\textbf{\color{#35bf28}+10.90\%}$
test_mod_wrap[eager] 0.9055ms 0.7946ms 1.2585 KOps/s 1.2611 KOps/s $\color{#d91a1a}-0.20\%$
test_mod_wrap[compile] 1.9298ms 0.8425ms 1.1869 KOps/s 1.1882 KOps/s $\color{#d91a1a}-0.10\%$
test_mod_wrap[compile-overhead] 4.9186ms 3.0563ms 327.1976 Ops/s 324.6231 Ops/s $\color{#35bf28}+0.79\%$
test_mod_wrap_and_backward[eager] 4.1804ms 4.0287ms 248.2216 Ops/s 241.3058 Ops/s $\color{#35bf28}+2.87\%$
test_mod_wrap_and_backward[compile] 4.2944ms 4.0472ms 247.0824 Ops/s 241.1878 Ops/s $\color{#35bf28}+2.44\%$
test_mod_wrap_and_backward[compile-overhead] 1.3870ms 0.9209ms 1.0859 KOps/s 978.8893 Ops/s $\textbf{\color{#35bf28}+10.93\%}$
test_seq_add[eager] 0.1419ms 0.1034ms 9.6708 KOps/s 9.6389 KOps/s $\color{#35bf28}+0.33\%$
test_seq_add[compile] 0.1468ms 79.1160μs 12.6397 KOps/s 12.0675 KOps/s $\color{#35bf28}+4.74\%$
test_seq_add[compile-overhead] 0.1660ms 0.1141ms 8.7627 KOps/s 8.7051 KOps/s $\color{#35bf28}+0.66\%$
test_seq_wrap[eager] 1.1254ms 0.9400ms 1.0639 KOps/s 1.0534 KOps/s $\color{#35bf28}+0.99\%$
test_seq_wrap[compile] 0.9453ms 0.8568ms 1.1672 KOps/s 1.1729 KOps/s $\color{#d91a1a}-0.49\%$
test_seq_wrap[compile-overhead] 0.2715ms 0.2199ms 4.5485 KOps/s 4.5162 KOps/s $\color{#35bf28}+0.72\%$
test_func_call_runtime[False-eager] 2.4327ms 2.3588ms 423.9449 Ops/s 418.5620 Ops/s $\color{#35bf28}+1.29\%$
test_func_call_runtime[False-compile] 2.5411ms 2.4097ms 414.9936 Ops/s 417.2169 Ops/s $\color{#d91a1a}-0.53\%$
test_func_call_runtime[False-compile-overhead] 0.4154ms 0.3599ms 2.7787 KOps/s 2.7557 KOps/s $\color{#35bf28}+0.83\%$
test_func_call_runtime[True-eager] 2.5917ms 2.5165ms 397.3713 Ops/s 393.3121 Ops/s $\color{#35bf28}+1.03\%$
test_func_call_runtime[True-compile] 2.4900ms 2.4192ms 413.3574 Ops/s 410.3641 Ops/s $\color{#35bf28}+0.73\%$
test_func_call_runtime[True-compile-overhead] 0.4345ms 0.3814ms 2.6217 KOps/s 2.5925 KOps/s $\color{#35bf28}+1.12\%$
test_func_call_cm_runtime[False-eager] 2.4451ms 2.3578ms 424.1280 Ops/s 423.7546 Ops/s $\color{#35bf28}+0.09\%$
test_func_call_cm_runtime[False-compile] 2.4984ms 2.3993ms 416.7904 Ops/s 412.9805 Ops/s $\color{#35bf28}+0.92\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4223ms 0.3621ms 2.7616 KOps/s 2.7432 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_cm_runtime[True-eager] 2.7639ms 2.6334ms 379.7352 Ops/s 376.8753 Ops/s $\color{#35bf28}+0.76\%$
test_func_call_cm_runtime[True-compile] 2.7004ms 2.4625ms 406.0948 Ops/s 408.3351 Ops/s $\color{#d91a1a}-0.55\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4551ms 0.4049ms 2.4698 KOps/s 2.4330 KOps/s $\color{#35bf28}+1.51\%$
test_vmap_func_call_cm_runtime[eager] 4.1670ms 3.7624ms 265.7881 Ops/s 265.1598 Ops/s $\color{#35bf28}+0.24\%$
test_vmap_func_call_cm_runtime[compile] 2.6021ms 2.4851ms 402.4037 Ops/s 405.8004 Ops/s $\color{#d91a1a}-0.84\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4959ms 0.4114ms 2.4309 KOps/s 2.4154 KOps/s $\color{#35bf28}+0.64\%$
test_distributed 3.2177ms 0.3131ms 3.1934 KOps/s 8.8422 KOps/s $\textbf{\color{#d91a1a}-63.88\%}$
test_tdmodule 0.1221ms 16.3919μs 61.0056 KOps/s 62.1633 KOps/s $\color{#d91a1a}-1.86\%$
test_tdmodule_dispatch 61.3530μs 31.9486μs 31.3003 KOps/s 32.5337 KOps/s $\color{#d91a1a}-3.79\%$
test_tdseq 36.4610μs 17.4014μs 57.4667 KOps/s 59.1538 KOps/s $\color{#d91a1a}-2.85\%$
test_tdseq_dispatch 58.0530μs 35.3585μs 28.2818 KOps/s 29.3953 KOps/s $\color{#d91a1a}-3.79\%$
test_instantiation_functorch 2.0347ms 1.8641ms 536.4628 Ops/s 519.8861 Ops/s $\color{#35bf28}+3.19\%$
test_instantiation_td 1.7880ms 1.1985ms 834.3578 Ops/s 814.3373 Ops/s $\color{#35bf28}+2.46\%$
test_exec_functorch 1.0614ms 1.0024ms 997.5735 Ops/s 993.7790 Ops/s $\color{#35bf28}+0.38\%$
test_exec_functional_call 1.1054ms 0.9991ms 1.0009 KOps/s 984.4036 Ops/s $\color{#35bf28}+1.67\%$
test_exec_td 1.1170ms 1.0339ms 967.1808 Ops/s 960.8452 Ops/s $\color{#35bf28}+0.66\%$
test_exec_td_decorator 1.7599ms 1.0576ms 945.5147 Ops/s 931.3298 Ops/s $\color{#35bf28}+1.52\%$
test_vmap_mlp_speed[True-True] 1.3624ms 1.2809ms 780.6784 Ops/s 777.0963 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed[True-False] 1.3859ms 1.2844ms 778.5715 Ops/s 782.0901 Ops/s $\color{#d91a1a}-0.45\%$
test_vmap_mlp_speed[False-True] 1.3403ms 1.1642ms 858.9348 Ops/s 860.3825 Ops/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed[False-False] 1.2542ms 1.1642ms 858.9887 Ops/s 856.9822 Ops/s $\color{#35bf28}+0.23\%$
test_vmap_mlp_speed_decorator[True-True] 1.9971ms 1.2538ms 797.5615 Ops/s 800.3945 Ops/s $\color{#d91a1a}-0.35\%$
test_vmap_mlp_speed_decorator[True-False] 1.3590ms 1.2545ms 797.1313 Ops/s 794.7663 Ops/s $\color{#35bf28}+0.30\%$
test_vmap_mlp_speed_decorator[False-True] 1.3367ms 1.1660ms 857.6344 Ops/s 858.6826 Ops/s $\color{#d91a1a}-0.12\%$
test_vmap_mlp_speed_decorator[False-False] 1.2639ms 1.1677ms 856.3572 Ops/s 860.8212 Ops/s $\color{#d91a1a}-0.52\%$
test_vmap_transformer_speed[True-True] 13.3609ms 13.1153ms 76.2470 Ops/s 75.3767 Ops/s $\color{#35bf28}+1.15\%$
test_vmap_transformer_speed[True-False] 13.2639ms 13.0704ms 76.5090 Ops/s 75.3251 Ops/s $\color{#35bf28}+1.57\%$
test_vmap_transformer_speed[False-True] 13.2810ms 12.8928ms 77.5629 Ops/s 76.8515 Ops/s $\color{#35bf28}+0.93\%$
test_vmap_transformer_speed[False-False] 12.9869ms 12.9219ms 77.3881 Ops/s 76.4952 Ops/s $\color{#35bf28}+1.17\%$
test_vmap_transformer_speed_decorator[True-True] 33.8700ms 33.6856ms 29.6863 Ops/s 29.5403 Ops/s $\color{#35bf28}+0.49\%$
test_vmap_transformer_speed_decorator[True-False] 34.0244ms 33.7194ms 29.6565 Ops/s 29.4989 Ops/s $\color{#35bf28}+0.53\%$
test_vmap_transformer_speed_decorator[False-True] 34.0312ms 33.6259ms 29.7390 Ops/s 29.5847 Ops/s $\color{#35bf28}+0.52\%$
test_vmap_transformer_speed_decorator[False-False] 33.7731ms 33.5842ms 29.7759 Ops/s 29.5997 Ops/s $\color{#35bf28}+0.60\%$
test_to_module_speed[True] 1.5358ms 0.9951ms 1.0049 KOps/s 993.2219 Ops/s $\color{#35bf28}+1.18\%$
test_to_module_speed[False] 1.4003ms 0.9632ms 1.0382 KOps/s 1.0206 KOps/s $\color{#35bf28}+1.73\%$
test_tc_init 60.0630μs 36.9607μs 27.0558 KOps/s 27.1113 KOps/s $\color{#d91a1a}-0.20\%$
test_tc_init_nested 0.1214ms 76.5382μs 13.0654 KOps/s 13.5149 KOps/s $\color{#d91a1a}-3.33\%$
test_tc_first_layer_tensor 5.9917μs 0.6750μs 1.4815 MOps/s 1.4676 MOps/s $\color{#35bf28}+0.95\%$
test_tc_first_layer_nontensor 23.6820μs 2.2735μs 439.8579 KOps/s 450.4643 KOps/s $\color{#d91a1a}-2.35\%$
test_tc_second_layer_tensor 17.7710μs 1.3764μs 726.5198 KOps/s 735.4019 KOps/s $\color{#d91a1a}-1.21\%$
test_tc_second_layer_nontensor 45.6120μs 2.9740μs 336.2518 KOps/s 339.8299 KOps/s $\color{#d91a1a}-1.05\%$
test_unbind 0.1982s 12.3887ms 80.7189 Ops/s 89.9488 Ops/s $\textbf{\color{#d91a1a}-10.26\%}$
test_full_like 0.6571ms 0.5723ms 1.7473 KOps/s 1.7441 KOps/s $\color{#35bf28}+0.19\%$
test_zeros_like 0.3265ms 0.1980ms 5.0516 KOps/s 5.0549 KOps/s $\color{#d91a1a}-0.06\%$
test_ones_like 0.2287ms 0.1978ms 5.0556 KOps/s 5.0589 KOps/s $\color{#d91a1a}-0.07\%$
test_clone 0.4509ms 0.4137ms 2.4170 KOps/s 2.4165 KOps/s $\color{#35bf28}+0.02\%$
test_squeeze 36.8420μs 9.8555μs 101.4665 KOps/s 98.1107 KOps/s $\color{#35bf28}+3.42\%$
test_unsqueeze 0.2365ms 73.8078μs 13.5487 KOps/s 12.8081 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_split 0.4073ms 0.1588ms 6.2965 KOps/s 6.0688 KOps/s $\color{#35bf28}+3.75\%$
test_permute 0.2328ms 0.1775ms 5.6328 KOps/s 5.4087 KOps/s $\color{#35bf28}+4.14\%$
test_stack 1.2547ms 0.8786ms 1.1381 KOps/s 1.1707 KOps/s $\color{#d91a1a}-2.78\%$
test_cat 1.2568ms 1.2314ms 812.0839 Ops/s 812.0048 Ops/s $+0.01\%$
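
The Change column appears to be the relative difference between the "Ops" value measured on this PR and the "Ops on Repo HEAD" value, after both are converted to the same unit. The sketch below is not the benchmark bot's actual code. The names `UNIT_SCALE`, `to_ops_per_sec` and `relative_change` are hypothetical, and the formula is an assumption inferred from the rows above. Because the table shows rounded throughputs, some rows may differ in the last digit.

```python
# Assumed unit multipliers for the throughput columns in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_sec(value: float, unit: str) -> float:
    """Convert a throughput such as (1.0859, "KOps/s") to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(pr_ops: float, head_ops: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (pr_ops / head_ops - 1.0) * 100.0


# test_distributed row: 3.1934 KOps/s on the PR vs 8.8422 KOps/s on HEAD.
distributed = relative_change(
    to_ops_per_sec(3.1934, "KOps/s"), to_ops_per_sec(8.8422, "KOps/s")
)

# test_mod_wrap_and_backward[compile-overhead] row mixes units:
# 1.0859 KOps/s on the PR vs 978.8893 Ops/s on HEAD.
mixed_units = relative_change(
    to_ops_per_sec(1.0859, "KOps/s"), to_ops_per_sec(978.8893, "Ops/s")
)

print(f"{distributed:+.2f}%")
print(f"{mixed_units:+.2f}%")
```

Converting to a common unit first matters for rows where the PR and HEAD columns are reported in different units, as in the second example.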

[ghstack-poisoned]
vmoens added the enhancement (New feature or request) label Oct 4, 2024
vmoens merged commit e2df712 into gh/vmoens/22/base Oct 4, 2024
50 of 55 checks passed
vmoens added a commit that referenced this pull request Oct 4, 2024
ghstack-source-id: 9e659036f70a1584a686453d4a4dd2c6a1cf932b
Pull Request resolved: #1021
vmoens deleted the gh/vmoens/22/head branch October 4, 2024 14:16