The following is the training log:
dnnlib: Running training.training_loop.training_loop() on localhost...
GPU available: True
GPU devices: /device:GPU:0
>>>>> Create Session
Dataset directory: .
Streaming data using training.dataset.TFRecordDataset...
tfrecord_dir: .\custom-images
Dataset shape = [1, 512, 512]
Dynamic range = [0, 255]
Label size = 0
Constructing networks...

G                             Params    OutputShape         WeightShape
---                           ---       ---                 ---
latents_in                    -         (?, 512)            -
labels_in                     -         (?, 0)              -
lod                           -         ()                  -
dlatent_avg                   -         (512,)              -
G_mapping/latents_in          -         (?, 512)            -
G_mapping/labels_in           -         (?, 0)              -
G_mapping/PixelNorm           -         (?, 512)            -
G_mapping/Dense0              262656    (?, 512)            (512, 512)
G_mapping/Dense1              262656    (?, 512)            (512, 512)
G_mapping/Dense2              262656    (?, 512)            (512, 512)
G_mapping/Dense3              262656    (?, 512)            (512, 512)
G_mapping/Dense4              262656    (?, 512)            (512, 512)
G_mapping/Dense5              262656    (?, 512)            (512, 512)
G_mapping/Dense6              262656    (?, 512)            (512, 512)
G_mapping/Dense7              4202496   (?, 8192)           (512, 8192)
G_mapping/Reshape             -         (?, 16, 512)        -
G_mapping/dlatents_out        -         (?, 16, 512)        -
Truncation                    -         (?, 16, 512)        -
G_synthesis/dlatents_in       -         (?, 16, 512)        -
G_synthesis/4x4/Const         534528    (?, 512, 4, 4)      (512,)
G_synthesis/4x4/Conv          2885632   (?, 512, 4, 4)      (3, 3, 512, 512)
G_synthesis/ToRGB_lod7        513       (?, 1, 4, 4)        (1, 1, 512, 1)
G_synthesis/8x8/Conv0_up      2885632   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Conv1         2885632   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/ToRGB_lod6        513       (?, 1, 8, 8)        (1, 1, 512, 1)
G_synthesis/Upscale2D         -         (?, 1, 8, 8)        -
G_synthesis/Grow_lod6         -         (?, 1, 8, 8)        -
G_synthesis/16x16/Conv0_up    2885632   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Conv1       2885632   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/ToRGB_lod5        513       (?, 1, 16, 16)      (1, 1, 512, 1)
G_synthesis/Upscale2D_1       -         (?, 1, 16, 16)      -
G_synthesis/Grow_lod5         -         (?, 1, 16, 16)      -
G_synthesis/32x32/Conv0_up    2885632   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Conv1       2885632   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/ToRGB_lod4        513       (?, 1, 32, 32)      (1, 1, 512, 1)
G_synthesis/Upscale2D_2       -         (?, 1, 32, 32)      -
G_synthesis/Grow_lod4         -         (?, 1, 32, 32)      -
G_synthesis/64x64/Conv0_up    1442816   (?, 256, 64, 64)    (3, 3, 512, 256)
G_synthesis/64x64/Conv1       852992    (?, 256, 64, 64)    (3, 3, 256, 256)
G_synthesis/ToRGB_lod3        257       (?, 1, 64, 64)      (1, 1, 256, 1)
G_synthesis/Upscale2D_3       -         (?, 1, 64, 64)      -
G_synthesis/Grow_lod3         -         (?, 1, 64, 64)      -
G_synthesis/128x128/Conv0_up  426496    (?, 128, 128, 128)  (3, 3, 256, 128)
G_synthesis/128x128/Conv1     279040    (?, 128, 128, 128)  (3, 3, 128, 128)
G_synthesis/ToRGB_lod2        129       (?, 1, 128, 128)    (1, 1, 128, 1)
G_synthesis/Upscale2D_4       -         (?, 1, 128, 128)    -
G_synthesis/Grow_lod2         -         (?, 1, 128, 128)    -
G_synthesis/256x256/Conv0_up  139520    (?, 64, 256, 256)   (3, 3, 128, 64)
G_synthesis/256x256/Conv1     102656    (?, 64, 256, 256)   (3, 3, 64, 64)
G_synthesis/ToRGB_lod1        65        (?, 1, 256, 256)    (1, 1, 64, 1)
G_synthesis/Upscale2D_5       -         (?, 1, 256, 256)    -
G_synthesis/Grow_lod1         -         (?, 1, 256, 256)    -
G_synthesis/512x512/Conv0_up  51328     (?, 32, 512, 512)   (3, 3, 64, 32)
G_synthesis/512x512/Conv1     42112     (?, 32, 512, 512)   (3, 3, 32, 32)
G_synthesis/ToRGB_lod0        33        (?, 1, 512, 512)    (1, 1, 32, 1)
G_synthesis/Upscale2D_6       -         (?, 1, 512, 512)    -
G_synthesis/Grow_lod0         -         (?, 1, 512, 512)    -
G_synthesis/images_out        -         (?, 1, 512, 512)    -
G_synthesis/lod               -         ()                  -
G_synthesis/noise0            -         (1, 1, 4, 4)        -
G_synthesis/noise1            -         (1, 1, 4, 4)        -
G_synthesis/noise2            -         (1, 1, 8, 8)        -
G_synthesis/noise3            -         (1, 1, 8, 8)        -
G_synthesis/noise4            -         (1, 1, 16, 16)      -
G_synthesis/noise5            -         (1, 1, 16, 16)      -
G_synthesis/noise6            -         (1, 1, 32, 32)      -
G_synthesis/noise7            -         (1, 1, 32, 32)      -
G_synthesis/noise8            -         (1, 1, 64, 64)      -
G_synthesis/noise9            -         (1, 1, 64, 64)      -
G_synthesis/noise10           -         (1, 1, 128, 128)    -
G_synthesis/noise11           -         (1, 1, 128, 128)    -
G_synthesis/noise12           -         (1, 1, 256, 256)    -
G_synthesis/noise13           -         (1, 1, 256, 256)    -
G_synthesis/noise14           -         (1, 1, 512, 512)    -
G_synthesis/noise15           -         (1, 1, 512, 512)    -
images_out                    -         (?, 1, 512, 512)    -
---                           ---       ---                 ---
Total                         30114536

D                     Params    OutputShape         WeightShape
---                   ---       ---                 ---
images_in             -         (?, 1, 512, 512)    -
labels_in             -         (?, 0)              -
lod                   -         ()                  -
FromRGB_lod0          64        (?, 32, 512, 512)   (1, 1, 1, 32)
512x512/Conv0         9248      (?, 32, 512, 512)   (3, 3, 32, 32)
512x512/Conv1_down    18496     (?, 64, 256, 256)   (3, 3, 32, 64)
Downscale2D           -         (?, 1, 256, 256)    -
FromRGB_lod1          128       (?, 64, 256, 256)   (1, 1, 1, 64)
Grow_lod0             -         (?, 64, 256, 256)   -
256x256/Conv0         36928     (?, 64, 256, 256)   (3, 3, 64, 64)
256x256/Conv1_down    73856     (?, 128, 128, 128)  (3, 3, 64, 128)
Downscale2D_1         -         (?, 1, 128, 128)    -
FromRGB_lod2          256       (?, 128, 128, 128)  (1, 1, 1, 128)
Grow_lod1             -         (?, 128, 128, 128)  -
128x128/Conv0         147584    (?, 128, 128, 128)  (3, 3, 128, 128)
128x128/Conv1_down    295168    (?, 256, 64, 64)    (3, 3, 128, 256)
Downscale2D_2         -         (?, 1, 64, 64)      -
FromRGB_lod3          512       (?, 256, 64, 64)    (1, 1, 1, 256)
Grow_lod2             -         (?, 256, 64, 64)    -
64x64/Conv0           590080    (?, 256, 64, 64)    (3, 3, 256, 256)
64x64/Conv1_down      1180160   (?, 512, 32, 32)    (3, 3, 256, 512)
Downscale2D_3         -         (?, 1, 32, 32)      -
FromRGB_lod4          1024      (?, 512, 32, 32)    (1, 1, 1, 512)
Grow_lod3             -         (?, 512, 32, 32)    -
32x32/Conv0           2359808   (?, 512, 32, 32)    (3, 3, 512, 512)
32x32/Conv1_down      2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
Downscale2D_4         -         (?, 1, 16, 16)      -
FromRGB_lod5          1024      (?, 512, 16, 16)    (1, 1, 1, 512)
Grow_lod4             -         (?, 512, 16, 16)    -
16x16/Conv0           2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
16x16/Conv1_down      2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
Downscale2D_5         -         (?, 1, 8, 8)        -
FromRGB_lod6          1024      (?, 512, 8, 8)      (1, 1, 1, 512)
Grow_lod5             -         (?, 512, 8, 8)      -
8x8/Conv0             2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
8x8/Conv1_down        2359808   (?, 512, 4, 4)      (3, 3, 512, 512)
Downscale2D_6         -         (?, 1, 4, 4)        -
FromRGB_lod7          1024      (?, 512, 4, 4)      (1, 1, 1, 512)
Grow_lod6             -         (?, 512, 4, 4)      -
4x4/MinibatchStddev   -         (?, 513, 4, 4)      -
4x4/Conv              2364416   (?, 512, 4, 4)      (3, 3, 513, 512)
4x4/Dense0            4194816   (?, 512)            (8192, 512)
4x4/Dense1            513       (?, 1)              (512, 1)
scores_out            -         (?, 1)              -
---                   ---       ---                 ---
Total                 23075169

Building TensorFlow graph...
Setting up snapshot image grid...
Setting up run dir...
Training...
Traceback (most recent call last):
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
     [[{{node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1}} = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
     [[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 193, in <module>
    main()
  File "train.py", line 188, in main
    dnnlib.submit_run(**kwargs)
  File "D:\dy\idinvert\dnnlib\submission\submit.py", line 290, in submit_run
    run_wrapper(submit_config)
  File "D:\dy\idinvert\dnnlib\submission\submit.py", line 242, in run_wrapper
    util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
  File "D:\dy\idinvert\dnnlib\util.py", line 257, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "D:\dy\idinvert\training\training_loop.py", line 231, in training_loop
    tflib.run([D_train_op, Gs_update_op], {lod_in: sched.lod, lrate_in: sched.D_lrate, minibatch_in: sched.minibatch})
  File "D:\dy\idinvert\dnnlib\tflib\tfutil.py", line 26, in run
    return tf.get_default_session().run(*args, **kwargs)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
     [[node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1 (defined at D:\dy\idinvert\dnnlib\tflib\optimizer.py:98) = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
     [[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1', defined at:
  File "train.py", line 193, in <module>
    main()
  File "train.py", line 188, in main
    dnnlib.submit_run(**kwargs)
  File "D:\dy\idinvert\dnnlib\submission\submit.py", line 290, in submit_run
    run_wrapper(submit_config)
  File "D:\dy\idinvert\dnnlib\submission\submit.py", line 242, in run_wrapper
    util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
  File "D:\dy\idinvert\dnnlib\util.py", line 257, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "D:\dy\idinvert\training\training_loop.py", line 184, in training_loop
    D_opt.register_gradients(tf.reduce_mean(D_loss), D_gpu.trainables)
  File "D:\dy\idinvert\dnnlib\tflib\optimizer.py", line 98, in register_gradients
    grads = self._dev_opt[dev].compute_gradients(loss, trainable_vars, gate_gradients=tf.train.Optimizer.GATE_NONE) # disable gating to reduce memory usage
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\training\optimizer.py", line 519, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 630, in gradients
    gate_gradients, aggregation_method, stop_gradients)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 814, in _GradientsHelper
    lambda: grad_fn(op, *out_grads))
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 408, in _MaybeCompile
    return grad_fn()  # Exit early
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 814, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\math_grad.py", line 1131, in _MatMulGrad
    grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4560, in mat_mul
    name=name)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

...which was originally created as op 'GPU0/D_loss/D_1/4x4/Dense0/MatMul', defined at:
  File "train.py", line 193, in <module>
    main()
[elided 3 identical lines from previous traceback]
  File "D:\dy\idinvert\dnnlib\util.py", line 257, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "D:\dy\idinvert\training\training_loop.py", line 182, in training_loop
    D_loss = dnnlib.util.call_func_by_name(G=G_gpu, D=D_gpu, opt=D_opt, training_set=training_set, minibatch_size=minibatch_split, reals=reals, labels=labels, **D_loss_args)
  File "D:\dy\idinvert\dnnlib\util.py", line 257, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "D:\dy\idinvert\training\loss.py", line 154, in D_logistic_simplegp
    fake_scores_out = fp32(D.get_output_for(fake_images_out, labels, is_training=True))
  File "D:\dy\idinvert\dnnlib\tflib\network.py", line 222, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "D:\dy\idinvert\training\networks_stylegan.py", line 654, in D_basic
    scores_out = grow(2, resolution_log2 - 2)
  File "D:\dy\idinvert\training\networks_stylegan.py", line 651, in grow
    x = block(x(), res); y = lambda: x
  File "D:\dy\idinvert\training\networks_stylegan.py", line 619, in block
    x = act(apply_bias(dense(x, fmaps=nf(res-2), gain=gain, use_wscale=use_wscale)))
  File "D:\dy\idinvert\training\networks_stylegan.py", line 159, in dense
    return tf.matmul(x, w)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2057, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4560, in mat_mul
    name=name)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "D:\Unet\anaconda\envs\tf112\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)

InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(16, 8192), b.shape=(16, 512), m=8192, n=512, k=16
     [[node GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/MatMul_grad/MatMul_1 (defined at D:\dy\idinvert\dnnlib\tflib\optimizer.py:98) = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU0/D_loss/D_1/4x4/Dense0/Reshape, GPU0/TrainD_grad/gradients/GPU0/D_loss/D_1/4x4/Dense0/add_grad/Reshape)]]
     [[{{node TrainD/ApplyGrads0/UpdateWeights/cond/pred_id/_1585}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_24297_TrainD/ApplyGrads0/UpdateWeights/cond/pred_id", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Is this problem caused by a large batch_size? Even when I turn down the batch_size, the problem still occurs.
You can try training on images with a resolution of 256x256 and see if the problem still happens.
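If you go that route, below is a minimal sketch of downscaling the source images to 256x256 before regenerating the TFRecords. The folder names are placeholders, and the final dataset-rebuild command is an assumption; check it against this repository's dataset tool and README.

```python
# Hypothetical pre-processing step: downscale a folder of source images to
# 256x256 before rebuilding the TFRecord dataset. Folder names are placeholders.
import os
from PIL import Image

SRC_DIR = "raw-images-512"      # original 512x512 images (assumed location)
DST_DIR = "raw-images-256"      # resized copies used to rebuild the dataset
os.makedirs(DST_DIR, exist_ok=True)

for name in os.listdir(SRC_DIR):
    if not name.lower().endswith((".png", ".jpg", ".jpeg")):
        continue
    img = Image.open(os.path.join(SRC_DIR, name))
    img = img.resize((256, 256), Image.LANCZOS)  # high-quality downscaling
    img.save(os.path.join(DST_DIR, name))

# Afterwards, regenerate the TFRecords from DST_DIR with the repo's dataset
# tool (verify the exact command in the README) and point the training config
# at the new tfrecord_dir before restarting.
```

At 256x256 the 512x512 blocks of G and D are never built, so this should also reduce GPU memory pressure.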
The problem still occurs.
It may be caused by your environment. I found some solutions, such as here and here; see if they help.
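For reference: with TensorFlow 1.x, "Blas GEMM launch failed" is frequently reported when cuBLAS cannot initialize because the GPU memory is already fully pre-allocated or another process is holding the GPU. A commonly suggested workaround is to let TensorFlow allocate GPU memory on demand. The snippet below only shows the generic TF 1.x session options; where exactly to hook this into dnnlib/tflib's session setup is repo-specific and not shown here.

```python
# Generic TF 1.x workaround often suggested for "Blas GEMM launch failed":
# allocate GPU memory on demand instead of pre-allocating the whole card.
# Wiring this config into dnnlib/tflib's session creation is left to the
# reader (assumption; the exact hook depends on the repo).
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # grow GPU memory allocation lazily
# Alternatively, cap the fraction of GPU memory TensorFlow may claim:
# config.gpu_options.per_process_gpu_memory_fraction = 0.8
sess = tf.Session(config=config)
```

It is also worth confirming that no other process is using the GPU during training, and that the installed CUDA/cuDNN versions match the TensorFlow 1.12 build in the tf112 environment.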