
CUDA error when following the MopeyMule notebook. #26

Open
psych0v0yager opened this issue Jul 16, 2024 · 0 comments
I am attempting to use the following notebook:

https://huggingface.co/failspy/Llama-3-8B-Instruct-MopeyMule/blob/main/MopeyMule-Induce-Melancholy.ipynb

I receive the following error when I run these cells of the notebook:

```python
baseline_cache = ortho.create_activation_cache(baseline, N=len(baseline))
eeyore_cache = ortho.create_activation_cache(eeyored_toks, N=len(eeyored_toks))
```

Error:

```
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)`
```
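As an aside on reproducing this more informatively: CUDA kernels launch asynchronously, so the cuBLAS call named in the error is often just the next kernel launched after the one that actually hit the device-side assert. Forcing synchronous launches (a standard PyTorch debugging step, set before anything initializes CUDA) should make the traceback point at the real failing op:

```python
import os

# Force synchronous CUDA kernel launches so the Python traceback points at
# the kernel that actually failed. This must be set before the first CUDA
# call (i.e. before importing/initializing torch's CUDA context).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

Running the same cells on CPU, if memory allows, serves the same purpose.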

Verbose error log:

```
  0%|          | 0/128 [00:00<?, ?it/s]
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [73,0,0], thread: [96,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
[... the same assertion repeats for threads [97,0,0] through [127,0,0] ...]
  0%|          | 0/128 [00:00<?, ?it/s]
```




```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 baseline_cache = ortho.create_activation_cache(baseline,N=len(baseline))
      2 eeyore_cache = ortho.create_activation_cache(eeyored_toks,N=len(eeyored_toks))

File ~/Desktop/AbliterationExperiments/abliterator/abliterator.py:613, in ModelAbliterator.create_activation_cache(self, toks, N, batch_size, last_indices, measure_refusal, stop_at_layer)
    611 z_label = [] if measure_refusal > 1 else None
    612 for i in tqdm(range(0,min(N,len(toks)),batch_size)):
--> 613     logits,cache = self.run_with_cache(toks[i:min(i+batch_size,len(toks))],max_new_tokens=measure_refusal,stop_at_layer=stop_at_layer)
    614     if measure_refusal > 1:
    615         z_label.extend(self.measure_scores_from_logits(logits,measure_refusal)[0])

File ~/Desktop/AbliterationExperiments/abliterator/abliterator.py:396, in ModelAbliterator.run_with_cache(self, names_filter, incl_bwd, device, remove_batch_dim, reset_hooks_end, clear_contexts, fwd_hooks, max_new_tokens, *model_args, **model_kwargs)
    392     max_new_tokens = 1
    394 with self.model.hooks(fwd_hooks=fwd_hooks, bwd_hooks=bwd, reset_hooks_end=reset_hooks_end, clear_contexts=clear_contexts):
    395     #model_out = self.model(*model_args,**model_kwargs)
--> 396     model_out,toks = self.generate_logits(*model_args,max_tokens_generated=max_new_tokens, **model_kwargs)
    397     if incl_bwd:
    398         model_out.backward()

File ~/Desktop/AbliterationExperiments/abliterator/abliterator.py:317, in ModelAbliterator.generate_logits(self, toks, drop_refusals, stop_at_eos, max_tokens_generated, *args, **kwargs)
    315 generating = [i for i in range(toks.shape[0])]
    316 for i in range(max_tokens_generated):
--> 317     logits = self.model(all_toks[generating, :-max_tokens_generated + i],*args,**kwargs)
    318     next_tokens = logits[:,-1,:].argmax(dim=-1).to('cpu')
    319     all_toks[generating,-max_tokens_generated+i] = next_tokens

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/torch/nn/modules/module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs)
   1530     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531 else:
-> 1532     return self._call_impl(*args, **kwargs)

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/torch/nn/modules/module.py:1541, in Module._call_impl(self, *args, **kwargs)
   1536 # If we don't have any hooks, we want to skip the rest of the logic in
   1537 # this function, and just call forward.
   1538 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1539         or _global_backward_pre_hooks or _global_backward_hooks
   1540         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541     return forward_call(*args, **kwargs)
   1543 try:
   1544     result = None

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/HookedTransformer.py:550, in HookedTransformer.forward(self, input, return_type, loss_per_token, prepend_bos, padding_side, start_at_layer, tokens, shortformer_pos_embed, attention_mask, stop_at_layer, past_kv_cache)
    545     if shortformer_pos_embed is not None:
    546         shortformer_pos_embed = shortformer_pos_embed.to(
    547             devices.get_device_for_block_index(i, self.cfg)
    548         )
--> 550     residual = block(
    551         residual,
    552         # Cache contains a list of HookedTransformerKeyValueCache objects, one for each
    553         # block
    554         past_kv_cache_entry=past_kv_cache[i] if past_kv_cache is not None else None,
    555         shortformer_pos_embed=shortformer_pos_embed,
    556         attention_mask=attention_mask,
    557     )  # [batch, pos, d_model]
    559 if stop_at_layer is not None:
    560     # When we stop at an early layer, we end here rather than doing further computation
    561     return residual

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/torch/nn/modules/module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs)
[... same _wrapped_call_impl / _call_impl frames as above ...]

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/transformer_block.py:159, in TransformerBlock.forward(self, resid_pre, shortformer_pos_embed, past_kv_cache_entry, attention_mask)
    152     key_input = attn_in
    153     value_input = attn_in
    155 attn_out = (
    156     # hook the residual stream states that are used to calculate the
    157     # queries, keys and values, independently.
    158     # Then take the layer norm of these inputs, and pass these to the attention module.
--> 159     self.attn(
    160         query_input=self.ln1(query_input)
    161         + (0.0 if shortformer_pos_embed is None else shortformer_pos_embed),
    162         key_input=self.ln1(key_input)
    163         + (0.0 if shortformer_pos_embed is None else shortformer_pos_embed),
    164         value_input=self.ln1(value_input),
    165         past_kv_cache_entry=past_kv_cache_entry,
    166         attention_mask=attention_mask,
    167     )
    168 )  # [batch, pos, d_model]
    169 if self.cfg.use_normalization_before_and_after:
    170     # If we use LayerNorm both before and after, then apply the second LN after the layer
    171     # and before the hook. We do it before the hook so hook_attn_out captures "that which
    172     # is added to the residual stream"
    173     attn_out = self.ln1_post(attn_out)

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/torch/nn/modules/module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs)
[... same _wrapped_call_impl / _call_impl frames as above ...]

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:216, in AbstractAttention.forward(self, query_input, key_input, value_input, past_kv_cache_entry, additive_attention_mask, attention_mask, position_bias)
    213     q = q.to(torch.float32)
    214     k = k.to(torch.float32)
--> 216 attn_scores = self.calculate_attention_scores(
    217     q, k
    218 )  # [batch, head_index, query_pos, key_pos]
    220 if self.cfg.positional_embedding_type == "alibi":
    221     query_ctx = attn_scores.size(-2)
```

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:153, in GroupedQueryAttention.calculate_attention_scores(self, q, k)
    [142](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:142) """Calculate attention scores from Q and the unexpanded K matrix.
    [143](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:143) K will be expaned from [batch, pos, n_key_value_head, d_head] to [batch, pos, n_query_heads, d_head] using torch.repeat_interleave.
    [144](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:144) 
   (...)
    [150](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:150)     Float[torch.Tensor, "batch head_index query_pos key_pos"]: The attention scores.
    [151](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:151) """
    [152](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:152) k = torch.repeat_interleave(k, dim=2, repeats=self.repeat_kv_heads)
--> [153](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/grouped_query_attention.py:153) return super().calculate_attention_scores(q, k)

File ~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:409, in AbstractAttention.calculate_attention_scores(self, q, k)
    [403](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:403) q_ = einops.rearrange(
    [404](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:404)     q, "batch query_pos head_index d_head -> batch head_index query_pos d_head"
    [405](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:405) )
    [406](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:406) k_ = einops.rearrange(
    [407](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:407)     k, "batch key_pos head_index d_head -> batch head_index d_head key_pos"
    [408](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:408) )
--> [409](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:409) attn_scores = q_ @ k_ / self.attn_scale
    [410](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:410) if self.cfg.attn_scores_soft_cap > 0:
    [411](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:411)     attn_scores = self.cfg.attn_scores_soft_cap * F.tanh(
    [412](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:412)         attn_scores / self.cfg.attn_scores_soft_cap
    [413](https://file+.vscode-resource.vscode-cdn.net/home/server_runner/Desktop/AbliterationExperiments/abliterator/~/anaconda3/envs/abliteratorENV2/lib/python3.11/site-packages/transformer_lens/components/abstract_attention.py:413)     )

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)`
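For context: the verbose log at the top of this issue shows device-side `Assertion '-sizes[i] <= index && index < sizes[i]' failed` errors from `IndexKernel.cu` *before* this cuBLAS failure. That pattern usually means an embedding/gather op received token ids outside its table (e.g. a tokenizer/model vocab mismatch), and the cuBLAS call is merely the first op to trip over the already-poisoned CUDA context. A quick pre-flight check before calling `create_activation_cache` might look like the following (hypothetical helper, written over plain id lists for illustration):

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return positions of token ids that would fall outside the embedding
    table [0, vocab_size), i.e. the ids that would trigger the device-side
    'index out of bounds' assert seen in the verbose log."""
    return [i for i, t in enumerate(token_ids) if t < 0 or t >= vocab_size]
```

Re-running the notebook with `CUDA_LAUNCH_BLOCKING=1` set in the environment also helps here: kernel launches become synchronous, so the Python traceback stops at the op that actually fired the assert instead of a later cuBLAS call.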

I have an Ubuntu 22.04 rig with an RTX 3090 and an RTX 3090 Ti. The nvidia-smi output is below.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:01:00.0  On |                  N/A |
| 56%   69C    P2             126W / 350W |   9744MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090 Ti     On  | 00000000:06:00.0 Off |                  Off |
|  0%   39C    P8              16W / 480W |  10116MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1699      G   /usr/lib/xorg/Xorg                          199MiB |
|    0   N/A  N/A      1878      G   /usr/bin/gnome-shell                         72MiB |
|    0   N/A  N/A      2482      G   ...erProcess --variations-seed-version      127MiB |
|    0   N/A  N/A      3354      G   ...irefox/4539/usr/lib/firefox/firefox      175MiB |
|    0   N/A  N/A      6859      C   ...da3/envs/abliteratorENV2/bin/python     9138MiB |
|    1   N/A  N/A      1699      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      6859      C   ...da3/envs/abliteratorENV2/bin/python    10098MiB |
+---------------------------------------------------------------------------------------+

I am running abliterator inside a conda environment. The CUDA version that PyTorch was built against inside the environment (12.1) differs from the system CUDA version (12.3). Could this be the source of the error?
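On the version question: a minor mismatch in this direction is normally harmless, because NVIDIA drivers are backward compatible. The driver's reported CUDA version (the `CUDA Version: 12.3` in the nvidia-smi header) only needs to be at least as new as the toolkit PyTorch was built against (`torch.version.cuda`, 12.1 here). A minimal sketch of that compatibility rule (hypothetical helper, not part of PyTorch):

```python
def driver_supports_toolkit(driver_cuda: str, torch_cuda: str) -> bool:
    """True if a driver reporting `driver_cuda` (nvidia-smi header) can run
    binaries built against `torch_cuda` (torch.version.cuda). Drivers are
    backward compatible, so driver >= toolkit is the requirement."""
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return as_tuple(driver_cuda) >= as_tuple(torch_cuda)
```

For this rig, `driver_supports_toolkit("12.3", "12.1")` holds, so the mismatch alone should not produce this error.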
