[BUG] NonTensorData behavior with equal data is not transparent to the rest of the library #831
Closed
3 tasks done
Labels: bug
Describe the bug
`NonTensorData._stack_non_tensor(...)` will create a `NonTensorData` object instead of a `NonTensorStack` if all elements passed to the method are equal. There is nothing wrong with the idea. However, even though a call to `.tolist()` will produce the same output, many parts of the library (and especially torchrl) can't handle this behavior.

`torch.stack` is not defined for `NonTensorStack` and `NonTensorData`

The code will throw an exception:
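The original snippet is not reproduced here. As a rough illustration of the mechanism only, here is a library-free toy model. `ToyData`, `ToyStack`, `toy_stack` and `strict_stack` are hypothetical stand-ins I made up for this sketch; they are not tensordict classes or functions, and the real error message will differ.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class ToyData:
    """Stand-in for a single non-tensor value broadcast over batch_size."""
    data: Any
    batch_size: Tuple[int, ...] = ()

    def tolist(self):
        # Expand the single value over the leading batch dimension.
        if not self.batch_size:
            return self.data
        inner = ToyData(self.data, self.batch_size[1:])
        return [inner.tolist() for _ in range(self.batch_size[0])]


@dataclass
class ToyStack:
    """Stand-in for a stack that keeps every element separately."""
    items: List[ToyData]

    @property
    def batch_size(self):
        return (len(self.items),) + self.items[0].batch_size

    def tolist(self):
        return [item.tolist() for item in self.items]


def toy_stack(items):
    # Assumption: mimics the collapse described above, where equal
    # elements yield a single data object instead of a stack.
    first = items[0]
    if all(item.data == first.data for item in items):
        return ToyData(first.data, (len(items),) + first.batch_size)
    return ToyStack(list(items))


def strict_stack(values):
    # Hypothetical consumer that only accepts inputs of one type.
    kinds = {type(v).__name__ for v in values}
    if len(kinds) > 1:
        raise TypeError(f"cannot stack mixed types: {sorted(kinds)}")
    return list(values)


step_a = toy_stack([ToyData("a"), ToyData("b")])  # distinct items: ToyStack
step_b = toy_stack([ToyData("a"), ToyData("a")])  # equal items: ToyData

# Same batch_size and same list shape, yet different container types.
print(step_a.batch_size, step_b.batch_size)
print(step_a.tolist(), step_b.tolist())

try:
    strict_stack([step_a, step_b])
except TypeError as err:
    print("failed:", err)
```

In this toy version the two time steps look identical from the outside (`batch_size` and `.tolist()` agree), and only a consumer that checks the container type notices the difference.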
Even though both data and stack have the same `batch_size` and the output of `.tolist()` is a list of two elements, the two elements cannot be concatenated. This happens in a more practical example with torchrl when the tensordicts of each time step get stacked along the time axis.

For example, suppose my observation is a list of two non-tensor items and, by chance, the two items are equal in one of the time steps. Then the tensordict will store a `NonTensorData` object for this time step, which will trigger the above-mentioned torch.cat exception at the end of the rollout. (The same problem occurs for `torch.cat`.)

`LazyStackedTensorDict.where` fails with equivalent `NonTensorData` even though it is passed in as a `NonTensorStack`

The code will produce an exception:
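The original reproduction is not included here. The sketch below is a library-free toy model of the failure path, under the assumption that the result of an element-wise selection is rebuilt through a stacking helper that collapses equal elements. `ToyData`, `ToyStack`, `toy_dense_stack`, `toy_where` and `write_into` are hypothetical names for illustration and do not correspond to tensordict's actual implementation.

```python
from typing import Any, List, Sequence


class ToyData:
    """Stand-in for one non-tensor value repeated n times."""

    def __init__(self, data: Any, n: int = 1):
        self.data, self.n = data, n

    def tolist(self) -> List[Any]:
        return [self.data] * self.n


class ToyStack:
    """Stand-in for a stack of individual non-tensor values."""

    def __init__(self, values: Sequence[Any]):
        self.values = list(values)

    def tolist(self) -> List[Any]:
        return list(self.values)


def toy_dense_stack(values):
    # Assumption: collapse to a single ToyData when every element is equal.
    if all(v == values[0] for v in values):
        return ToyData(values[0], len(values))
    return ToyStack(values)


def toy_where(cond, this: ToyStack, other: ToyStack):
    # Pick element-wise, then rebuild the result through the collapsing stack.
    picked = [a if c else b for c, a, b in zip(cond, this.values, other.values)]
    return toy_dense_stack(picked)


def write_into(out: ToyStack, result) -> None:
    # Hypothetical strict consumer: rejects a result whose type differs from out.
    if type(result) is not type(out):
        raise TypeError(f"expected {type(out).__name__}, got {type(result).__name__}")
    out.values = result.tolist()


node = ToyStack(["a", "b"])
node_reset = ToyStack(["b", "a"])

# Both inputs are stacks, but the selected elements happen to be equal.
result = toy_where([True, False], node, node_reset)
print(type(result).__name__, result.tolist())

try:
    write_into(node, result)
except TypeError as err:
    print("failed:", err)
```

The point of the sketch is that the caller never passes a data object in. The type change happens inside the selection, only for the particular conditions under which the chosen elements coincide.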
Again, the problem comes down to a `NonTensorData` and a `NonTensorStack` of the same `batch_size` but containing equal elements.

Note that the inputs to the `where(...)` function are all `NonTensorStack`s. However, within `LazyStackedTensorDict.where`, after the condition is passed further down, the results are reconstructed using `maybe_dense_stack`, which will return a `NonTensorData`.

Again, I stumbled over this unexpected behavior by using `torchrl`. In particular, using `environment.rollout(5, break_when_any_done=False)` in which at some point one but not all batch entries are done. In `EnvBase._update_during_reset`, the call to `node.where(~reset, other=node_reset, out=node, pad=0)` will trigger the above-mentioned exception.

System info
Describe the characteristic of your environment:
0.4.0 1.26.4 3.10.14 (main, Mar 21 2024, 11:21:31) [Clang 14.0.6 ] darwin 2.3.0
Checklist