[Python] Allow creating extension arrays via pa.array constructor #44406

Closed
rok opened this issue Oct 14, 2024 · 2 comments

Comments

@rok
Member

rok commented Oct 14, 2024

Describe the enhancement requested

As noted in #44070 (comment) - it would be nice if we could shorten:

storage = pa.array(data, type=storage_type)
extension_type = pa.some_extension_type(storage_type)
array = pa.ExtensionArray.from_storage(extension_type, storage)

to:

extension_type = pa.some_extension_type(storage_type)
array = pa.array(data, extension_type)

Component(s)

Python

@khwilson
Contributor

I believe this already works. See https://github.com/apache/arrow/blob/main/python/pyarrow/array.pxi#L372 and this example:

from uuid import uuid4

import pyarrow as pa

class UuidType(pa.ExtensionType):

    def __init__(self):
        # storage is a 16-byte binary; the name identifies the extension type
        super().__init__(pa.binary(16), "my_package.uuid")

    def __arrow_ext_serialize__(self):
        # since we don't have a parameterized type, we don't need extra
        # metadata to be deserialized
        return b''

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        # return an instance of this subclass given the serialized
        # metadata.
        return cls()

def main():
    data = [uuid4().bytes]
    arr = pa.array(data, type=UuidType())
    print(arr)

if __name__ == "__main__":
    main()

@rok
Member Author

rok commented Oct 15, 2024

Indeed, it seems to have been covered by ARROW-17834. Thanks for noting this, @khwilson!

rok closed this as completed Oct 15, 2024