Removed collect_content from PySparkS3Dataset #28

vringar · 2021-04-09T10:38:45Z

Downloading files via the SparkContext was much slower than
downloading via boto (which is what S3Dataset does).
Both classes now use the same method, since PySparkS3Dataset
inherits from S3Dataset.
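
A rough sketch of the resulting class layout, for illustration only. The `fetch` hook, the `content/<hash>.gz` key layout, the gzip step, and the constructor signatures are assumptions made so the example runs without S3 access; they are not taken from the actual code. The only point it shows is that once the override is gone, `PySparkS3Dataset.collect_content` resolves to the `S3Dataset` implementation.

```python
import gzip
from typing import Callable


class S3Dataset:
    """Base dataset; in the real class the download goes through boto."""

    def __init__(self, fetch: Callable[[str], bytes]):
        # Hypothetical hook standing in for the boto download call.
        self._fetch = fetch

    def collect_content(self, content_hash: str) -> bytes:
        # Assumed key layout and compression, for illustration only.
        return gzip.decompress(self._fetch(f"content/{content_hash}.gz"))


class PySparkS3Dataset(S3Dataset):
    """Spark-backed variant; it no longer overrides collect_content."""

    def __init__(self, spark_context, fetch: Callable[[str], bytes]):
        super().__init__(fetch)
        self._spark_context = spark_context


# Usage with an in-memory stand-in for the bucket.
fake_bucket = {"content/abc.gz": gzip.compress(b"<html></html>")}
dataset = PySparkS3Dataset(spark_context=None, fetch=fake_bucket.__getitem__)
page = dataset.collect_content("abc")  # served by S3Dataset.collect_content
```

Because the subclass defines no `collect_content` of its own, any later speed-up or fix to the base method applies to both classes.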


Removed collect_content from PySparkS3Dataset

33bb9a2


vringar requested a review from englehardt April 9, 2021 10:38
