Move external data to memory mapped files #72

Open
tmaiaroto opened this issue Jun 17, 2015 · 0 comments

This change will benefit packages in other repos as well, but Social Harvest is what prompts it.

Geocoding and sentiment analysis both rely on external data sets. These are currently pulled from S3 (they are too big to store in GitHub) when Social Harvest starts, if the files don't already exist locally. The problem is that they are rather large and therefore require a good deal of RAM to load and work with.

By using memory mapped files (I'm looking at boltdb), Social Harvest should work on a server of any size, though it will of course run faster when more RAM is available. Even with slower performance on smaller servers, it may still run fast enough for many use cases.

One of the goals of Social Harvest is to make big data in social media analytics affordable and attainable. So this is important, though for the time being it is also easy enough to just run Social Harvest on a server with 1 or 2GB of RAM rather than 256MB or 512MB. My goal is to make the minimum requirement 512MB of RAM. I would like Social Harvest to run on an EC2 small instance. A micro instance may be asking too much.

@tmaiaroto tmaiaroto added this to the Beta milestone Jun 17, 2015