RSS content collector #89

Open
Andras-Csanyi opened this issue Aug 29, 2020 · 0 comments

@Andras-Csanyi
Owner

Summary
I need the ability to collect RSS content from multiple sources. The purpose is to gather information and knowledge and, later, to run text analysis on the collected material.

Solution
A service that periodically checks the defined sources for new content and collects any new items it finds.
The list of sources is stored in the database.
The list of content items is stored per source, with a flag marking whether each item has already been collected.
Scheduling options for the collection runs.
The original content is stored in the database as raw data.
The language of the content is recorded.
A separate process later takes the original content and breaks it down according to our thesis.
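
As a rough illustration of the points above, here is a minimal Python sketch of the storage layout and a single collection pass. Everything in it is an assumption made for illustration: the issue names no language, database, or schema, so the SQLite tables (`source`, `content`), the column names, the `poll_minutes` scheduling column, and the functions `open_db`, `fetch`, and `collect` are all hypothetical. It also assumes plain RSS 2.0 feeds and uses the feed-level `<language>` element as the language marker.

```python
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical schema: one table for sources, one for content items per source.
SCHEMA = """
CREATE TABLE IF NOT EXISTS source (
    id           INTEGER PRIMARY KEY,
    url          TEXT NOT NULL UNIQUE,
    poll_minutes INTEGER NOT NULL DEFAULT 60  -- assumed per-source scheduling option
);
CREATE TABLE IF NOT EXISTS content (
    id        INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES source(id),
    guid      TEXT NOT NULL,                  -- item identity within a source
    language  TEXT,                           -- taken from the feed's <language>
    raw       TEXT,                           -- original item XML, unprocessed
    collected INTEGER NOT NULL DEFAULT 0,     -- 1 once the raw data is stored
    UNIQUE (source_id, guid)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def fetch(url):
    """Download a feed and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def collect(conn, source_id, feed_xml):
    """Store the items of one RSS 2.0 feed that are not yet known.

    Returns the number of newly stored items.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return 0
    language = channel.findtext("language")
    new_items = 0
    for item in channel.findall("item"):
        # Fall back to <link> when the item has no <guid>.
        guid = item.findtext("guid") or item.findtext("link")
        if not guid:
            continue
        raw = ET.tostring(item, encoding="unicode")
        # The UNIQUE constraint makes already-collected items a no-op.
        cur = conn.execute(
            "INSERT OR IGNORE INTO content "
            "(source_id, guid, language, raw, collected) VALUES (?, ?, ?, ?, 1)",
            (source_id, guid, language, raw),
        )
        new_items += cur.rowcount
    conn.commit()
    return new_items
```

A scheduler (a cron job or a simple timed loop) could read the rows of `source`, call `fetch` on each URL when its interval has elapsed, and pass the result to `collect`. The later breakdown step would then read the `raw` column without touching the feeds again.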
