
News Stream Infrastructure

Stack version

News Stream Workflow

The following figure shows the workflow of the News Stream Analyzer.

Data is initially stored in a MySQL database and updated daily. Each time new data is inserted or updated in the MySQL database:

  1. the Kafka Connect component polls it and publishes it to a Kafka topic (quickstart-jdbc-PRO_clip_repository). The polling strategy is timestamp-based on the dataset attribute insertdate (for more information see here, and the connector sketch after this list).
  2. The News Analyzer component enriches the input message with additional information.
  3. The enriched message is sent back to Kafka (on topic news_genero).
  4. Finally, the Kafka Connect component sinks the data on topic news_genero into the Elasticsearch database.
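
Both connectors from steps 1 and 4 are managed by Kafka Connect. As a rough illustration only (not this repository's actual configuration), the sketch below registers a JDBC source and an Elasticsearch sink through the Kafka Connect REST API; the REST port 8083, the connection URLs, the credentials, and the connector names are all assumptions to adapt to this setup.

```python
import requests

CONNECT_URL = "http://localhost:8083/connectors"  # assumed default Kafka Connect REST endpoint

# Hypothetical JDBC source: polls MySQL in timestamp mode on `insertdate`
# and publishes rows to quickstart-jdbc-PRO_clip_repository.
jdbc_source = {
    "name": "quickstart-jdbc-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:mysql://mysql:3306/demo?user=root&password=secret",  # placeholder URL/credentials
        "table.whitelist": "PRO_clip_repository",
        "mode": "timestamp",
        "timestamp.column.name": "insertdate",
        "topic.prefix": "quickstart-jdbc-",
    },
}

# Hypothetical Elasticsearch sink: indexes every message on news_genero into Elasticsearch.
es_sink = {
    "name": "news-genero-es-sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "connection.url": "http://localhost:9200",  # placeholder Elasticsearch address
        "topics": "news_genero",
        "type.name": "news",
        "key.ignore": "true",
    },
}

for connector in (jdbc_source, es_sink):
    resp = requests.post(CONNECT_URL, json=connector)
    resp.raise_for_status()
    print("created connector:", resp.json()["name"])
```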

Run Environment with Docker

The following commands start Docker containers with a MySQL server containing 200 data records, an Elasticsearch server, Kafka, and Kafka Connect:

> cd ./kafka_infrastructure_poc/docker
> chmod +x launch_demo.sh
> ./launch_demo.sh

To run the News Analyzer component:

> cd ./kafka_infrastructure_poc/src
> python -m consumers.news_analyzer
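
Conceptually, the News Analyzer is a consume/enrich/produce loop between the two topics in the workflow above. The snippet below is only an illustrative sketch, not the repository's implementation: it uses kafka-python with JSON values (the actual component may use Avro with the Schema Registry), assumes a broker at localhost:9092, and the enrich() helper is a hypothetical placeholder.

```python
import json
from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

SOURCE_TOPIC = "quickstart-jdbc-PRO_clip_repository"
SINK_TOPIC = "news_genero"
BOOTSTRAP = "localhost:9092"  # assumed broker address

def enrich(record: dict) -> dict:
    """Hypothetical enrichment step: attach extra information to a news record."""
    record["enriched"] = True  # placeholder for the real enrichment logic
    return record

consumer = KafkaConsumer(
    SOURCE_TOPIC,
    bootstrap_servers=BOOTSTRAP,
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)
producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Consume raw news records, enrich them, and publish them to news_genero.
for message in consumer:
    producer.send(SINK_TOPIC, enrich(message.value))
```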

Web UIs

Kafka Topics UI

It allows users to:
  - browse Kafka topics and understand what's happening on the cluster;
  - find topics and their metadata;
  - browse Kafka messages and download them.

The Web UI will be available at localhost:8001

Kafka Schema Registry UI

It allows users to create, view, search, and update Avro schemas of the Kafka cluster.

The Web UI will be available at localhost:8000

Kafka Connect UI

This is a web tool for setting up and managing Kafka Connect connectors across multiple Connect clusters.

The Web UI will be available at localhost:8002
