We can analyze and visualize different types of streaming data as the information arrives.
The producers are unchanged from buzzline-03-case: they write the same information to a Kafka topic. The one exception is the CSV producer for the smart smoker, which no longer runs continuously; it stops after reading all the rows in the CSV file. The consumers have been enhanced to add visualization.
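As a rough illustration of that run-once change, the CSV producer's main loop now behaves something like the sketch below. The file path and the send helper are placeholders for this illustration, not the names used in this repository.

```python
import csv
from pathlib import Path

DATA_FILE = Path("data") / "smoker_temps.csv"  # placeholder path for illustration

def stream_csv_once(send_to_kafka):
    """Read each CSV row exactly once, send it, then stop (no endless loop)."""
    with DATA_FILE.open(newline="") as f:
        for row in csv.DictReader(f):
            send_to_kafka(row)  # placeholder for the real producer's send call
    # When the loop finishes, the producer exits instead of re-reading the file.
```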
This project uses matplotlib and its animation capabilities for visualization.
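For a mental model of the animation approach (a minimal sketch, not the exact code in the consumers), matplotlib's FuncAnimation redraws a chart on a timer while the consumer keeps updating the data behind it:

```python
from collections import Counter
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

counts = Counter()  # e.g. message counts per author, updated by the consumer

fig, ax = plt.subplots()

def update(_frame):
    # Clear and redraw the bars from whatever the consumer has counted so far.
    ax.clear()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_ylabel("Messages")

# Redraw roughly once per second for as long as the window is open.
anim = FuncAnimation(fig, update, interval=1000, cache_frame_data=False)
plt.show()
```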
The project includes three applications:
- A basic producer and consumer that exchange information via a dynamically updated file.
- A JSON producer and consumer that exchange information via a Kafka topic.
- A CSV producer and consumer that exchange information via a different Kafka topic.
All three applications produce live charts to illustrate the data.
Before starting, ensure you have completed the setup tasks in https://github.com/denisecase/buzzline-01-case and https://github.com/denisecase/buzzline-02-case. Python 3.11 is required.
Once the tools are installed, copy or fork this project into your GitHub account and create your own version to run and experiment with. Follow the instructions in FORK-THIS-REPO.md.
OR: For more practice, add these example scripts or features to your earlier project. You'll want to check requirements.txt, .env, and the consumers, producers, and util folders. Use your README.md to record your workflow and commands.
Follow the instructions in MANAGE-VENV.md to:
- Create your .venv
- Activate .venv
- Install the required dependencies using requirements.txt.
If Zookeeper and Kafka are not already running, you'll need to restart them. See the instructions in [SETUP-KAFKA.md] to:
- Start the Zookeeper service.
- Start the Kafka service.
Running the basic JSON application (file-based, no Kafka) will take two terminals:
- One to run the producer which writes to a file in the data folder.
- Another to run the consumer which reads from the dynamically updated file.
Start the producer to generate the messages.
In VS Code, open a NEW terminal. Use the commands below to activate .venv and start the producer.
Windows:
.venv\Scripts\activate
py -m producers.basic_json_producer_case
Mac/Linux:
source .venv/bin/activate
python3 -m producers.basic_json_producer_case
Start the associated consumer that will process and visualize the messages.
In VS Code, open a NEW terminal in your root project folder. Use the commands below to activate .venv and start the consumer.
Windows:
.venv\Scripts\activate
py -m consumers.basic_json_consumer_case
Mac/Linux:
source .venv/bin/activate
python3 -m consumers.basic_json_consumer_case
Review the code for both the producer and the consumer. Understand how the information is generated, written to a file, and read and processed. Review the visualization code to see how the live chart is produced. When done, remember to kill the associated terminals for the producer and consumer.
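For orientation before you read the repo code, the file-based exchange works roughly like the sketch below: the producer appends one JSON object per line, and the consumer polls the file for new lines. The file name and function names here are placeholders, not the repository's actual identifiers.

```python
import json
import time
from pathlib import Path

LIVE_FILE = Path("data") / "project_live.json"  # placeholder file name

def append_message(message: dict) -> None:
    """Producer side: append one JSON object per line."""
    with LIVE_FILE.open("a") as f:
        f.write(json.dumps(message) + "\n")

def follow_file(poll_seconds: float = 1.0):
    """Consumer side: yield new messages as lines are appended (a simple 'tail -f')."""
    with LIVE_FILE.open("r") as f:
        while True:
            line = f.readline()
            if line:
                yield json.loads(line)
            else:
                time.sleep(poll_seconds)
```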
Running the Kafka JSON application will take two terminals:
- One to run the producer which writes to a Kafka topic.
- Another to run the consumer which reads from that Kafka topic.
For each one, you will need to:
- Open a new terminal.
- Activate your .venv.
- Know the command that works on your machine to run Python (e.g., py or python3).
- Know how to use the -m flag to run your file as a module.
- Know the full name of the module you want to run.
- Look in the producers folder for json_producer_case.
- Look in the consumers folder for json_consumer_case.
Review the code for both the producer and the consumer. Understand how the information is generated and written to a Kafka topic, and consumed from the topic and processed. Review the visualization code to see how the live chart is produced.
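If you want a mental model of the Kafka side first, the core calls look roughly like this sketch using the kafka-python package. The topic name and bootstrap server below are placeholders; in the real scripts they come from .env.

```python
import json
from kafka import KafkaProducer, KafkaConsumer

TOPIC = "buzzline_json"    # placeholder; the real topic name comes from .env
SERVER = "localhost:9092"  # placeholder bootstrap server

# Producer: serialize a dict to JSON bytes and send it to the topic.
producer = KafkaProducer(
    bootstrap_servers=SERVER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, value={"message": "hello", "author": "Case"})
producer.flush()

# Consumer: deserialize each message back into a dict and process it.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=SERVER,
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for record in consumer:
    print(record.value)  # the repo's consumer updates its counts and chart here
```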
Compare the non-Kafka JSON streaming application to the Kafka JSON streaming application. By organizing code into reusable functions, which functions can be reused? Which functions must be updated based on the sharing mechanism? What new functions/features must be added to work with a Kafka-based streaming system?
When done, remember to kill the associated terminals for the producer and consumer.
Running the Kafka CSV application will take two terminals:
- One to run the producer which writes to a Kafka topic.
- Another to run the consumer which reads from that Kafka topic.
For each one, you will need to:
- Open a new terminal.
- Activate your .venv.
- Know the command that works on your machine to run Python (e.g., py or python3).
- Know how to use the -m flag to run your file as a module.
- Know the full name of the module you want to run.
- Look in the producers folder for csv_producer_case.
- Look in the consumers folder for csv_consumer_case.
Review the code for both the producer and the consumer. Understand how the information is generated and written to a Kafka topic, and consumed from the topic and processed. Review the visualization code to see how the live chart is produced.
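The alert idea behind the CSV consumer's line chart can be sketched with a small rolling window over recent temperature readings. The window size, stall threshold, and function name below are illustrative assumptions, not the repository's exact configuration.

```python
from collections import deque

WINDOW_SIZE = 5        # assumed rolling-window length
STALL_RANGE_F = 0.2    # assumed max temperature range that counts as a "stall"

window = deque(maxlen=WINDOW_SIZE)

def process_reading(temperature_f: float) -> bool:
    """Add a reading and return True if the temperature appears stalled."""
    window.append(temperature_f)
    if len(window) < WINDOW_SIZE:
        return False  # not enough readings yet to judge
    return (max(window) - min(window)) <= STALL_RANGE_F
```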
Compare the JSON application to the CSV streaming application. By organizing code into reusable functions, which functions can be reused? Which functions must be updated based on the type of data? How does the visualization code get changed based on the type of data and type of chart used? Which aspects are similar between the different types of data?
When done, remember to kill the associated terminals for the producer and consumer.
- JSON: Process messages in batches of 5.
- JSON: Limit the display to the top 3 authors (see the sketch after this list).
- Modify chart appearance.
- Stream a different set of data and visualize the custom stream with an appropriate chart.
- How do we find out what types of charts are available?
- How do we find out what attributes and colors are available?
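For the "top 3 authors" idea above, one possible approach (a sketch, assuming the consumer keeps an author-count dictionary named author_counts) is to sort the counts inside the chart-update function before plotting:

```python
def top_n(counts: dict, n: int = 3) -> tuple[list, list]:
    """Return the n most frequent keys and their counts, ready for plotting."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]
    authors = [author for author, _ in ranked]
    values = [count for _, count in ranked]
    return authors, values

# Inside the chart update function (illustrative):
# authors, values = top_n(author_counts, n=3)
# ax.bar(authors, values)
```

For the chart and color questions, the matplotlib documentation lists the available plot types, and you can inspect plt.style.available and matplotlib.colors.CSS4_COLORS interactively to see the built-in styles and named colors.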
When resuming work on this project:
- Open the folder in VS Code.
- Start the Zookeeper service.
- Start the Kafka service.
- Activate your local project virtual environment (.venv).
To save disk space, you can delete the .venv folder when not actively working on this project. You can always recreate it, activate it, and reinstall the necessary packages later. Managing Python virtual environments is a valuable skill.
This project is licensed under the MIT License as an example project. You are encouraged to fork, copy, explore, and modify the code as you like. See the LICENSE file for more.
Live Bar Chart (JSON file streaming)
Live Bar Chart (Kafka JSON streaming)
Live Line Chart with Alert (Kafka CSV streaming)