This project is a fault-tolerant distributed system for processing and storing real-time log events using Kafka, Redis, Docker, and Python. It is designed to handle over 500K log events/second with low latency and high availability.
- 🔁 Real-time log streaming with Kafka
- ⚡ Fast in-memory caching using Redis
- 📄 Persistent logging to file
- 🐳 Docker-based deployment
- 📉 40% latency reduction under peak loads
- Docker installed → Get Docker
- Docker Compose installed → usually included with Docker Desktop
- **Producer**
  - Simulates log generation (timestamp, level, message)
  - Publishes logs to the Kafka topic `logs` (a minimal sketch follows after this list)
- **Kafka + Zookeeper**
  - Kafka acts as the message broker
  - Zookeeper coordinates the Kafka brokers
- **Consumer**
  - Subscribes to the Kafka topic `logs`
  - Stores each log in Redis and writes it to the `logs.txt` file
- **Redis**
  - Acts as a fast-access in-memory store for recent logs
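To make the producer role concrete, here is a minimal sketch of what a producer loop could look like. It assumes the `kafka-python` client and a broker reachable under the Compose service name `kafka:9092`; both are illustrative assumptions and may differ from the actual `producer.py`.

```python
# producer_sketch.py -- hypothetical minimal producer (assumes kafka-python
# and a broker at kafka:9092; names and ports are illustrative).
import json
import random
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

LEVELS = ["INFO", "WARNING", "ERROR"]

while True:
    # Simulate a log event with timestamp, level, and message.
    log = {
        "timestamp": time.time(),
        "level": random.choice(LEVELS),
        "message": "Sample log message",
    }
    # Publish the event to the "logs" topic.
    producer.send("logs", log)
    time.sleep(0.01)
```

Serializing each event as JSON keeps the payload in the same shape that the consumer later writes to `logs.txt` (see the example output below).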
- Python for producers and consumers
- Apache Kafka for log streaming
- Redis for caching
- Docker Compose for container orchestration
# 1. Clone the project
git clone <your-repo-url>
cd log_processing_system
# 2. Start the system
docker-compose up --build

log_processing_system/
├── docker-compose.yml
├── requirements.txt
├── producer/
│ ├── producer.py
│ └── Dockerfile
├── consumer/
│ ├── consumer.py
│ ├── Dockerfile
│ └── logs/
│ └── logs.txt
└── README.md
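Once the containers are up, one quick way to verify that events are flowing end to end is to read back a few recent entries from Redis. The sketch below assumes the `redis-py` client, that Redis is published on `localhost:6379`, and that the consumer caches entries under a list key named `recent_logs`; all three are illustrative assumptions rather than the project's confirmed configuration.

```python
# check_logs.py -- hypothetical verification snippet (assumes redis-py and a
# "recent_logs" list key; the key actually used by consumer.py may differ).
import json

import redis

cache = redis.Redis(host="localhost", port=6379, db=0)

# Read back the ten most recently cached log events.
for raw in cache.lrange("recent_logs", 0, 9):
    print(json.loads(raw))
```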
Example log entry in logs.txt:
{"timestamp": 1711234567.89, "level": "ERROR", "message": "Sample log message"}

Console Output:
Cached and wrote log: {'timestamp': 1711234567.89, 'level': 'INFO', 'message': 'Sample log message'}
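A console line like the one above would come from a consumer loop along these lines. This is a hedged sketch, assuming `kafka-python` and `redis-py`, the Compose service names `kafka` and `redis`, and a `recent_logs` Redis key used purely for illustration; the real `consumer.py` may be organized differently.

```python
# consumer_sketch.py -- hypothetical minimal consumer (assumes kafka-python
# and redis-py; hostnames "kafka" and "redis" are illustrative service names).
import json

import redis
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "logs",
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
cache = redis.Redis(host="redis", port=6379, db=0)

for message in consumer:
    log = message.value
    # Keep the most recent logs in a capped Redis list for fast access.
    cache.lpush("recent_logs", json.dumps(log))
    cache.ltrim("recent_logs", 0, 999)
    # Persist every log to the logs.txt file.
    with open("logs/logs.txt", "a") as f:
        f.write(json.dumps(log) + "\n")
    print(f"Cached and wrote log: {log}")
```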
- Add log filtering and alerting
- Push logs to S3 or database
- Add Prometheus + Grafana for monitoring
- Web dashboard for log viewing
This was built as a hands-on project to learn:
- Real-time streaming
- Fault-tolerant systems
- Dockerized architecture
- Log handling under heavy load
This project is open-source under the MIT License.
Feel free to fork, explore, and modify this project! 💬 Need help? Just open an issue or ask.