Case study 03 · Lab — Automation

From 1.5 hours of manual sorting to a 2-minute review — fully automated with AI.

A fully autonomous Telegram bot that watches telecom field-crew topics 24/7, sorts thousands of equipment photos with a two-stage Gemini pipeline, and ships a finished report back — without a single click.

Role
Solo — design & build
Domain
Telecom field-ops reporting
Core stack
Python · Gemini · Telethon
Throughput
~10,000 photos / month
01

The impact

Time per site
Before (manual): ~1.5 hours of manual sorting
After (automated): 2-minute QA review
Result: 45× faster

Photos processed / month
Before (manual): Untracked, done by hand
After (automated): ~10,000 photos, sorted automatically
Result: 10k from zero

Sites processed / month
Before (manual): ~75 sites
After (automated): ~90 sites
Result: +20% capacity

AI classification accuracy
Before (manual): N/A, human eyes only
After (automated): ~93% with two-stage AI
Result: 93% accuracy

Human involvement
Before (manual): Full manual sorting
After (automated): Optional QA check
02

The brief

The problem
10,000 photos, all sorted by hand.

Field crews at telecom sites upload 100–120 photos per site into Telegram topics — dismantled equipment, installed equipment, cables, labels. Someone had to manually download everything, identify what's old vs. new equipment in each photo, sort it into folders, compile a report, and send it off. With ~90 sites a month, that's ~10,000 photos requiring human eyes — and every misclassification meant delays and rework.

The solution
A bot that runs the whole job, end to end.

I built a fully autonomous Telegram bot that monitors field-crew topics 24/7. When engineers start uploading, a smart timer tracks activity and waits for them to finish, resetting with each new message. Once the upload window closes, the bot downloads every photo and message from the topic.

The core is a two-stage AI classification pipeline powered by Google Gemini. Stage 1 reads each photo's equipment labels, matches them against the dismantling and installation lists, and assesses visual condition. Stage 2 hunts for outliers — photos whose classification breaks the series pattern — and re-analyzes them using context from neighboring photos, catching mistakes a single pass would miss.
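As a rough illustration of the Stage 2 idea, here is a minimal sketch that flags photos whose Stage 1 label disagrees with a clear majority of their neighbors in upload order. The function name `find_outliers`, the label strings, and the window size are assumptions made for this example, not the bot's actual code. In the real pipeline a flagged photo is sent back to Gemini with neighbor context; this sketch only finds the candidates.

```python
from collections import Counter


def find_outliers(labels, window=2):
    """Return indexes whose label disagrees with a strict majority of neighbors.

    `labels` holds one Stage 1 label per photo, in upload order.
    `window` is how many photos to inspect on each side (an assumed value).
    """
    outliers = []
    for i, label in enumerate(labels):
        # Neighbors on both sides, excluding the photo itself.
        neighbors = labels[max(0, i - window):i] + labels[i + 1:i + 1 + window]
        if not neighbors:
            continue
        top_label, top_count = Counter(neighbors).most_common(1)[0]
        # Flag only when more than half of the neighbors agree on another label.
        if top_label != label and top_count > len(neighbors) / 2:
            outliers.append(i)
    return outliers


# One "installed" photo in the middle of a "dismantled" series gets flagged.
stage1 = ["dismantled", "dismantled", "installed", "dismantled", "dismantled"]
candidates = find_outliers(stage1)
```

Requiring a strict majority keeps the heuristic quiet at the boundary where a series legitimately switches from one category to the other.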

It then sorts everything into structured folders, generates a detailed equipment report, packages it into a ZIP, and sends it back to Telegram — all without a single click.
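The sorting and packaging step could look roughly like the sketch below, which uses only the standard library. The function name `package_site`, the folder names, and the `README.txt` layout are hypothetical; the source only states that photos go into structured folders with a report and are shipped as a ZIP.

```python
import shutil
import zipfile
from pathlib import Path


def package_site(photos, out_dir, site_name):
    """Copy photos into per-category folders, write a README, and zip the result.

    `photos` maps a source file path to its category (used as the folder name).
    Returns the path of the created ZIP archive.
    """
    out_dir = Path(out_dir)
    site_dir = out_dir / site_name
    site_dir.mkdir(parents=True, exist_ok=True)

    counts = {}
    for src, category in photos.items():
        dest = site_dir / category
        dest.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest / Path(src).name)
        counts[category] = counts.get(category, 0) + 1

    # Plain-text report: how many photos landed in each folder.
    lines = [f"Site: {site_name}"]
    lines += [f"{name}: {n} photos" for name, n in sorted(counts.items())]
    (site_dir / "README.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")

    zip_path = out_dir / f"{site_name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(site_dir.rglob("*")):
            if file.is_file():
                # Store paths relative to out_dir so the archive unpacks to site_name/.
                zf.write(file, file.relative_to(out_dir).as_posix())
    return zip_path
```

The ZIP is written next to the site folder rather than inside it, so the archive never tries to include itself.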


03

How it works

01
New message
Activity detected in a UA/UB topic
02
Smart timer
Resets on each new upload
03
Download all
Photos + text from the topic
04 · AI
Stage 1
Classify each photo
05 · AI
Stage 2
Correct outliers using context
06
Sort & package
Folders + README report
07
ZIP & deliver
Sent back to Telegram

04

Key features

Two-stage AI analysis

First pass classifies each photo on its own. A second pass finds outliers that break the series pattern and re-reads them with neighbor context — catching ~7% of misclassifications a single pass misses.

Smart upload timer

Doesn't fire on the first photo. It waits for the crew to finish, resetting the clock on every new message — a configurable 1–60 min window before processing begins.
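The timer behaves like a per-topic debounce. A minimal asyncio sketch of that pattern follows; the class name `UploadTimer`, the callback `on_idle`, and the topic ID are illustrative assumptions. The window is given in seconds here, and the demo uses a fraction of a second where the bot would use its configured 1–60 minutes.

```python
import asyncio


class UploadTimer:
    """Debounce: call `on_idle(topic_id)` once a topic has been quiet for `window` seconds."""

    def __init__(self, window, on_idle):
        self.window = window
        self.on_idle = on_idle
        self._tasks = {}  # topic_id -> pending countdown task

    def touch(self, topic_id):
        """Call on every new message; restarts that topic's countdown."""
        old = self._tasks.get(topic_id)
        if old is not None:
            old.cancel()
        self._tasks[topic_id] = asyncio.create_task(self._countdown(topic_id))

    async def _countdown(self, topic_id):
        await asyncio.sleep(self.window)
        # Reached only if no newer message cancelled this countdown.
        # Forget the task first so a later message cannot cancel processing.
        self._tasks.pop(topic_id, None)
        await self.on_idle(topic_id)


async def demo():
    fired = []

    async def on_idle(topic_id):
        fired.append(topic_id)

    timer = UploadTimer(window=0.2, on_idle=on_idle)
    for _ in range(3):             # three uploads in quick succession
        timer.touch("site-42")
        await asyncio.sleep(0.01)
    await asyncio.sleep(0.5)       # the crew goes quiet
    return fired
```

Running `asyncio.run(demo())` triggers processing once for the topic, after the last upload, rather than once per photo.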

VDO auto-merging

Detects paired VDO topics automatically, pulls their equipment lists and photos, and merges everything into one unified report alongside the main site.

Crash recovery

If the bot restarts mid-job, it reads its persisted state from SQLite and resumes exactly where it left off. No data loss, no duplicates.
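One way to get this kind of resume behavior is to commit a progress marker after every completed stage. The sketch below assumes a hypothetical `jobs` table, stage names, and handler functions; the bot's real schema is not shown in this case study.

```python
import sqlite3

# Assumed stage names, loosely following the pipeline described above.
STAGES = ["download", "analyze", "package", "deliver"]


def open_state(path):
    """Open (or create) the state database in WAL mode."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " topic_id TEXT PRIMARY KEY,"
        " done_stages INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def run_job(conn, topic_id, handlers):
    """Run the remaining stages for a topic, committing progress after each one."""
    conn.execute("INSERT OR IGNORE INTO jobs (topic_id) VALUES (?)", (topic_id,))
    conn.commit()
    (done,) = conn.execute(
        "SELECT done_stages FROM jobs WHERE topic_id = ?", (topic_id,)
    ).fetchone()
    for index in range(done, len(STAGES)):
        handlers[STAGES[index]](topic_id)  # may raise, or the process may die here
        conn.execute(
            "UPDATE jobs SET done_stages = ? WHERE topic_id = ?",
            (index + 1, topic_id),
        )
        conn.commit()
```

After a restart, calling `run_job` again for the same topic skips every stage that was already committed. A stage that was interrupted halfway runs again from its start, so in this design each stage needs to be safe to repeat.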

Parallel processing

A Producer–Consumer architecture with 2–8 configurable workers, so multiple sites move through the pipeline at the same time.
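A minimal sketch of the producer-consumer pattern with `asyncio.Queue`, assuming each site is handled by a blocking `process` function that is pushed onto a thread so the event loop stays free. The names `run_pool`, `worker`, and `process` are illustrative, not taken from the bot.

```python
import asyncio


async def worker(queue, process, results):
    """Consumer: pull sites off the queue until cancelled."""
    while True:
        site = await queue.get()
        try:
            # Blocking AI and file work runs in a thread, not on the event loop.
            results.append(await asyncio.to_thread(process, site))
        finally:
            queue.task_done()


async def run_pool(sites, process, workers=4):
    """Producer side: enqueue every site, then wait for the workers to drain the queue."""
    queue = asyncio.Queue()  # FIFO
    results = []
    tasks = [
        asyncio.create_task(worker(queue, process, results))
        for _ in range(workers)
    ]
    for site in sites:
        queue.put_nowait(site)
    await queue.join()       # returns once every queued site is marked done
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Results arrive in completion order, not queue order. In this sketch an exception inside `process` stops that one worker, so a long-running bot would likely catch and log errors inside the loop instead.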

24/7 monitoring

A heartbeat file keeps Docker healthchecks honest, and periodic Telegram reports surface live metrics — processed, errors, timing.
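The heartbeat idea can be sketched in a few lines: the bot writes a timestamp on every loop, and a separate check treats a stale file as unhealthy. The function names and the 120-second threshold are assumptions for illustration.

```python
import time
from pathlib import Path


def beat(path):
    """Record the current time; call this from the bot's main loop."""
    Path(path).write_text(str(time.time()), encoding="utf-8")


def is_healthy(path, max_age=120.0):
    """Return True if the heartbeat file was updated within `max_age` seconds."""
    try:
        last = float(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing or unreadable file counts as unhealthy
    return time.time() - last <= max_age
```

A container healthcheck could run a small script that exits non-zero when `is_healthy` returns False, which catches a process that is still alive but no longer making progress.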


05

Under the hood

Tech stack
Python 3.11 · Google Gemini Flash · Telethon · SQLite (WAL mode) · Docker · GitHub Actions CI/CD · Pillow
Design principles
  • Event-driven: reacts to Telegram messages in real time.
  • Async / thread hybrid: async for Telegram I/O, a thread pool for AI and file work.
  • WAL-mode SQLite: readers and the single writer run concurrently without blocking each other.
  • Backoff retry: handles API rate limits and timeouts gracefully.
  • LRU cache: topic titles cached (500 entries, 1-hour TTL).
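As an example of the backoff-retry principle, here is a minimal sketch of exponential backoff with jitter. The function name, the default delays, and the choice of retryable exceptions are assumptions; a real Gemini or Telegram client raises its own error types, which would go in `retry_on`.

```python
import random
import time


def retry_with_backoff(call, attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so the waiting can be replaced in tests; production code would leave it at the default.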
work_auto / pipeline
Telegram NewMessage event
Topic timer: resets on new uploads
Task queue (FIFO)
Worker pool (2–8 parallel)
Pipeline: Download → AI analysis → Packaging → Delivery
SQLite: state persistence + crash recovery

Got a back office that runs on copy-paste?

This started as the same kind of repetitive, human-eyes-only work. I scope it, build it end to end, and hand back something that runs itself.