From 1.5 hours of manual sorting to a 2-minute review — fully automated with AI.
A fully autonomous Telegram bot that watches telecom field-crew topics 24/7, sorts thousands of equipment photos with a two-stage Gemini pipeline, and ships a finished report back — without a single click.
The impact
The brief
The problem
10,000 photos, all sorted by hand.
Field crews at telecom sites upload 100–120 photos per site into Telegram topics — dismantled equipment, installed equipment, cables, labels. Someone had to manually download everything, identify what's old vs. new equipment in each photo, sort it into folders, compile a report, and send it off. With ~90 sites a month, that's ~10,000 photos requiring human eyes — and every misclassification meant delays and rework.
The solution
A bot that runs the whole job, end to end.
I built a fully autonomous Telegram bot that monitors field-crew topics 24/7. When engineers start uploading, a smart timer tracks activity and waits for them to finish, resetting with each new message. Once the upload window closes, the bot downloads every photo and message from the topic.
The core is a two-stage AI classification pipeline powered by Google Gemini. Stage 1 reads each photo's equipment labels, matches them against the dismantling and installation lists, and assesses visual condition. Stage 2 hunts for outliers — photos whose classification breaks the series pattern — and re-analyzes them using context from neighboring photos, catching mistakes a single pass would miss.
It then sorts everything into structured folders, generates a detailed equipment report, packages it into a ZIP, and sends it back to Telegram — all without a single click.
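A minimal sketch of the sort-and-package step, using only the Python standard library. The category names, the `package_site` helper, and the folder layout are assumptions for illustration; the real folder structure is not described here.

```python
import shutil
from pathlib import Path

# Hypothetical category names; the real folder layout may differ.
CATEGORIES = ("dismantled", "installed", "unclassified")


def package_site(photos: dict[Path, str], out_dir: Path, site_name: str) -> Path:
    """Copy each photo into a per-category folder, then zip the whole site."""
    site_root = out_dir / site_name
    for category in CATEGORIES:
        (site_root / category).mkdir(parents=True, exist_ok=True)

    for photo_path, category in photos.items():
        # Anything with an unknown label lands in "unclassified".
        target = category if category in CATEGORIES else "unclassified"
        shutil.copy2(photo_path, site_root / target / photo_path.name)

    # make_archive appends ".zip" to the base name and returns the archive path.
    archive = shutil.make_archive(str(site_root), "zip", root_dir=str(site_root))
    return Path(archive)
```

The returned path is what a bot would then upload back to the topic as a document.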
How it works
Key features
Two-stage AI analysis
First pass classifies each photo on its own. A second pass finds outliers that break the series pattern and re-reads them with neighbor context — catching ~7% of misclassifications a single pass misses.
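One way the outlier hunt could work is a sliding-window majority vote over the stage-1 labels. This is a sketch under that assumption, not the production logic: the `find_outliers` name, the `radius` parameter, and the strict-majority rule are all illustrative.

```python
from collections import Counter


def find_outliers(labels: list[str], radius: int = 2) -> list[int]:
    """Return indices whose label disagrees with a strict majority of neighbors."""
    outliers = []
    for i, label in enumerate(labels):
        lo, hi = max(0, i - radius), min(len(labels), i + radius + 1)
        neighbors = labels[lo:i] + labels[i + 1:hi]
        if not neighbors:
            continue
        majority, count = Counter(neighbors).most_common(1)[0]
        # Flag only when more than half of the neighbors agree on another label.
        if majority != label and count > len(neighbors) / 2:
            outliers.append(i)
    return outliers


series = ["old", "old", "old", "new", "old", "old", "old"]
print(find_outliers(series))  # [3]
```

Flagged photos would then go back to the model together with their neighbors as context. Note that a heuristic like this also fires near a genuine transition between series, which is why the second pass re-analyzes rather than simply overwriting the label.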
Smart upload timer
Doesn't fire on the first photo. It waits for the crew to finish, resetting the clock on every new message — a configurable 1–60 min window before processing begins.
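The timer is essentially a per-topic debounce. Here is a minimal `asyncio` sketch of that idea; the `UploadTimer` class, its `touch` method, and the `on_quiet` callback are hypothetical names, and `window` would be 60 to 3600 seconds for the 1–60 min range.

```python
import asyncio


class UploadTimer:
    """Debounce: run `on_quiet` once a topic has been silent for `window` seconds."""

    def __init__(self, window: float, on_quiet):
        self.window = window
        self.on_quiet = on_quiet  # async callable that takes the topic id
        self._tasks: dict[int, asyncio.Task] = {}

    def touch(self, topic_id: int) -> None:
        """Call on every new message; restarts the countdown for that topic."""
        old = self._tasks.get(topic_id)
        if old is not None and not old.done():
            old.cancel()
        self._tasks[topic_id] = asyncio.create_task(self._countdown(topic_id))

    async def _countdown(self, topic_id: int) -> None:
        await asyncio.sleep(self.window)
        # Forget the task before processing so a late message starts a fresh
        # countdown instead of cancelling work that is already running.
        self._tasks.pop(topic_id, None)
        await self.on_quiet(topic_id)
```

`touch` has to be called from inside a running event loop, for example from the bot's message handler.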
VDO auto-merging
Detects paired VDO topics automatically, pulls their equipment lists and photos, and merges everything into one unified report alongside the main site.
Crash recovery
If the bot restarts mid-job, it resumes exactly where it left off, using state persisted in SQLite. No data loss, no duplicates.
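A sketch of how per-photo state in SQLite can make a restart safe. The `photo_state` table, its columns, and the helper names are assumptions for illustration; the actual schema is not shown here.

```python
import sqlite3


def open_state(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photo_state ("
        " photo_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, photo_ids: list[str]) -> None:
    # INSERT OR IGNORE keeps a restart from creating duplicate rows.
    conn.executemany(
        "INSERT OR IGNORE INTO photo_state (photo_id) VALUES (?)",
        [(pid,) for pid in photo_ids],
    )
    conn.commit()


def mark_done(conn: sqlite3.Connection, photo_id: str) -> None:
    conn.execute(
        "UPDATE photo_state SET status = 'done' WHERE photo_id = ?", (photo_id,)
    )
    conn.commit()


def pending(conn: sqlite3.Connection) -> list[str]:
    """Photos still to process, in the order they were first enqueued."""
    rows = conn.execute(
        "SELECT photo_id FROM photo_state WHERE status = 'pending' ORDER BY rowid"
    )
    return [row[0] for row in rows]
```

After a restart, the bot would call `pending` and continue from the first unfinished photo. Re-enqueueing the same topic is harmless because finished rows keep their `done` status.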
Parallel processing
A Producer–Consumer architecture with 2–8 configurable workers, so multiple sites move through the pipeline at the same time.
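A compact sketch of the producer–consumer shape with `asyncio.Queue`, with blocking work pushed to threads via `asyncio.to_thread` (Python 3.9+). `process_site` is a placeholder for the real per-site work, and the function names are illustrative.

```python
import asyncio


def process_site(site_id: str) -> str:
    """Placeholder for the blocking per-site work (AI calls, file I/O)."""
    return f"{site_id}: done"


async def worker(queue: asyncio.Queue, results: list[str]) -> None:
    while True:
        site_id = await queue.get()
        try:
            # Run blocking work in a thread so the event loop stays responsive.
            results.append(await asyncio.to_thread(process_site, site_id))
        finally:
            queue.task_done()


async def run_pipeline(site_ids: list[str], workers: int = 4) -> list[str]:
    if not 2 <= workers <= 8:
        raise ValueError("workers must be between 2 and 8")
    queue: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    tasks = [asyncio.create_task(worker(queue, results)) for _ in range(workers)]
    for site_id in site_ids:  # the producer side
        queue.put_nowait(site_id)
    await queue.join()  # wait until every queued site has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Results arrive in completion order, not submission order, so anything that depends on ordering should key on the site id.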
24/7 monitoring
A heartbeat file keeps Docker healthchecks honest, and periodic Telegram reports surface live metrics — processed, errors, timing.
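The heartbeat idea fits in a few lines. This sketch assumes the bot writes a Unix timestamp to a file and a separate check reads it back; the function names and the 120-second default are illustrative, not the deployed values.

```python
import time
from pathlib import Path


def beat(path: Path) -> None:
    """Write the current Unix time; call this periodically from the main loop."""
    path.write_text(str(time.time()))


def is_healthy(path: Path, max_age: float = 120.0) -> bool:
    """Return True if the heartbeat exists and is newer than `max_age` seconds."""
    try:
        last = float(path.read_text())
    except (FileNotFoundError, ValueError):
        return False
    return time.time() - last <= max_age
```

A Docker `HEALTHCHECK` command could run a small script that calls `is_healthy` and exits non-zero when it returns `False`, so a hung bot is reported as unhealthy even though its process is still alive.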
Under the hood
- Event-driven: Reacts to Telegram messages in real time.
- Async / thread hybrid: Async for Telegram I/O, a thread pool for AI & file work.
- WAL-mode SQLite: Readers and the writer work concurrently without blocking each other.
- Backoff retry: Handles API rate limits and timeouts gracefully.
- LRU cache: Topic titles cached (500 entries, 1-hour TTL).
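As an illustration of the last item, here is a small LRU cache with per-entry expiry built on `OrderedDict`. The `TTLCache` class is a hypothetical stand-in, not the bot's actual code; only the defaults (500 entries, 1-hour TTL) come from the list above.

```python
import time
from collections import OrderedDict


class TTLCache:
    """LRU cache whose entries also expire `ttl` seconds after being stored."""

    def __init__(self, max_size: int = 500, ttl: float = 3600.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value) -> None:
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry
```

On a miss, the caller would fetch the topic title from Telegram and `put` it back, so repeated lookups for an active topic cost no API calls for up to an hour.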
Got a back office that runs on copy-paste?
This started as the same kind of repetitive, human-eyes-only work. I scope it, build it end to end, and hand back something that runs itself.