Building an IoT Telemetry Pipeline with ThingsBoard: Architecture, Trade-offs, and Lessons Learned
Summary
This article documents my journey implementing an end-to-end IoT data pipeline using ThingsBoard Community Edition (CE), MQTT, and Protocol Buffers (Protobuf). It covers the architectural decisions, the practical steps to make binary telemetry usable, dashboard design principles, security and redaction practices for sharing work publicly, and the lessons I learned moving from a local prototype toward a production-ready approach.
Security note: All values in this post (broker address, topics, tokens, schema, and identifiers) are intentionally anonymized using placeholders like <BROKER_ADDRESS> and <ACCESS_TOKEN>.
Background & Objectives
Our devices publish telemetry over MQTT using Protobuf payloads. Protobuf keeps messages small and efficient, but the binary format is not directly consumable by dashboards or analytics tools. The goal of this work was to:
Receive device data from an existing MQTT broker.
Decode Protobuf payloads into JSON.
Ingest the normalized telemetry into ThingsBoard CE.
Build operator-friendly dashboards for visibility and troubleshooting.
Do all of the above while following sound security and redaction practices so we can document and share our work safely.
High-Level Architecture
Components
Devices → publish telemetry to <BROKER_ADDRESS>:<PORT> on topics like <DEVICE_TOPIC_ROOT>/<DEVICE_ID>.
Decoder Service → subscribes to device topics, decodes Protobuf → JSON, and pushes normalized telemetry to ThingsBoard.
ThingsBoard CE → device registry, timeseries storage, rule engine, and dashboards.
Operators → view dashboards, check health, and analyze history.
Why a Decoder Service?
Decouples transport concerns (broker) from platform concerns (ThingsBoard).
Central place to validate schemas, handle versioning, enrich messages, and implement backoff/retry.
Lets device firmware remain lean while the platform evolves.
Why Protobuf? (And What It Implies)
Pros
Compact binary format → reduced bandwidth and latency.
Strongly typed schemas → validation and forward/backward compatibility.
Well-supported across languages and tooling.
Implications
Human visibility requires decoding; dashboards prefer JSON.
You need clear schema versioning (e.g., message DeviceDataV1 { ... }).
Error handling becomes critical: invalid or out-of-date payloads must fail gracefully.
Practical tip: Introduce a top-level field such as schema_version and validate it in the Decoder Service before processing.
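For example, a guardrail of this kind in the Decoder Service might look like the sketch below; the field name schema_version and the supported-version set are illustrative, not the production schema.
# Illustrative schema-version guardrail in the Decoder Service
SUPPORTED_SCHEMA_VERSIONS = {1}

def check_schema_version(msg) -> None:
    # Reject (or route to quarantine) anything this Decoder was not built to handle
    if msg.schema_version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported schema_version: {msg.schema_version}")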
MQTT Design Considerations
Topic Strategy: Use a predictable hierarchy for devices, e.g.
<DEVICE_TOPIC_ROOT>/<DEVICE_ID>/telemetry
<DEVICE_TOPIC_ROOT>/<DEVICE_ID>/attributes
Keep it stable; topic churn leads to fragile consumers.
QoS: Choose QoS to match business needs (e.g., QoS 0 for ephemeral metrics; QoS 1 for at-least-once delivery of critical data).
Retain Flags: Use sparingly; good for last-known state, dangerous for high-frequency metrics.
Security: TLS, client auth, and scoped credentials. Keep secrets out of code (env vars/secret stores).
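To tie these points together, here is a minimal subscriber sketch (paho-mqtt 1.x style shown); the broker address, topic root, and environment-variable names are placeholders, not our production configuration.
# Minimal subscriber sketch: TLS, credentials from the environment, QoS 1 on a wildcard topic
import os
import paho.mqtt.client as mqtt

def on_message(client, userdata, message):
    # Hand the raw bytes to the Decoder pipeline (stubbed here)
    print(message.topic, len(message.payload), "bytes")

client = mqtt.Client()
client.tls_set()  # TLS with the system CA bundle
client.username_pw_set(os.environ["MQTT_USERNAME"], os.environ["MQTT_PASSWORD"])
client.on_message = on_message
client.connect("<BROKER_ADDRESS>", int(os.environ.get("MQTT_PORT", "8883")))
client.subscribe("<DEVICE_TOPIC_ROOT>/+/telemetry", qos=1)  # at-least-once for critical data
client.loop_forever()  # for production, re-subscribe in on_connect to survive reconnects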
ThingsBoard in the Stack
I used ThingsBoard Community Edition on Docker for local prototyping. It provided:
Device registry & tokens (each device has an <ACCESS_TOKEN>).
Timeseries storage and query APIs.
Rule Engine for routing, filtering, and alerting.
Dashboards to visualize realtime and historical telemetry.
CE was ideal for exploration; we can move the same patterns to the Professional Edition (PE) later for multi-tenant scale, advanced features, and enterprise workflows.
Data Modeling in ThingsBoard
Telemetry: time-series metrics (e.g., temperature, humidity, lqi, battery_pct).
Server-scope attributes: device metadata the platform “owns” (e.g., location, firmware_channel).
Shared attributes: app-managed configuration pushed to devices (e.g., thresholds).
Naming principles
Prefer lower_snake_case and stable keys.
Use unit suffixes when ambiguity is possible (e.g., temperature_c, pressure_kpa).
Keep payloads flat for widget compatibility; nest only when necessary.
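As an illustration, a flat payload that follows these conventions could look like this (keys and values are examples, not production data):
# Flat, lower_snake_case keys with unit suffixes; widget-friendly and stable
json_telemetry = {
    "temperature_c": 21.4,
    "humidity_pct": 48.0,
    "pressure_kpa": 101.3,
    "lqi": 212,
    "battery_pct": 87,
}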
The Decoder Service: Turning Protobuf into JSON
The Decoder Service subscribes to <DEVICE_TOPIC_ROOT>/+/telemetry, decodes Protobuf using the known schema, validates/version-checks, and transforms it into a stable JSON representation. It then uses the ThingsBoard device MQTT API:
v1/devices/me/telemetry
Key behaviors
Schema guardrails: reject or quarantine messages with unknown versions.
Type safety: convert enums to strings (or consistent integers); normalize timestamps to UTC.
Idempotency: dedupe if the broker re-delivers.
Observability: log decode errors, export failure counts as metrics, and surface stats (e.g., per-device error rate).
Safe snippet (placeholders):
# Conceptual sketch using paho-mqtt; decode_protobuf() and normalize() are stand-in helpers
import json
import paho.mqtt.publish as publish

msg = decode_protobuf(payload, schema_version="<EXPECTED_VERSION>")
json_telemetry = normalize(msg)  # keys, units, enums normalized (sketched below)

# ThingsBoard device MQTT API: the device access token is used as the MQTT username
publish.single(
    topic="v1/devices/me/telemetry",
    payload=json.dumps(json_telemetry),
    hostname="<THINGSBOARD_HOST>",
    port=1883,
    auth={"username": "<ACCESS_TOKEN>"},
)
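For completeness, a hypothetical normalize() is sketched below. The message fields (sampled_at, temperature_c, humidity_pct, lqi, radio_state) and the RadioState enum are invented for illustration; the ts/values shape matches ThingsBoard's timestamped telemetry upload format.
# Hypothetical normalize(): flat keys, enums as strings, UTC millisecond timestamps
def normalize(msg) -> dict:
    # protobuf Timestamp (seconds + nanos) -> UTC milliseconds
    ts_ms = msg.sampled_at.seconds * 1000 + msg.sampled_at.nanos // 1_000_000
    return {
        "ts": ts_ms,
        "values": {
            "temperature_c": msg.temperature_c,
            "humidity_pct": msg.humidity_pct,
            "lqi": msg.lqi,
            "radio_state": RadioState.Name(msg.radio_state),  # generated enum value -> string name
        },
    }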
Rule Engine: When to Use It
While the Decoder Service handles message translation, ThingsBoard’s Rule Engine shines at:
Routing telemetry to storage, alarms, or external systems.
Enrichment based on attributes (e.g., thresholds).
Triggering actions on state changes (e.g., send notification if lqi drops below a threshold).
Pattern: Keep the Decoder focused on format/validation and the Rule Engine focused on business logic. This separation keeps both sides maintainable.
Dashboard Design Principles
My dashboard goals were clarity and actionability:
At-a-glance status: a KPI strip (online device count, error rate, last data time).
Trend charts: time-series for temperature, humidity, and signal quality (lqi).
Drilldown: device detail view (recent telemetry, attributes, last errors).
Context: embed help tooltips describing what each metric means and the expected range.
Widget tips
Align sampling intervals with device publish rates.
Use consistent time windows (e.g., last 1h, 24h, 7d).
Pick thresholds that match real operating constraints and label them on charts.
Operations & Reliability
Containerization: Run ThingsBoard CE and the Decoder in containers for reproducible environments.
Health checks: Liveness/readiness probes on the Decoder; restart on sustained decode failures.
Backpressure: If the broker bursts, buffer to disk/queue or lower the consumption rate gracefully (see the sketch after this list).
Replay: For critical use cases, keep a raw topic or store payloads to re-decode with new schemas.
Versioning: Roll out schema changes incrementally; support dual-read paths during transitions.
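A minimal sketch of the backpressure idea above, assuming a bounded in-memory queue with a JSON-lines disk spool (the spool path and queue size are placeholders):
# Bounded in-memory queue; spill to a disk spool when the broker bursts
import json
import queue

pending = queue.Queue(maxsize=10_000)

def enqueue(telemetry: dict, spool_path: str = "<SPOOL_DIR>/overflow.jsonl") -> None:
    try:
        pending.put_nowait(telemetry)
    except queue.Full:
        # Memory budget exceeded: append for later replay instead of dropping data
        with open(spool_path, "a", encoding="utf-8") as spool:
            spool.write(json.dumps(telemetry) + "\n")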
Security, Privacy & Redaction (for Sharing Work)
When writing posts, docs, or demos, apply the following redaction policy:
Always replace
Hostnames, IPs, ports → <BROKER_ADDRESS>, <PORT>, <THINGSBOARD_HOST>.
Tokens, credentials, secrets → <ACCESS_TOKEN>, <USERNAME>, <PASSWORD>.
Device IDs, MACs, serials → <DEVICE_ID>, <MAC>, <SERIAL>.
Exact Protobuf schema → show representative examples, not production schemas.
Screenshot hygiene
Blur tenant names, device names, tokens, and unique IDs.
Avoid showing message contents that contain proprietary fields.
Logging hygiene
Don’t print tokens or raw payloads in public logs.
Mask secrets at source (logger filters, structured logging with redaction).
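One way to mask secrets at the source, sketched with Python's standard logging module (the token pattern is illustrative and should be tuned to your actual token format):
# Illustrative logging filter that masks anything resembling an access token
import logging
import re

TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]{20,}")  # placeholder pattern

class RedactTokens(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Keep the record, but scrub token-like strings before it reaches any handler
        record.msg = TOKEN_PATTERN.sub("<REDACTED>", str(record.msg))
        return True

logging.getLogger("decoder").addFilter(RedactTokens())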
Blog disclosure note
Add a one-line banner in the post:
“All configuration values, topics, tokens, and schemas are placeholders for illustration.”
Performance & Scaling Notes
Protobuf vs JSON: On constrained networks, Protobuf’s size and CPU profile help. The cost is operational complexity in schema management and decoding.
Batching: If devices can batch multiple samples per message, do it; this amortizes overhead and reduces broker load (see the example after this list).
Sampling strategy: Be intentional about publish rates; avoid per-second telemetry if minute-level resolution is sufficient.
Cold paths: If analytics need richer context, export from ThingsBoard to a data warehouse for longer retention and offline analysis.
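For reference, ThingsBoard's telemetry upload also accepts an array of timestamped entries, so a device-side batch can be forwarded in a single publish (timestamps and values below are purely illustrative):
# One MQTT publish carrying several samples
batched_telemetry = [
    {"ts": 1700000000000, "values": {"temperature_c": 21.4, "humidity_pct": 48.0}},
    {"ts": 1700000060000, "values": {"temperature_c": 21.5, "humidity_pct": 47.8}},
]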
Anti-Patterns I Avoid Now
Mixing transformation and business logic in device firmware. Keep firmware simple; evolve services and rules instead.
Publishing unbounded key sets. Dynamic keys break dashboards and queries. Standardize keys early.
Hiding schema changes. Always version, document, and announce schema updates.
Relying solely on retained messages for state. Great for last-known, risky for fast-changing telemetry.
What Worked Well
CE first: Experiment freely, then graduate to PE when requirements (tenancy, scale, features) justify it.
A single Decoder choke point: Centralized schema validation and observability.
Consistent naming: Simple, flat, well-documented telemetry keys made dashboards trivial to build.
What I’d Improve Next
Automated schema tests: CI that feeds canned binary payloads through the Decoder to catch regressions (sketched after this list).
Alert tuning: Iteratively adjust thresholds to reduce noise and highlight true anomalies.
Fleet metadata: Use attributes to encode hardware revision, firmware, and install location for richer context.
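A sketch of what such a CI test could look like, reusing the conceptual decode_protobuf() and normalize() helpers from earlier; the fixture path and asserted key are placeholders.
# Regression test (pytest style): feed a canned binary payload through decode + normalize
def test_decode_canned_payload():
    with open("fixtures/<DEVICE_ID>_v1.bin", "rb") as fixture:
        payload = fixture.read()
    msg = decode_protobuf(payload, schema_version="<EXPECTED_VERSION>")
    telemetry = normalize(msg)
    assert "temperature_c" in telemetry["values"]  # stable keys survive schema changes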
Conclusion
ThingsBoard CE, combined with MQTT and Protobuf, gave me a flexible foundation to receive, normalize, store, and visualize IoT telemetry. The core insight was to separate responsibilities:
Devices focus on publishing efficient payloads.
The Decoder focuses on format translation and validation.
ThingsBoard focuses on storage, rules, and visualization.
With clear schemas, a stable topic strategy, and disciplined redaction, this approach scales from a laptop prototype to an on-premises deployment—without exposing sensitive details.
What’s Next?
Stay tuned as I dive deeper into IIoT, cloud, and AI integrations to bring more impactful solutions to life.