Expert Guidance on How to Choose the Right MQTT Broker


The internet of things (IoT) runs on billions of connected devices, opening endless opportunities for your business to innovate and improve. But it all falls apart without a robust communication system. MQTT (message queuing telemetry transport) brokers quietly do the heavy lifting, making sure your devices, applications, and cloud services can easily communicate.

From smart cities and industrial automation to public safety and healthcare, MQTT brokers connect IoT infrastructure with cloud platforms. They make it easy to securely send data, run analytics, and make real-time decisions.

However, choosing the wrong broker can unravel even your best-laid IoT plans. Issues like poor scalability, hidden integration problems, rising costs, and downtime could derail your efforts — or even force you to abandon your strategy entirely.

This isn’t just hypothetical — we’ve seen it happen. Companies often run into problems because they underestimate the challenges.

Most ecosystems have unique quirks, workarounds, and design constraints that complicate adoption. If you’ve looked at popular brokers such as EMQX, Mosquitto, VerneMQ, or HiveMQ, you’ve likely seen the usual comparisons: feature lists, isolated benchmarks, and basic functionality overviews. What’s missing are real-world insights into how to retrofit these brokers into your older systems. This article covers those nuances, giving architects and developers the context they’ve been missing.

5 FACTORS THAT SEPARATE SUCCESS FROM SETBACKS

1. Authorization mechanism and access control

Have you considered how your broker’s permission model might affect scalability later? While most brokers support multiple authentication and authorization models, including certificate-based authentication, their capabilities differ significantly. Important reminders:

Replacing a cloud provider’s MQTT broker with a third-party one is a challenge. Most cloud brokers generate client certificates, but most third-party brokers don’t. This means you’ll need to find a PKI (public key infrastructure) provider or build your own, even if it’s a simple one. Clients can technically use self-signed certificates, but doing so often weakens security. Either way, your engineering team will have to handle a migration effort.
If you’re using shared client credentials, where multiple clients share the same certificate, make sure your broker can map certificate fields to usernames. You’ll still need unique client IDs, so it’s wise to create a naming scheme for them. This will help you quickly identify which client, among hundreds sharing the certificate, has issues.
Not all brokers let you assign granular, client-specific permissions. Some make it worse by limiting how many permission groups you can create. This makes it impossible to follow the principle of least privilege.
How a broker communicates authentication and authorization failures to clients matters. Does it return explicit error codes, or does it silently reject authorization attempts? The latter might sharpen your engineers’ debugging skills but will drastically increase your time-to-resolution, even for minor issues.
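The shared-certificate point above can be sketched in a few lines. This is a minimal illustration, not any broker's API: it assumes the broker hands you the certificate subject as a plain "CN=...,O=..." string, and the function names, the CN-to-username rule, and the "username-site-serial" client ID scheme are all hypothetical.

```python
def parse_subject(subject: str) -> dict[str, str]:
    """Split a simple 'CN=...,O=...' subject string into a field dict.

    Assumption: values contain no escaped commas; a real parser must handle those.
    """
    fields = {}
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key and value:
            fields[key.upper()] = value
    return fields


def username_from_cert(subject: str, field: str = "CN") -> str:
    """Map one certificate field (CN by default) to the MQTT username."""
    fields = parse_subject(subject)
    if field not in fields:
        raise ValueError(f"certificate subject has no {field} field")
    return fields[field]


def make_client_id(username: str, site: str, serial: int) -> str:
    """Build a unique client ID under a hypothetical naming scheme.

    Assumption: site names contain no hyphens, so the ID can be split again.
    """
    return f"{username}-{site}-{serial:06d}"


def describe_client(client_id: str) -> dict[str, str]:
    """Recover the parts of a client ID produced by make_client_id."""
    username, site, serial = client_id.rsplit("-", 2)
    return {"username": username, "site": site, "serial": serial}


subject = "CN=packaging-line,O=Example Corp,C=US"
username = username_from_cert(subject)
client_id = make_client_id(username, "plant7", 42)
print(username, client_id, describe_client(client_id))
```

With a scheme like this, a log entry that only shows the client ID still tells you which site and unit to look at, even when hundreds of clients present the same certificate.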

The gist: You should outline the current or desired authentication and authorization flow. Validate its feasibility with the candidate broker. Don't rely on limited tests to estimate migration efforts. Check compatibility with the clients intended for production, such as Litmus, Ignition, or a basic MQTT client.
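For the failure-reporting point raised above, it helps to know what an explicit rejection looks like on the wire. The sketch below lists the CONNACK codes defined in the MQTT 3.1.1 specification and a subset of the MQTT 5.0 reason codes. The helper name and the "auth failure" label are illustrative assumptions, and whether a given broker actually sends these codes is exactly what you need to test.

```python
# CONNACK return codes from the MQTT 3.1.1 specification.
CONNACK_V311 = {
    0: "connection accepted",
    1: "unacceptable protocol version",
    2: "identifier rejected",
    3: "server unavailable",
    4: "bad user name or password",
    5: "not authorized",
}

# A subset of CONNACK reason codes from the MQTT 5.0 specification.
CONNACK_V5 = {
    0x00: "success",
    0x85: "client identifier not valid",
    0x86: "bad user name or password",
    0x87: "not authorized",
    0x88: "server unavailable",
    0x8A: "banned",
}

# Reasons this sketch treats as authentication or authorization failures.
AUTH_RELATED = {"bad user name or password", "not authorized", "banned"}


def explain_connack(code: int, protocol: int = 5) -> str:
    """Turn a CONNACK code into a short, log-friendly explanation."""
    table = CONNACK_V5 if protocol == 5 else CONNACK_V311
    reason = table.get(code, "unknown code")
    kind = "auth failure" if reason in AUTH_RELATED else "other"
    return f"{kind}: {reason} (code {code:#04x})"


print(explain_connack(0x87))            # MQTT 5.0 client
print(explain_connack(5, protocol=4))   # MQTT 3.1.1 client
```

A broker that simply drops the connection instead of sending one of these codes leaves your clients with nothing to log, which is where the extra debugging time comes from.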

2. Native vs. adapted MQTT brokers

Are you confident your broker fully adheres to the MQTT standards critical to your operations? Not all MQTT brokers are built for your needs. Some, like ActiveMQ, NATS, and RabbitMQ, only add MQTT support through plugins, connectors, or configuration adjustments. But that doesn’t mean they match the performance or protocol compliance you’d get from a purpose-built MQTT broker. For example:

NATS ignores MQTT’s clean session flags and persists all client sessions. This significantly impacts broker performance. Retained messages in clustering mode are delivered on a best-effort basis. Only QoS 0 guarantees are supported, which limits reliability.
ActiveMQ (including Amazon MQ for ActiveMQ) limits you to 250 unique users and needs a broker restart for most configuration changes. This reduces flexibility.
RabbitMQ supports QoS levels 0 and 1. However, it does not support shared subscriptions. This is a major limitation in large-scale deployments, especially when you integrate with data processing pipelines.
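To see why missing shared subscriptions hurt at scale, consider how they change delivery. In MQTT 5.0 a shared subscription uses a filter of the form `$share/<ShareName>/<filter>`, and the broker delivers each matching message to one member of the group instead of to every subscriber. The sketch below simulates that with simple round-robin. The function names are hypothetical, and real brokers may use other distribution strategies.

```python
from itertools import cycle


def parse_shared_filter(topic_filter: str):
    """Return (share_name, filter) for '$share/<name>/<filter>', else (None, filter)."""
    if not topic_filter.startswith("$share/"):
        return None, topic_filter
    parts = topic_filter.split("/", 2)
    if len(parts) != 3 or not parts[1] or not parts[2]:
        raise ValueError(f"malformed shared subscription: {topic_filter!r}")
    return parts[1], parts[2]


def distribute(messages, consumers, shared: bool):
    """Count deliveries per consumer for shared vs. ordinary subscriptions."""
    counts = {consumer: 0 for consumer in consumers}
    if shared:
        # Each message goes to exactly one group member (round-robin here).
        for _msg, consumer in zip(messages, cycle(consumers)):
            counts[consumer] += 1
    else:
        # Without shared subscriptions, every subscriber gets every message.
        for _msg in messages:
            for consumer in consumers:
                counts[consumer] += 1
    return counts


workers = ["worker-a", "worker-b", "worker-c"]
print(parse_shared_filter("$share/pipeline/plant/+/temp"))
print(distribute(range(9), workers, shared=True))
print(distribute(range(9), workers, shared=False))
```

Without the shared form, adding consumers to a processing pipeline multiplies the work instead of dividing it, so you end up building the load distribution yourself.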

The gist: Just because something is supported doesn’t mean it’s fully compatible. Start by figuring out your key drivers for migration or adoption, and use those to evaluate your options before making a decision.

3. Connection ramp-up and scalability

Does your current broker’s authentication mechanism meet your environment’s security demands during load spikes? Beyond raw connection counts and throughput, your MQTT broker must quickly establish connections and process messages during high-load events. Imagine an IoT system managing hundreds of thousands of connected devices. If a cluster node upgrade causes temporary disconnections, your broker needs to accept the resulting wave of reconnections quickly to keep services running smoothly. Potential challenges to remember:

Misconfigured network infrastructure or listeners can slow your broker down. Even if it's built to handle hundreds of thousands of clients simultaneously, it might only onboard a few hundred per second.
If your authentication mechanisms rely on a third-party service via webhook for permissions, client connections will slow down whenever that external service lacks capacity, hits API limits, or faces bandwidth issues.
Managed services often set rate limits for publishing and consuming due to resource sharing or optimization. If you exceed these limits, your service will be disrupted, or your MQTT clients will have to throttle themselves constantly to stay within the boundaries.
Limits exist everywhere, whether to preserve resources or to guarantee reliability and resiliency. Features like retained messages need persistent storage and depend on the database engine behind it. Their limits come from how much data that engine can store, how well it performs with many messages, or both. The same goes for in-memory buffers and the size of routing tables. When vendors claim “no limits,” it usually means “high enough for common use patterns,” as their support team may explain when issues arise.
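The ramp-up problem described above is easy to put into numbers. The sketch below is back-of-the-envelope arithmetic only: the client count and the per-second rates are made-up figures, and the function names are hypothetical. It assumes the slowest stage in the connection path (listener or authentication webhook) sets the effective onboarding rate.

```python
def effective_rate(listener_rate: float, auth_rate: float) -> float:
    """The slowest stage in the connect path caps how fast clients get onboarded."""
    return min(listener_rate, auth_rate)


def reconnect_time_seconds(clients: int, accepts_per_second: float) -> float:
    """Time for all clients to reconnect at a steady acceptance rate."""
    if accepts_per_second <= 0:
        raise ValueError("acceptance rate must be positive")
    return clients / accepts_per_second


clients = 300_000          # assumed fleet size
listener_rate = 5_000      # assumed connections/second the listener can accept
webhook_rate = 400         # assumed authorizations/second the webhook can serve

rate = effective_rate(listener_rate, webhook_rate)
seconds = reconnect_time_seconds(clients, rate)
print(f"effective rate: {rate}/s, full reconnect: {seconds / 60:.1f} minutes")
```

In this made-up scenario the webhook, not the broker, decides how long the fleet stays offline, which is why it pays to load test the whole connection path and not just the broker.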

The gist: There’s no such thing as infinite computing power or perfect algorithms, so no software is without limits. Find these limits before going to production, and always test at scale. You won’t spot most bottlenecks without load testing, and some will only become clear in hindsight.

4. Broker performance alone isn’t enough

Have you explored how poor topic design could affect costs and performance in your setup? A high-performance MQTT broker is meaningless if your client can't handle its capabilities. Client-side constraints, whether due to design, resource limitations, or poor implementation, will slow things down. Here's what to consider:

If your client can't keep up with high message volumes, you'll either lose messages or force the broker to drop them. Brokers use internal buffers to store messages waiting for delivery. Depending on buffer size, they can handle short spikes, but sustained overloads will exhaust them, causing data loss. No amount of buffering will save you from slow processing performance. You should consider distributing the load by scaling your consumer groups or optimizing your topic design to reduce redundant data delivery.
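Avoid root-level wildcard subscriptions. A client subscribed at the root receives every message the broker routes, whether it needs it or not. Many commercial brokers limit transactions per second (TPS), so careless topic design can easily double costs and strain the system.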
Don't skip logging and session tracing. Without them, troubleshooting gets complicated. Make sure both your broker and clients offer enough visibility to quickly diagnose issues.
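The buffering point above comes down to simple arithmetic: a buffer only buys time while the backlog grows. The sketch below estimates that time under the simplifying assumption of steady publish and consume rates. The function name and all figures are illustrative, not measurements from any broker.

```python
from typing import Optional


def seconds_until_overflow(buffer_capacity: int,
                           publish_rate: float,
                           consume_rate: float) -> Optional[float]:
    """Seconds until the broker's buffer fills, or None if the consumer keeps up."""
    backlog_growth = publish_rate - consume_rate  # messages/second piling up
    if backlog_growth <= 0:
        return None
    return buffer_capacity / backlog_growth


# One slow consumer: 1,200 msg/s in, 1,000 msg/s out, room for 10,000 messages.
print(seconds_until_overflow(10_000, 1_200, 1_000))

# A ten times larger buffer only delays the loss; it does not prevent it.
print(seconds_until_overflow(100_000, 1_200, 1_000))

# Doubling consumer capacity (for example, a second consumer) removes the backlog.
print(seconds_until_overflow(10_000, 1_200, 2_000))
```

The middle case shows why larger buffers are not a fix: as long as consumption lags publication, the overflow is only postponed.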

The gist: A system is only as strong as its weakest link, and that applies here too. At some point, no broker architecture can fix all the inefficiencies on the client side. Focus on removing bottlenecks in your system first. Don’t rely on migrating to a new broker to cover them up.
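The warning about root-level wildcards can be made concrete with a small topic matcher. The sketch below implements the standard MQTT wildcard rules ("+" matches one level, "#" matches all remaining levels) and counts how many messages a subscriber receives under two different filters. The topic names are invented for illustration, and the matcher assumes the filter itself is valid.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check an MQTT topic against a filter with '+' and '#' wildcards."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # Topics starting with '$' are not matched by a filter starting with a wildcard.
    if t_levels[0].startswith("$") and f_levels[0] in ("#", "+"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            return True  # matches the parent level and everything below it
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


def deliveries(topic_filter: str, published_topics) -> int:
    """Count how many published messages a subscriber on this filter receives."""
    return sum(topic_matches(topic_filter, t) for t in published_topics)


published = [
    "plant/line1/temp",
    "plant/line1/vibration",
    "plant/line2/temp",
    "plant/line2/vibration",
]
print(deliveries("#", published))             # root-level wildcard
print(deliveries("plant/+/temp", published))  # only what the consumer needs
```

In this invented topic set, a consumer that only needs temperature readings receives twice as many messages through "#" as through a targeted filter. If each delivery counts against a TPS limit, that difference goes straight into cost and load.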

5. Migration

Are you prepared for the nuances of migrating configurations and permissions? Starting with a specific MQTT broker allows you to tailor the setup and integration right from the beginning. However, switching to a different broker is more complicated than just updating the server's hostname and certificates. Here are a few challenges you should expect:

Migrating client identifiers and permissions from one MQTT broker to another isn't straightforward. You might find tools to import a client registry but none to export, unless you write your own scripts. Both brokers also need to have the same level of permission granularity. If they don’t, you’ll need to add a conversion step to ensure compatibility.
Modern IoT ecosystems often involve multiple stages of data processing. Your data may pass through third-party services like Kinesis, Azure Blob Storage, or databases. To handle this, your new broker needs built-in support or plugins for these integrations. Otherwise, you'll need to design and build custom integration services. Even if integrations are available, you must evaluate their functionality. For example, an S3 integration might not support message batching, or it might lack important features like Sparkplug B message decoding. If these are missing, you may need to develop a custom solution.
When you use third-party services as data destinations, IoT rules or forwarding rules help define data routing, transformations, and pre-conditions. Most IoT rules rely on SQL-like syntax, but the exact version and supported features differ across MQTT brokers. If you're lucky, a translator exists between two specific dialects. Otherwise, you’ll need to create one to translate the current IoT rules, adding to your migration workload. Even with a translator, further testing is needed to ensure everything works as expected.
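The permission-granularity problem above can be checked before you migrate. The sketch below assumes a hypothetical export from the source broker with per-client topic permissions, and a target broker that only supports a fixed number of permission groups. The data, the function names, and the group limit are all made up for illustration.

```python
from collections import defaultdict


def collapse_to_groups(client_acls: dict[str, set[str]]) -> dict[frozenset, list[str]]:
    """Group clients that share an identical set of allowed topic filters."""
    groups = defaultdict(list)
    for client_id, topics in sorted(client_acls.items()):
        groups[frozenset(topics)].append(client_id)
    return dict(groups)


def fits_target(groups: dict, max_groups: int) -> bool:
    """Check whether the collapsed groups fit the target broker's group limit."""
    return len(groups) <= max_groups


# Hypothetical per-client permissions exported from the source broker.
exported = {
    "sensor-001": {"plant/line1/temp"},
    "sensor-002": {"plant/line1/temp"},
    "gateway-01": {"plant/line1/#", "plant/line2/#"},
}

groups = collapse_to_groups(exported)
for topics, members in groups.items():
    print(sorted(topics), "->", members)
print("fits a 2-group limit:", fits_target(groups, max_groups=2))
print("fits a 1-group limit:", fits_target(groups, max_groups=1))
```

If the collapsed groups exceed what the target broker allows, you know up front that you must either widen some permissions, which weakens least privilege, or pick a different broker.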

The gist: Migration isn’t just about adding a new broker and updating client settings. You’ll need to transfer existing configurations, provisioning pipelines, and data flows. This may mean using built-in migration tools or custom implementation. Take time to examine your current setup. Plan any adaptations or restructuring before migration to ensure everything is aligned with the new broker’s capabilities and requirements.

Checklist for selecting and integrating MQTT brokers:

Look beyond raw performance

Think about real-world factors like client constraints, intricate data flows, and differences in how standard features work

Remember, all systems have limitations due to resource constraints, architectural trade-offs, and managed service policies

Focus first on your current or desired data flows

Next, choose a broker whose features and limitations align seamlessly with your needs

Every decision in your IoT architecture affects your ability to grow, adapt, and succeed. Focus on the right priorities, spot potential issues, and use insights like these to create an IoT strategy with lasting impact.
