Cloudy with a chance of meatballs
Recently I set up test environments on both the IBM MQ and Confluent Kafka cloud offerings. In both cases the setup was quick and simple.
Since many applications today use client connections, switching from a locally hosted service to a cloud offering was as simple as changing the connection URL. From first connection to active use took under an hour. Most of that time went to setting up security, and even that was fast.
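As a sketch of how small that change can be, here is what a Kafka client configuration might look like before and after the switch. The endpoint, credentials, and use of kafka-python-style parameter names are illustrative assumptions, not details from my actual setup:

```python
# Illustrative sketch: the broker endpoint and credentials below are
# hypothetical placeholders, not real values.

# Local development configuration
local_config = {
    "bootstrap_servers": "localhost:9092",
}

# Cloud configuration: the connection URL changes and security settings
# are added; the application code itself stays the same.
cloud_config = {
    "bootstrap_servers": "pkc-example.us-east-1.aws.confluent.cloud:9092",
    "security_protocol": "SASL_SSL",
    "sasl_mechanism": "PLAIN",
    "sasl_plain_username": "API_KEY",     # placeholder credential
    "sasl_plain_password": "API_SECRET",  # placeholder credential
}

# The only differences are the endpoint and the security block.
changed = {k for k in cloud_config if cloud_config.get(k) != local_config.get(k)}
```

Everything else in the application, producers, consumers, and message handling, is untouched; that is what made the move so quick.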
You can certainly see the attraction of services like this compared to what it takes to get a secure environment in place on your own. Most of the setup steps were automated, and I didn't need to worry about allocating hardware, operating systems, and all of the related configuration. None of those pieces is complex on its own; it's the entire process that takes the time. In my case, I didn't need to involve anyone else to get this done.
Once up and running, I don't even have to worry about operational details such as "making sure it is running" or "managing the application load or failover". The service provider handles all of this, and I just need to focus on using it. Another great saving of time and resources.
It sounds perfect, and it seems like the way to go, but there are a few issues.
Firstly, it creates some of the same problems we had with distributed systems in earlier days. While I have visibility into what I have set up, no one else does. If my application is part of a larger application environment, what happens when the system I have created has problems? How is it tied into the corporation's overall view of the environment?
Another related concern is that these systems bill based on usage. So what happens if my application logic has an issue? If you get into a message processing loop because of a bad message, how do you know before it racks up a large bill?
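One common guard against that loop is a retry cap with a dead-letter destination, so a bad message gets parked rather than reprocessed (and billed) forever. A minimal sketch, with a hypothetical handler and an in-memory list standing in for a real dead-letter queue or topic; actual MQ and Kafka clients have their own redelivery mechanics:

```python
from collections import Counter

MAX_RETRIES = 3  # illustrative cap; tune per application

def consume(messages, handler):
    """Process messages, parking any that fail MAX_RETRIES times."""
    retries = Counter()
    dead_letter = []        # stand-in for a dead-letter queue/topic
    pending = list(messages)
    while pending:
        msg = pending.pop(0)
        try:
            handler(msg)
        except Exception:
            retries[msg] += 1
            if retries[msg] >= MAX_RETRIES:
                dead_letter.append(msg)  # park it; stop paying to reprocess
            else:
                pending.append(msg)      # redeliver for another attempt
    return dead_letter

# Hypothetical handler that cannot parse one message
def handler(msg):
    if msg == "bad":
        raise ValueError("unparseable message")

parked = consume(["ok1", "bad", "ok2"], handler)
```

The point is that the cap bounds the cost of a poison message: three billed attempts instead of an unbounded loop.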
As I noted, in many cases you will need to tie into existing applications. These applications may not be configured optimally for the new environment, and they may not have been written efficiently to begin with, something a local system may have tolerated. Once they are leveraging cloud services, small latency issues can impact overall performance. How do you find these?
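Finding them usually means measuring: wrapping remote calls so each one records its latency, which makes small added round trips visible instead of invisible. A minimal sketch, where the threshold and the timed function are illustrative placeholders, not part of any real client library:

```python
import time
from functools import wraps

SLOW_MS = 50  # illustrative threshold; tune per service

def timed(fn):
    """Record each call's latency in milliseconds and flag slow ones."""
    samples = []
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        samples.append(elapsed_ms)
        if elapsed_ms > SLOW_MS:
            print(f"slow call: {fn.__name__} took {elapsed_ms:.1f} ms")
        return result
    wrapper.samples = samples
    return wrapper

@timed
def send_message(msg):
    # Stand-in for a remote send; the sleep simulates network latency.
    time.sleep(0.01)

for _ in range(5):
    send_message("hello")

avg_ms = sum(send_message.samples) / len(send_message.samples)
```

An application that was chatty but fast against a local broker will show up quickly in numbers like these once every call crosses the network.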
And finally, while the key infrastructure pieces are created for you, you still need to build the applications that use them, and you are responsible for setting up the artifacts within them. These systems do come with integrated tooling, but that tooling is vendor specific.
While these cloud offerings are attractive, they still come with a significant (if hidden) cost, and careful planning is required to provide a reliable, secure, and high-performing service.