Why the Service Mesh Should Fade Out of Sight (forrester.com)
35 points by zdw on Jan 17, 2021 | 44 comments



> Although primitive by OS standards, [k8s] will make legacy OSes like Linux and Windows more and more irrelevant as it matures.

A bit of a stretch; k8s has a long way to go before it fulfils the requirements of an OS. But it's an interesting way to think of it.


I was taught in school: "the job of an operating system is to share resources"

On the spectrum of things, k8s is much more than a clever hack, but not quite a designed operating system.


How about Orchestration System?


So... Is kubernetes running on bare metal with its own kernel?


Well, NT runs on top of a HAL, so it could be considered not running on bare metal. And some systems have supervisory code running in a coprocessor. At the other end, a modern OS, like the mainframe OSes of yore, does IO by talking to coprocessors that do the heavy lifting. In the old days we called them "channel controllers", but now we just say that the USB chipset handles the details with the OS just orchestrating.

But I agree with you, as I hope my comment was clear: while it's an interesting metaphor, K8s is a long way away from being usefully thought of as an OS. And if it ever does reach a point where such a term would be useful it will be unrecognizable when compared to where it is today.


My guess would be they mean running a specialized unikernel.

Keep in mind, any such kernel would have very simple requirements. k8s typically runs on a hypervisor rather than bare metal, and the hypervisor takes care of security and bare-metal compatibility. k8s manages the sharing of resources. The file system typically needs to be ephemeral rather than persisted. IPC all happens through explicit APIs.

What else does a kernel even do?


Implement all that functionality? k8s is very powerful glue, but it farms out all its actual work to things that very much need an OS.


> Implement all that functionality?

You must have misread my comment, or I did a poor job writing it, sorry.

I was saying that, to become an OS, there is very little OS functionality that k8s would need to implement; most of the functionality is already handled by other parts of the stack, e.g. the hypervisor, networked storage like S3, and user-built APIs.


I agree that Service Meshes should be easier to use and configure both statically and dynamically. The cost of running them is also a con. Those extra sidecars make your pod heavier and slower to start.

I disagree that they should fade out of sight. This sentiment applies to reverse proxies too, to an even larger extent. In my experience, debugging problems when the service mesh is not accounted for can be painful.

For example, consider setting keepalive and max connection age parameters for gRPC clients and servers. If the intermediary proxies and sidecar components are not accounted for, connections may be closed sooner than the server's configuration suggests while still appearing open on the client. Similar permutations of this abound.
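
To make that concrete, here's a minimal Go sketch using grpc-go's keepalive package (the address and all of the durations are made up, and the client and server are crammed into one main for brevity). The point is just that each side carries its own idle/age settings, and any sidecar in between has a third, independent set:

    package main

    import (
        "log"
        "time"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
        "google.golang.org/grpc/keepalive"
    )

    func main() {
        // Client side: ping idle connections so an intermediate proxy
        // doesn't silently drop them. Values are illustrative only.
        conn, err := grpc.Dial("example-service:50051",
            grpc.WithTransportCredentials(insecure.NewCredentials()),
            grpc.WithKeepaliveParams(keepalive.ClientParameters{
                Time:                30 * time.Second, // ping after 30s idle
                Timeout:             10 * time.Second, // wait 10s for the ack
                PermitWithoutStream: true,
            }))
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Server side: rotate connections on a known schedule. If a sidecar
        // enforces a shorter idle timeout than this, the client above may
        // still think the connection is open after the proxy has closed it.
        srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
            MaxConnectionAge:      5 * time.Minute,
            MaxConnectionAgeGrace: 30 * time.Second,
        }))
        _ = srv
    }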

Another example is retry logic. If retry mechanisms are configured between client and server, the connections established and dropped to/from servers can blow up multiplicatively, because the intermediate proxies retry as well: a client that makes three attempts through a sidecar that itself makes three attempts per client attempt can generate nine attempts against the server.

I think hiding the service mesh will end up causing complications similar to "DLL hell".


This article hits a new low for Forrester. It:

a) does not show any understanding of how software works.

b) makes little sense, given what a linker actually does.

c) shows little to no understanding of how an individual computer works, let alone a service mesh.


Service meshes shouldn't, by and large, require any sort of dev-specific knowledge to participate in.

My experience is limited to Istio, but tagging a deployment is more or less all an engineer needs to know. After that, a sidecar is attached that rewrites IP traffic and gives you better observability, mutual-TLS-encrypted transport, and of course HTTP routing / traffic control.

Obviously, to take advantage of a lot of the policy capabilities an engineer would need to know or declare their dependencies, but I hardly think this carries the cognitive overhead the author seems to imply.


Agreed, maybe I'm not understanding something about how service meshes are run but shouldn't they be transparent to the app?


Transparent to the app while providing all of the features for you, so you don't need to reimplement/reinvent the wheel for each service.


The service mesh is an abstraction layer. It hides the complexity of interconnecting multiple services. But the mesh itself is not intended to be transparent.

Just like an OS hides the complexity of managing hardware, but software written for it is very much aware of the OS and depends on its specific interfaces to operate.


> Service Mesh == Dynamic Linker For Cloud

TMYK: Linkerd was actually named for the dynamic linker ("linker daemon"). It's also the service mesh most focused on transparency--in the vast majority of cases you can add Linkerd to an existing application without config and the application will continue functioning. https://linkerd.io/2/design-principles/


> Contrast this to the real experience of linking to a library: You reference the library from your IDE, build, and deploy. Done. That should be the gold standard for service mesh.

Uhhhh. If linking software is the gold standard for a service mesh, I think we're already there. If you're using the right tools and stack it's fine, but if you mess around and go lower-level you're in for a treat.


Agreed that service meshes today need extensive configuration to make them production-ready. My opinion is that service meshes are still not mature enough. Many products are built as a layer on top of Istio (check solo.io, Ambassador, or Knative), but they are still complex to manage and require some basic Envoy proxy knowledge. Will they be integrated with Kubernetes and fade into the background? Definitely not, not even close.


If all your experience is with Istio and "layers on top of Istio", I'm not surprised you think service meshes need lots of config. Sounds like you need to try Linkerd!


A nitpick: Knative does not require Istio. It only needs an Ingress. There are multiple alternatives, of which Istio can provide one.


Oh, someone supports what I've been saying constantly for some time now: Kubernetes is the Windows of the cloud age. At least it tries hard to get there, but it seems largely successful on that route by now. Of course the whole "cloud native" Google monoculture will become the same burden Windows is, because the same vendor lock-in mechanics apply here. MS also didn't care in the past which kind of PC vendor (~ cloud provider) you bought from, as long as you were running Windows (~ Kubernetes) on that box.


How could the writer make the argument he made and not talk about dependency management?

NB I'm surprised that anything from Forrester made it on here but kudos to them.


+1. You can't effectively hide away a mesh unless it also handles versioning and service dependencies automatically.

I've seen several codebases now where the company decided to move to a full SOA architecture without much effort put into maintaining stable, public interfaces for every service.

The goals defined are sometimes quite misguided, e.g.: "We've moved 80% of our services over to k8s".

Okay. Goal achieved. Then I see engineers face issues coordinating changes across several services. It's very easy to fall into the trap of solving the problem of making sure coupled service deployments all happen at the same time, rather than defining public interfaces and making the services able to handle several versions.


> legacy OSes like Linux and Windows

Since when did the most advanced and widely used OSes in the world become legacy?

> legacy OSes ... more and more irrelevant as it matures

And what will Kubernetes run on, bare metal? Thin air?


My favorite passive-aggressive HN rhetorical move. (edit for clarity: I'm referring to dragging things as "legacy" as passive-aggressive, not the parent comment.)


I learned that from Microsoft: all of their competitors products were, from their perspective, a legacy system off of which you were almost-by-definition migrating to a Microsoft replacement (and if you weren't, you de facto weren't their customer).


The difference between the connotations of the word “legacy” in general and inside the tech industry is worth a ponder.

Fundamentally, legacy is something left to us by someone else. On its own it’s neutral: if it is a result of someone’s effort, legacy is usually good; if it is a result of someone’s mismanagement, legacy is bad.

Curiously enough, in tech the mere state of being left by someone always implies something undesirable, obsolete, to be replaced. No matter how hard someone worked on it, where there's legacy there is a tinge of exasperation. The state of being legacy is more or less the opposite of being current; a piece of software cannot be both.


While the wording in the article is over the top, I think it's getting at the concept of a near-stateless web-app unit that runs with some RAM, some sockets, stdout/stderr for logging, and a static set of files. In that sense the app needs only a few other parts of the OS and benefits from minimalism, while Kubernetes becomes much more interesting to the web app developer, hosting copies of this unit in some distributed manner across a cluster.

Certainly some parts of the world are trying to make this the future anyway, though there’s a lot of room for misgivings (being a distributed system is hell)


I thought the same thing. Doesn't make sense to call something legacy if the 'new' thing you're talking about doesn't replace it.


In some ways it sort of does though? The worker nodes can be (and increasingly are) locked-down "cattle" appliances, like CoreOS. Yes, it's running the Linux kernel so you're absolutely technically correct to claim it's Linux, but it's a far cry from a "pet" RHEL/Debian/etc. system.


Your RHEL/Debian/etc hosts can be just as much locked down cattle as your CoreOS (which is also Linux). I find it very interesting to go on about the technicality of correctness when the entirety of kubernetes is dependent upon Linux.


This cattle vs pet analogy is harmful.

I've found many people using it for the sake of sounding articulate, but not even on farms is cattle as disposable as we make it out to be when we use the term for computers, distributions, or whatever.


Most of the tooling that comes with linux isn't very friendly to learn, and while more transparent companies grant root access to developers, many still resist allowing ssh/sudo privileges on servers. The standard linux daemons that used to be the defaults for logging (syslog), scheduling (cron), and deployment (apt/yum) aren't commonly used in modern cloud shops.

To many developers, linux and other OS primitives are unneeded overhead on their application. One can write a little CloudFormation or Kubernetes yaml and be off to the races while only interacting with linux as part of their build system.


Most of the tooling that comes with a kitchen isn't very friendly to learn, and while more transparent companies grant chef access for assistants, many still resist allowing chef privileges on the kitchen. The standard oven that used to be the default cooking, baking, and frying aren't commonly used in modern kitchen shops.

To many people, knives and other tools are unneeded overhead on their application. One can cook a little cake, or pasta, and be off to the races while only interacting with the kitchen as part of their build system.


I think you can develop this analogy further; why stop? Since the kitchen usually has neither the time nor the money to have a bunch of trained chefs cook their meals, they just order frozen cake and pasta from the same retailer that everyone else uses without particularly checking the quality. They also copy the menus and prices for what to serve from the internet to save time.

If the customer has a bad experience with a specific plate of pasta or piece of cake, they just say "Sorry!" and replace it with a new one automatically. Most of the time the people coming to the restaurant are not particularly discerning anyway and just want something to satiate their hunger, so they are fine with this arrangement. New packaged flavors are rolled out periodically by the frozen food company, so the restaurant can even boast a seasonal menu!


syslog/cron/apt are overhead? They are ants compared to even the most stripped down k8s overhead.


If one isn't using the tools available from the OS, or even the OS itself, regularly, it's tough to know that they are there. I picked on cron/syslog/apt as there aren't too many times I see them used on server fleets these days. That's coming from 10 years working in both SRE/systems and software engineering roles on fleets of 1k+ servers, both in and out of clouds. It's really difficult to convince new engineers that even having a persistent process is a good idea vs. using an orchestration service.

As for the syslog/cron/apt examples I'll share some more detail below.

- Syslog isn't used, as it's pretty straightforward to pull in one's favorite language-specific logging library, combine it with one's favorite log tool of choice, and be off to the races (quick sketch below).

- Cron doesn't get used as it can only easily handle scheduling on a specific host. Whereas there are a plethora of distributed schedulers which will happily run commands on any number of hosts.

- Apt/yum don't get used, as folks want their builds to be repeatable, and rely on software pulled in via their dependency management system of choice and deployed as a fat, statically linked binary.
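
To make the syslog bullet concrete, here's the shape of the typical replacement as a small Go sketch: a structured logger writing JSON to stdout, with collection and shipping left to the platform (the library choice and field names here are just one way to do it):

    package main

    import (
        "log/slog"
        "os"
    )

    func main() {
        // Structured JSON lines go to stdout; the container runtime or a
        // log agent scrapes them, so syslogd never enters the picture.
        logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
        logger.Info("request handled",
            "path", "/healthz",
            "status", 200,
            "duration_ms", 12,
        )
    }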


> Whereas there are a plethora of distributed schedulers which will happily run commands on any number of hosts.

If you have any specific suggestions, I am on the lookout. I checked out tools like airflow and dask but did not see what they added. Trying not to go full blown HPC cluster.


HashiCorp Nomad[0] can do scheduled periodic tasks across a cluster of machines. Much lighter weight than Kubernetes, but a shotgun flyswatter when compared to cron.

[0] https://nomadproject.io


If you're on a PaaS solution, I'd take a look at the built-in scheduler, e.g. scheduled tasks in Kubernetes or Fargate, or scheduled tasks in Mesos via Singularity or another frontend.

If you aren't using a PaaS, take a look at HashiCorp's Nomad tool. It's worth noting that if your team uses microservices or just likes making lots of services, a self-hosted or cloud PaaS solution will save you inordinate amounts of time once it's used by more than 20 engineers.


Cognitive overhead perhaps. I hope the new simplified platform is actually good and not an ad hoc, informally-specified, bug-ridden, slow implementation of half of Unix, though.


I've often seen people mistakenly use the words legacy and established interchangeably.


In this circumstance, I think "bare metal" is probably the idea they're trying to get across.


> What if it also required understanding the internal architecture of the operating system’s dynamic linker to diagnose runtime problems? I hear you responding, “That’d be insane!”

Uh, I hate to be glib but has the author ever built something of consequence?

At least once a week I see some team running into this stuff; if you don't understand it, then you're stuck twiddling your thumbs as soon as you hit any problem with your classloader or linker.

Analogy aside, service connections seem a heck of a lot more complicated than virtual address spaces. I just really have a hard time buying that you are going to make all of the leaky parts of that abstraction disappear.


Exactly. Expanding on this analogy, the monolith would be a statically linked binary... just like Go does it.
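
To ground the analogy, a minimal sketch (the handler and port are made up): a pure-Go program built with CGO disabled comes out as one self-contained, statically linked binary, so there's no dynamic linker in the picture at runtime at all.

    // Build with: CGO_ENABLED=0 go build -o app .
    // The result is a single, statically linked binary with no shared
    // libraries to resolve at runtime.
    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "hello from a statically linked binary")
        })
        _ = http.ListenAndServe(":8080", nil) // error ignored in this sketch
    }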



