Engineering

Taking over your cluster through Octant

Finding remote code injection opportunities in Kubernetes developer tools
By Jack Wink

Before we begin, I'd like to give a big thanks to the VMware Security Response Team and the developers of the Octant project. Security incidents happen, but it's how they're handled that makes the difference. We reported the issue on a Thursday night and an official fix was released the next day. While it helped that the fix was simple, considering the size of VMware and the coordination involved, we were impressed with the response time.

Background - Kubernetes GUI Tools

To give some context, about a year ago, we began adopting Kubernetes at Mothership. As part of our move to k8s, we needed to provide our dev team with visibility into the cluster and the deployed workloads. Ideally we would do this in a way that was similar to our current developer experience: a GUI that acts at the developer's permission level.

The natural first choice was the Kubernetes dashboard:

Kubernetes dashboard

There were a lot of things to like:

  • It's officially supported and maintained
  • Deployable via a Helm chart
  • Since the 1.7 release, it requires a very limited set of permissions to deploy
  • Operates at the same permission level as the user

But as we got into it, we found there were four different ways we could authenticate to the dashboard:

  • Authorization header
  • Basic Auth (username/password)
  • Bearer Token
  • Kubeconfig file upload

Right away we could rule out basic auth: it's disabled by default, and we didn't want to maintain a file of usernames and passwords.

We also had to rule out the Kubeconfig file upload as we use an external identity provider (AWS) to provide access to our clusters. Since we don't embed access tokens into our kubeconfig files, this wouldn't work either.

Bearer tokens were also ruled out pretty quickly — while we could manage service accounts with Terraform, we didn't want to give users the ability to pull secrets out of the cluster, and we worried about users sharing credentials.

The only option left was the authorization header approach. This would mean we'd need some proxy sitting in front of the dashboard that injects an auth token header for the dashboard to consume. We would either have to build our own proxy, or find and evaluate an existing one; either approach felt like it would take some time and effort on our part.
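To make the idea concrete, the core of such a proxy boils down to copying the incoming request headers and injecting an authorization header before forwarding the request to the dashboard. The function below is a hypothetical sketch of just that step; the header name and the idea that the token comes from the user's session are assumptions for illustration, not the dashboard's documented API:

```javascript
// Hypothetical sketch of the header-injection step such a proxy would
// perform. In a real proxy this would run inside the request handler,
// and the token would be looked up from the user's session.
function withInjectedAuth(incomingHeaders, token) {
  // Copy the incoming headers and add the bearer token for the dashboard.
  return { ...incomingHeaders, authorization: `Bearer ${token}` };
}
```

Everything around this one-liner (TLS termination, session handling, forwarding) is the part that felt like real effort.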

Octant

At this point we started looking for alternatives. At the time, the most appealing option we found was Octant.

Octant

Once again, we saw a lot of things we liked:

  • Developers could install it via homebrew or similar package manager
  • Nothing for us to maintain in-cluster to run it
  • Runs at the permission level of the end-user
  • Works with our external IdP!

Since it required no effort on our part, we shared the project with the engineering team and called it a day… Or we would have, had one of our engineers not said something along the lines of, "and it's much more secure because we're not running this in cluster."

Octant has a different threat model (one that we like better) than the Kubernetes dashboard, but having a different threat model does not inherently make something more secure. Hearing that gave us pause and made us think about how someone could attack us when all of our developers were using Octant.

Web Security

We knew a bit about the Octant architecture. Namely, it runs a local web server in the background and launches either an Electron app or a browser tab for the web-UI. The local web server uses the kubeconfig file to hit the Kubernetes API, while the web-UI queries the local web server:

Octant architecture diagram

Since Octant wasn't deployed in-cluster, we could ignore in-cluster privilege escalation. We didn't have to grant Octant or our developers any additional permissions through RBAC policies to make it work.

When thinking about attacks on the web-UI, the primary concern would be injection attacks. However, since all the data is loaded from the local API server, which in turn gets its data from the Kubernetes API, an attacker would have to inject something into our cluster or into the binary itself, and hope that Octant didn't sanitize it. Both of those avenues would place the attacker in a privileged position, and so we didn't really explore these attacks.

This left us with the API server. Since the API server gives us access to the Kubernetes API, it's the most sensitive component in the Octant stack. What protections does it offer? How does Octant ensure only the web-UI is using the API?

Well, since this is a browser-based stack, Octant relies on browser-based security controls, namely the same-origin policy. But localhost is somewhat tricky: depending on what URL the user loads, the browser may treat requests as same-origin or cross-origin.

For example, in the latest version of Chrome, if you visit http://localhost:5000 and try to fetch http://localhost:5000/api, you'll see the browser treat it as a same-origin request, and everything works fine:

Same-origin requests

But if http://localhost:5000 requests http://127.0.0.1:5000/api the browser will treat it as a cross-origin request and deny it unless there's an appropriate CORS policy.

Cross-origin requests
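The rule behind both screenshots can be sketched in a few lines of JavaScript: two URLs share an origin only when scheme, host, and port all match, which is why localhost and 127.0.0.1 count as different origins even though they resolve to the same machine:

```javascript
// Two URLs are same-origin only if scheme, host, and port all match.
function sameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return (
    ua.protocol === ub.protocol &&
    ua.hostname === ub.hostname &&
    ua.port === ub.port
  );
}

sameOrigin("http://localhost:5000", "http://localhost:5000/api"); // true
sameOrigin("http://localhost:5000", "http://127.0.0.1:5000/api"); // false
```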

Unfortunately, it's all too common to see local API servers use a wildcard CORS header (Access-Control-Allow-Origin: *) or simply reflect the Origin header so the API works regardless of what the user enters into the address bar. This isn't always bad (for example, if the API has other means of authentication), but it's not a one-size-fits-all solution. Worse, many people don't understand how CORS works, and continue to recommend bad practices for working around CORS on localhost.

CORS Check

As a cursory check, we loaded the documentation for the JS fetch API, popped open the Chrome debugger, and tried to hit an endpoint on the API server. This particular endpoint fetches logs from a container:

// Hit the Octant API server from an unrelated origin (the docs page)
const host = "http://127.0.0.1:7777";
const path = "/api/v1/logs/namespace/kube-system/pod/kube-proxy-zf7s2/container/kube-proxy";
fetch(`${host}${path}`).then((response) => console.log(response));

To our surprise, it went through! We checked the response headers and saw the wildcard:

Access-Control-Allow-Origin: *
Content-Length: 113
Content-Type: text/plain; charset=utf-8

While leaking logs is bad, this particular endpoint still relied on someone guessing the namespace, and a pod/container name. It's not impossible, but also not easy unless you have some kind of inside knowledge. Could we get Octant to tell us what was running?

Aside from the REST API, Octant also has a WebSocket API mounted at /api/v1/stream. The WebSocket seems to be the primary method for interacting with the server. If it had the same CORS policy as the REST API, we could enumerate the clusters and workloads and even run shell commands in the clusters!

We captured a connection and the initial messages being sent, and loaded up a web console again:

const ws = new WebSocket("ws://127.0.0.1:7777/api/v1/stream");
ws.onmessage = (event) => console.log(event.data);
// Wait for the connection to open before sending, otherwise send() throws
ws.onopen = () => {
  // Ask Octant to stream the contents of the kube-system namespace
  const payload = {
    type: "setContentPath",
    payload: {
      contentPath: "cluster-overview/namespaces/kube-system",
      params: {},
    },
  };
  ws.send(JSON.stringify(payload));
};

Very quickly the console started logging a whole bunch of messages, telling us the request went through successfully. This was enough to know that we had found a remote code execution vulnerability. Since Octant runs at your permission level, and any website could hit the API server, a malicious website could load some JavaScript to launch pods in your cluster, steal secrets, or really anything that you could do through Octant.

Reporting the issue

We believe in giving back to the open source community at Mothership. For normal bugs, we would have just opened a pull request and contributed a fix. But in this case, opening a pull request would disclose a 0-day vulnerability.

Unfortunately, the Octant repository didn't have a security notice or contact information available. We thought about opening a vague issue, but this would also publicly disclose that something had been found, and it would be hard to convey the urgency without attracting attention.

Given that Octant is a VMware project, we opted to contact the VMware security response team and hoped they could get in touch with the developers through an internal channel. We found their GPG key and sent them an encrypted report with our findings. Within 24 hours they had reached out to the developers and an official fix had been released.

Lessons Learned

We're looking forward to using Octant at Mothership; it really is a fantastic tool for managing Kubernetes workloads. We are super grateful for the work of the Octant developers, as it's going to simplify our deployment. And a big kudos to the VMware Security Response Team and Octant developers for the quick fix.

If there's anything to take away from this post, it's the following:

  • It's always important to consider your threat model, and evaluate the effectiveness of the security controls you rely on.
  • Security is hard and often at odds with flexibility and ease-of-use — it doesn't help that communal information often overlooks security implications.
  • Popular open source projects should provide instructions for reporting security issues.

Are you someone who thinks about developer experience? Do you enjoy working with Kubernetes and Cloud-Native projects? If this kind of work sounds interesting to you, check out our job listings and get in touch, we're hiring!
