0:00
For the last decade, the default infrastructure
0:02
answer has basically been cloud first, Kubernetes
0:05
probably, and then good luck to whoever has to
0:09
operate the thing after the architecture diagram
0:11
gets turned into real life. And to be fair, the
0:15
cloud solved a lot of real problems. Nobody wants
0:18
to go back to waiting on hardware, filing tickets
0:20
for VLANs, or discovering that the one person
0:24
who understood the SAN is on vacation. But there
0:27
is another side of this now. Cloud bills are
0:30
getting harder to explain. Kubernetes platforms
0:32
are getting harder to maintain. Data sovereignty
0:36
is becoming a real business requirement. And
0:39
a lot of teams are quietly looking at all of
0:41
the layers that they have built and asking a
0:44
pretty reasonable question. Is this actually
0:47
simpler? Because sometimes the thing that was
0:49
supposed to reduce operational burden just moves
0:52
the complexity somewhere else: into Terraform,
0:55
into Helm charts, into networking, into managed
0:59
service glue, into a platform team that is now
1:02
responsible for making 15 different abstractions
1:05
feel like one coherent system. And that is really
1:09
what this conversation is about. Not "cloud is
1:11
bad." Not "bare metal is back, everybody grab a screwdriver."
1:15
More like: what happens when teams want more control,
1:18
more predictable cost, better performance, and
1:22
less platform sprawl without going back to the
1:25
old-school pain of managing infrastructure by
1:27
hand? I'm Brian Teller from Teller's Tech, and
1:30
this is Ship It Weekly. Welcome back to Ship
1:49
It Weekly, where I filter the noise and focus
1:52
on what actually matters when you are the one
1:54
running infrastructure and owning reliability.
1:57
Most weeks, it's a quick news recap. In between
2:00
those, I do conversation episodes with people
2:03
who are building platforms, running infrastructure,
2:06
organizing events, and thinking through where
2:09
this industry is actually headed. Today is one
2:12
of those conversations. I'm joined by Jake Warner,
2:15
founder and CEO of Cycle.io. Cycle is an infrastructure
2:19
platform that lets teams run containers and virtual
2:22
machines across bare metal, cloud, private cloud,
2:26
and hybrid environments without trying to turn
2:29
every company into a full-time platform engineering
2:32
shop. And I like this conversation because it
2:35
gets into a topic that a lot of people are thinking
2:37
about right now, even if they are not always
2:40
saying it out loud. A lot of teams are tired,
2:43
not lazy, not anti-cloud, not anti-Kubernetes,
2:47
just tired of the amount of complexity that has
2:50
piled up around modern infrastructure. You start
2:52
with a simple goal. Run the application. Scale
2:55
it. Deploy safely. Keep it reliable. Don't spend
2:59
a fortune. Then suddenly, you have Kubernetes
3:02
clusters, node groups, disruption budgets, autoscalers,
3:06
Terraform modules, managed services, IAM policies,
3:10
GitOps controllers, observability agents, service
3:14
meshes, secrets systems, and a Slack channel where
3:18
somebody asks why the platform is blocking delivery.
3:21
And somewhere in the middle of that, people start
3:23
wondering whether there is another way to think
3:26
about infrastructure. In this conversation, Jake
3:29
and I talk about why some teams are moving back
3:32
toward private cloud and bare metal, but not
3:35
in the nostalgic racking and stacking servers
3:38
was awesome kind of way. More because of cost,
3:41
performance, data sovereignty, and wanting more
3:45
ownership over the stack. We also get into what
3:48
people still misunderstand about bare metal,
3:50
why some teams want VMs and containers living
3:53
together, where Kubernetes is still the right
3:55
answer, and where an opinionated platform might
3:59
be a better fit than giving every team every
4:01
possible knob to turn. There's also a good thread
4:05
in here around failover versus active-active
4:07
systems, stateful workloads, why application-level
4:11
replication often beats platform-level
4:13
magic, and what it really means to make raw infrastructure
4:17
feel like a cloud-like resource. And towards
4:20
the end, we talk a bit about AI workloads, GPUs,
4:23
hype cycles, and why the most bleeding-edge
4:26
teams are not always the same teams that want
4:29
an opinionated platform. So if you work around
4:32
DevOps, SRE, platform engineering, cloud infrastructure,
4:37
Kubernetes, private cloud, or you are just starting
4:40
to wonder whether your modern platform has become
4:42
a very expensive junk drawer, this one should
4:45
be worth your time. All right, let's jump in.
4:53
Today, I'm joined by Jake Warner. He's CEO and
4:56
founder of Cycle.io. We're going to be talking
4:59
about private cloud, bare metal, and why a lot
5:02
of teams are quietly exhausted by platform complexity.
5:06
Jake, thanks for joining me. Thanks for having
5:08
me. Give me your thesis. Why are teams pulling
5:10
back toward private cloud and bare metal again?
5:13
So, you know, for most of the people watching
5:16
this podcast, we all knew that bare metal was
5:17
really sexy a decade ago, right? 15 years ago.
5:21
The cloud made it super easy to move away from
5:24
bare metal. It solved the big complexity that
5:27
came with it. I mean, we all know how
5:31
hard it was to get a bare metal server online,
5:33
get it configured, do it at scale. In the early
5:35
days of Cycle, I'd always say, you know,
5:36
anyone can deploy one or two servers. But when
5:39
you need to start automating 100 servers, that's
5:41
when the problems get really complex, right?
5:43
I'm guessing most people listening to this podcast
5:46
are completely aware of that. With what we built
5:49
with Cycle, our goal was to simplify that process
5:53
so that way you could have companies that were
5:55
able to own their bare metal, et cetera, and
5:58
still provision it like a cloud-like resource.
6:00
And the reason why we kind of did that and why
6:03
we see companies coming back to bare metal is,
6:05
number one, cost. Hyperscalers ended up, you
6:08
know, they made it really easy. But we've all
6:10
seen hyperscalers just continue to increase costs.
6:12
And once you get locked into that ecosystem,
6:14
your costs kind of only go up. So that's number
6:16
one. Number two is about data sovereignty concerns.
6:19
I think especially with a lot of the geopolitical
6:21
issues that we're seeing in the world right now,
6:23
most of the companies that we've been working
6:24
with are companies that are saying whether it's
6:26
compliance reasons or they have customer demands,
6:29
maybe is the right term for it. We have companies
6:32
that are coming to us and saying, we cannot be
6:34
on a US-owned hyperscaler. Even if geographically
6:38
that infrastructure is sitting over in Europe,
6:40
we need to, because of the US CLOUD Act, we have
6:43
companies that are saying we cannot be on
6:45
US-owned infrastructure. So I think that in terms
6:48
of private cloud, in terms of bare metal, it
6:49
all kind of comes back to number one, cost, number
6:51
two, data sovereignty and compliance. And then
6:54
the third item there would be performance. And
6:56
for a long time, I think people kind of gave
6:59
up the performance that bare metal would give
7:01
because the ease of the cloud outweighed the
7:04
performance you got from bare metal. But fast
7:06
forward to today where you have platforms like
7:08
what we've built that can deliver a cloud-like
7:12
experience on top of bare metal. It allows
7:14
those organizations to go back toward performance,
7:17
lower their costs and own more of their stack
7:20
in the process. So along those same lines, is
7:23
there like a common misconception you see that
7:26
people have regarding bare metal or on-prem
7:30
in general? Oh, absolutely. So, you know, as
7:34
recently as, I mean, it happens all the time,
7:37
but we were doing a demo back in November. And
7:40
this was with a large company over in Denmark.
7:43
And they were one of the early companies to adopt
7:45
AWS. And this company, they were looking at getting
7:49
away from Kubernetes. They were using EKS. And
7:52
the main reason they were coming to us was data
7:55
sovereignty issues. They had customers that were
7:57
saying that they needed to not be on AWS to be
8:00
able to continue to be customers of that company.
8:02
And during that demo, one of the requirements,
8:04
because I kind of posed the question, like, are
8:06
you interested in just getting off AWS? Are you
8:07
also interested in adopting bare metal as part
8:10
of this? And the company was like, no, no, no,
8:11
we don't want to talk bare metal. Like we don't,
8:13
we don't want to be responsible for maintaining
8:14
hardware. We don't want to be responsible for
8:16
setting up all the networks and all of those
8:18
things. And you could tell, again, very technical
8:20
team, but they had spent so much of the last
8:23
decade plus in the cloud that they were just
8:26
kind of used to that. And we did the demo and
8:28
three quarters of the way through, I was like,
8:30
you know, hey, we're talking about all these
8:32
DevOps terms. We're talking about containers,
8:33
infrastructure, et cetera. What if I do the rest
8:35
of this demo and do it on bare metal, but not
8:38
tell them until afterwards. And it was kind of
8:40
like a game to me. Like I had someone kind of
8:42
challenge my thoughts on, you know, the ease
8:44
of bare metal. And so we got to the end of the
8:46
demo and I showed my cards and I was like, oh,
8:48
by the way, like the infrastructure we just deployed
8:50
and the containers we just deployed, that was
8:52
all bare metal. And the company immediately did
8:54
a 180. And they're like, okay. You've sold us.
8:57
Let's talk about bare metal now. Because they
8:59
realized it was one of those things where, you
9:01
know, kind of to the three points that I just
9:03
made about cost, data sovereignty and performance.
9:05
The main item for them was data sovereignty.
9:07
But when they realized that now they could have
9:10
that cost and performance conversation at the
9:12
same time, and it wasn't necessarily exclusive,
9:14
it was kind of eye opening. So yes, back to the
9:18
question that you asked, which is what is the
9:19
big misconception? There's a lot of people that
9:22
have spent much of their careers in the cloud.
9:26
I mean, again, it's been the main way we've been
9:27
deploying for a decade now. And because of that,
9:31
they've kind of, what's the whole cliche? Going
9:33
through the forest or the mountains or you're
9:35
missing, I don't remember the cliche, but the
9:37
point is that when you're so focused on something,
9:39
you kind of miss some of the advancements that
9:42
are happening in the process. Yeah, for sure.
9:44
I mean, so I am one of those people where I was
9:47
early days, I worked at a colo facility, racking
9:51
and stacking, you know, servers. Very, very different
9:54
world now than it was back in the late 90s, early
9:57
2000s. Yeah, very different as far as management
10:00
of those colo facilities. Completely different
10:02
now. With those colo facilities, it's really
10:05
interesting because so many of the people that
10:07
I encounter that are bare metal friendly and
10:09
like ready to return to it are the people that
10:11
were racking and stacking back at that time.
10:13
And, you know, the early cPanel days and they're
10:16
like, I want to get back to that because they
10:17
kind of missed that. Like it means something
10:19
to them. But then you have this whole generation
10:21
kind of in the middle where cloud was the beginning
10:23
of their career. So as you talk about racking
10:26
and stacking, like that is dear to me as well.
10:29
Yeah. So when you say turn any infra into a private
10:32
cloud, what does that mean for a DevOps team
10:35
day two? So with Cycle, the way we kind of approach
10:39
things, from a philosophical standpoint, I'm
10:42
kind of anti-infrastructure as code. Like in
10:45
Cycle, we call it environments as code. The idea
10:47
is that you have a thin but defined line between
10:49
infrastructure and everything that's going to
10:51
run on top of that. And so in Cycle, we have
10:53
what we call environments as code, which are
10:55
called stacks. And that's where you define your
10:57
load balancers, containers, everything that you
10:59
are going to be deploying over and over and over.
11:01
But then infrastructure, we treat as just a pool
11:03
of resources. So I guess maybe that's kind of
11:05
a throwback to the racking and stacking days that we were
11:08
just talking about, where you're taking cPanel
11:09
and trying to pack these hosts with as many
11:12
things as you can. And so with Cycle, the idea
11:15
is that, you know, you bring bare metal, you bring
11:17
VMs, whether it's in the cloud or, we have
11:20
some companies that are running Cycle on top
11:21
of VMware. The idea is you bring raw compute and
11:24
Cycle will automatically join it into a mesh,
11:25
and then that becomes kind of your private cloud.
11:28
So it can be hybrid infrastructure, multi-cloud,
11:30
fully on-prem, or a mixture of all of the above.
11:34
And then applications that you're building, your
11:35
networks, et cetera, just overlay on top of that.
11:38
So again, you have this thin but very defined
11:40
line between infrastructure on one side and your
11:43
applications on the other. So one of the things
11:45
that I believe I read on, I believe it was on
11:47
your website, you say no DevOps army required.
11:50
What does that mean in practice? So I've been
11:53
writing code for, I don't know, probably 22 years
11:56
now. I guess there's the common saying that the
11:59
best DevOps engineer will automate themselves
12:00
out of a job. Whether that's true or not is up
12:02
for continuous debate, but I've been writing
12:05
code for a long time and it seemed like any time
12:08
it came to actually deploying that, whether it
12:09
was back in the cPanel days or with Docker or
12:12
Kubernetes, et cetera, that was always the thing
12:13
that slowed me down the most. And while I always
12:16
loved working with infrastructure, I didn't really
12:18
like maintaining it. That was always a thing
12:20
like, I mean, sure, getting it up and running
12:22
is fun, right? I mean, it's why I think most
12:24
of us get into it. The first days of setting
12:26
up infrastructure and, you know, placing an order
12:27
to buy a new bare metal machine. That's exciting,
12:30
right? But, you know, three months later, no
12:32
one wants to maintain it anymore. And so the
12:34
goal with Cycle was: how
12:37
can we build a platform that allows developers
12:38
to do the things that typically you would need
12:40
a DevOps engineer to do? And I'm not saying that,
12:43
you know, to fully replace DevOps engineers,
12:45
but in many cases, companies that are coming
12:48
to us are companies where it's very engineering
12:51
heavy. So, you know, like 10 developers to every
12:53
DevOps engineer or a greater ratio. We have some
12:56
companies that are 25 to one, et cetera. And
12:58
those seem to be the places where we do best.
13:01
And one of my favorite components of that is
13:04
when we have developers that are deploying things
13:06
on top of the platform that aren't really DevOps
13:08
engineers at all. And they can't tell you how
13:11
they did something, but it works. And, you know,
13:14
that's one of the things that is nice about that.
13:17
So "no DevOps army required" is just that
13:20
philosophy. Like, how do we empower developers
13:22
to do what they need to do without having to
13:25
become DevOps experts as part of that? So talking
13:28
about mixed workloads, can you give me like a
13:30
real world workload mix? Like, why do teams want
13:34
VMs and containers living together? Why would
13:36
they want that? There's probably hundreds
13:39
of different potential reasons. One company that
13:40
I was helping earlier this morning is running
13:42
all of their different microservices in containers.
13:45
That's just how they built their platform. But
13:49
then, due to a whole bunch of legacy, they're a company
13:51
that's been around for, I think, 18 years. They
13:53
have a number of applications that are built
13:55
on top of .NET and require Microsoft SQL and
13:58
things like that. For them, all of their newer
14:00
microservices are running inside of containers.
14:02
But for them being able to have those legacy
14:06
applications running inside of Windows VMs sitting
14:10
on the same infrastructure, on the same network,
14:12
where the containers don't know they're
14:14
talking to a VM and vice versa. It's just fully
14:16
abstracted. So everything just sees network endpoints,
14:19
but it allows companies to not have to change
14:22
everything for adoption. So that's one. And I
14:25
guess another use case is we have some companies
14:27
that are in regulated industries and things like
14:30
that where, for certain applications, they need,
14:34
you know, true virtualization for isolation as
14:37
opposed to, you know, just cgroups and, you
14:39
know, the isolation that comes with containers.
14:42
Yeah. I feel like as an industry now too, everybody
14:46
wants to containerize every service, and it makes
14:49
sense in a lot of cases. But I've also found
14:51
that there are cases where putting it inside
14:53
of Kubernetes or some sort of container orchestration
14:56
layer doesn't always make sense. We're just doing
14:59
it because that's the phase right now. It just
15:02
seems like every trade show I go to, too, it's
15:04
pushing this idea of containerization, which,
15:06
again, microservice layers make sense. There's
15:09
certain applications, though, where I don't think
15:11
putting bridges and stuff inside of a container
15:14
makes sense, especially when you're dealing with
15:16
services that can't be disrupted. It's an issue
15:18
that I've come across a lot in the last year
15:21
or two. Yeah, I mean, I would say that I'm probably
15:23
on the opposite side of that. I containerize
15:26
everything that I possibly can. But I mean, I
15:30
also built the platform for it. So, you know, kind
15:31
of aligns with my belief there. But yeah, I containerize
15:34
everything to the point that on Cycle, every
15:38
virtual machine is also actually a container.
15:41
And so that way we have a container layer, we have
15:44
a containerized hypervisor, and then you
15:49
have your VM sitting inside of that. And so the
15:50
nice thing about that is that as we roll out
15:52
new versions of the hypervisor, you know, given
15:55
that it's containerized from that perspective,
15:57
we're able to kind of roll out different versions
15:59
of the hypervisor as users opt into it. So, you
16:02
know, kind of little benefits from that. So that's
16:06
interesting. So let's talk, I guess, a little
16:07
bit more about the containerization aspect. In
16:10
Kubernetes, I'm dealing with PDBs, pod disruption
16:13
budgets. I'm dealing with maybe Karpenter if
16:15
I'm on AWS. So I'm having this scale up and down
16:18
nodes as needed. And then if I'm dealing with
16:21
spot instances, I could have disruptions to
16:24
workloads. How does Cycle handle that orchestration
16:27
of services? So there's two ways that Cycle approaches
16:31
that. Number one is that, in general, I believe
16:34
that the idea of failover is a terrible idea.
16:36
And I've seen so many times in my career where,
16:39
you know, you set up all these processes for failover
16:40
and then things fail to fail over when things
16:43
go wrong, right? And so, with that, you know, the
16:47
idea is that failover is, you know, kind of high risk.
16:50
It's also along the same kind of philosophy that,
16:55
you know, less moving parts is better. So inside
16:58
of Cycle, when we talk about, like, stateful workloads
17:01
and things like that, in general, the approach
17:04
that we have is to run everything in active-active
17:06
100% of the time. So that way, if one side goes
17:09
down, you're perfectly fine. Like, you know,
17:11
when that side comes back up, you know, you'll
17:13
recover. Now, if you need to evacuate, you know,
17:15
to another host, you can do that too. But there's
17:17
a reason why Mongo and so many of these modern
17:20
database technologies have their own replication
17:22
built into them. And I guess maybe that's a little
17:24
bit of a tangent is it bothers me so much when
17:27
platforms try to automate storage replication,
17:30
as opposed to letting the application that knows
17:32
how to replicate it properly, replicate it. And
17:35
so in Cycle's world, everything is
17:38
active-active. We try to move things as little
17:41
as possible. And then if you are deploying a
17:43
container and you mark it as stateful, the platform
17:45
will treat that container separately, almost
17:47
like a VM.
17:49
So that way its data will always move with it.
17:51
Instead of creating, like,
17:53
a volume claim in Kubernetes, we will create
17:55
an attached LVM, like a raw LVM, and then migrate
17:59
that as needed. And if you need to scale up,
18:01
we'll scale up, but we're going to rely on the
18:04
underlying application to know how to replicate
18:05
that data. Because again, if we're talking about
18:07
like Mongo as an example, being able to have
18:09
collection-level locking as you do migrations
18:12
and things like that, as you have elections happening.
18:16
great. Why should a platform try to do that? In
18:18
general, what would be, like, a best bare metal
18:20
use case right now, given the trajectory of the
18:24
industry? I mean, I think that for companies that
18:28
are trying to reclaim ownership of their stack,
18:32
bare metal is a great way of doing that. At that
18:35
point, you're just consuming raw compute. The
18:38
different services that you're kind of locked
18:42
into are significantly less. I mean, I guess
18:45
it depends on what provider you're going with
18:46
and things like that. But for the most part,
18:48
your pricing is going to be significantly more
18:49
stable. For example, we just signed a partnership
18:52
today to be announced soon that includes 100
18:55
terabytes of bandwidth per server on
18:57
a 10-gig link. And, you know, I mean,
18:59
when you start talking about 100 terabytes
19:01
of bandwidth
19:03
per, you know, VM or whatever on
19:07
AWS, you're talking about a huge bandwidth bill.
19:10
So the fact that some of these
19:13
bare metal providers are including 100
19:15
terabytes per server
19:18
just out of the box with normal pricing is wild.
19:21
And maybe I should take my earlier response
19:25
back and instead talk about performance and
19:28
performance density, right? Because, like, we as
19:32
a company, we've been buying some more bare metal
19:33
recently. We're getting ready to launch
19:35
a European control plane, so we're buying more
19:37
infrastructure for that control plane. And the
19:40
fact that we were able to buy physical machines
19:43
with 24 physical cores for $280 a month with
19:50
dual 10-gig links and 100 terabytes of bandwidth
19:53
included and 192 gigabytes of memory each and
19:58
RAID 1 with terabyte NVMe drives. Like, I mean, those
20:04
same specs would be, I mean, I don't know the
20:08
exact number, but I'm guessing probably three
20:10
to four times over at AWS. And then you'd still
20:12
have the virtualization overhead on top of that.
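Editor's sketch: the figures quoted above can be turned into a quick back-of-the-envelope comparison. The Python below uses only numbers from the conversation ($280 a month, 24 physical cores, 192 GB of memory, 100 TB of included bandwidth, and the "three to four times" guess, which the speaker says is not an exact number). The per-GB egress rate and the 1 TB = 1000 GB conversion are placeholder assumptions for illustration, not quoted prices from any provider, and the function names are hypothetical.

```python
# Figures quoted in the conversation for one bare metal server.
BARE_METAL_MONTHLY_USD = 280.0
PHYSICAL_CORES = 24
MEMORY_GB = 192
INCLUDED_BANDWIDTH_TB = 100

# ASSUMPTION: placeholder metered egress rate in USD per GB.
# This is NOT a quoted price; replace it with a real rate before use.
ASSUMED_EGRESS_USD_PER_GB = 0.09


def cost_per_core(monthly_usd: float, cores: int) -> float:
    """Monthly cost divided evenly across physical cores."""
    return monthly_usd / cores


def cost_per_gb_memory(monthly_usd: float, memory_gb: int) -> float:
    """Monthly cost divided evenly across gigabytes of memory."""
    return monthly_usd / memory_gb


def hyperscaler_range(monthly_usd: float, low: float = 3.0, high: float = 4.0):
    """Apply the speaker's rough 'three to four times' guess as a range."""
    return (monthly_usd * low, monthly_usd * high)


def metered_egress_cost(bandwidth_tb: float, usd_per_gb: float) -> float:
    """Cost of the same bandwidth if billed per GB (assumes 1 TB = 1000 GB)."""
    return bandwidth_tb * 1000 * usd_per_gb


if __name__ == "__main__":
    low, high = hyperscaler_range(BARE_METAL_MONTHLY_USD)
    print(f"per core:   ${cost_per_core(BARE_METAL_MONTHLY_USD, PHYSICAL_CORES):.2f}/month")
    print(f"per GB RAM: ${cost_per_gb_memory(BARE_METAL_MONTHLY_USD, MEMORY_GB):.2f}/month")
    print(f"3x to 4x guess: ${low:.0f} to ${high:.0f}/month")
    egress = metered_egress_cost(INCLUDED_BANDWIDTH_TB, ASSUMED_EGRESS_USD_PER_GB)
    print(f"100 TB at the assumed metered rate: ${egress:.0f}/month")
```

The point of the sketch is only that, under these assumptions, metered bandwidth can dominate the comparison well before compute does, which is why the included-bandwidth figure matters so much in the discussion.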
20:15
What do you think is the reason for that? Do
20:18
you think that these colo or server providers
20:22
are trying to compete with that market? Or what's
20:26
the reason for the discrepancy in pricing, you
20:28
think? Is it just because we're paying so much
20:31
for the control plane at AWS? That's my theory.
20:33
I don't actually know. It's wild that you can
20:37
buy that level of performance for that price.
20:41
And yeah, like, yes, one of the cons
20:43
that you typically get from most of these
20:46
kind of bare metal providers today
20:48
is that, from a network perspective, you
20:50
have way fewer PoPs, right? So your latency from
20:53
a network standpoint can be meaningfully higher
20:56
where if you're using AWS and GCP, I mean, you'll
20:59
have great latency to almost anywhere in the
21:00
world where some of these bare metal providers,
21:02
you might get one or two PoPs. Now, granted,
21:04
there are some like Equinix and Megaport that
21:06
that's what they do and they can still give you,
21:09
you know, a really solid network. So there's,
21:11
you know, pros and cons, but I don't know
21:14
if having 10-plus PoPs, you know, at GCP is also
21:18
worth a three X price increase on compute. I
21:21
don't know if that's the reason why, but I think
21:24
the network is like the network resiliency is
21:28
probably the one con to bare metal today. Yeah.
21:31
I guess it matters what your workloads are if
21:33
you, if you're latency sensitive, but yeah. Like
21:36
if you're doing high traffic bidding applications
21:38
where you need to bid for creative or something,
21:41
I could see you needing to have that low latency
21:44
for that. But I don't think that most people,
21:47
especially if you're serving websites or serving
21:49
a web app, I don't think for the most part it
21:51
would matter. But also, I mean, Level 3 is in
21:54
Northern Virginia, in the same area as, you know,
21:57
you could probably buy colo space. Again, I'm sure
22:00
the Northern Virginia area is expensive just because
22:03
it's the DC metro area, but you could probably buy colo
22:07
servers in that area competitively against AWS's
22:10
rates as well. Well, it's kind of funny that
22:12
you mentioned that. Because of AWS's presence
22:16
there, we have so many companies that when they're
22:19
switching to bare metal and using Cycle as part
22:21
of that process, we have companies that specifically
22:23
need bare metal in that region. And it's like,
22:26
hey, you can deploy bare metal anywhere in the
22:28
world. And they're saying, no, no, no, no. I
22:30
need to connect to Supabase or some other
22:32
service running on top of AWS. So even though
22:35
they're trying to get off of AWS as a cloud provider,
22:37
they still need to be in proximity to it because
22:40
so many of the services they're communicating
22:42
with are still sitting right
22:45
on top of that infrastructure. So we have companies
22:47
that did initial test deployments and they were
22:49
deployed to somewhere around New York City or
22:51
somewhere around Atlanta, and they would find out
22:55
that the extra 12 to 15 milliseconds of latency was too
22:59
much for them. And so they had, so like, it's
23:01
weird how there seems to be such a, and it's
23:06
only on US East 1. We don't have any other companies
23:09
that are like, oh, US East 2 or US West 1 is
23:12
where I need to be. But for whatever reason,
23:14
US East 1, we just have an extreme number of
23:17
companies that are like, I need to be close to
23:19
that. And it's kind of coincidental because that's
23:21
the one that always has an outage or at least
23:23
it seems. And it's the control plane for the
23:25
rest of the services. So what I have found is
23:28
when US East 1 goes down, if you have services
23:30
that are in US West 2, maybe like an auto-scaling
23:34
group that needs to scale up, you know, new instances,
23:37
it's not able to because it's not able to reach
23:39
out to the control plane, which lives in US East
23:40
1. Yeah. There's still a marriage there, as much
23:44
as they maybe try to say that they're isolated.
23:47
They're not. Yeah, that is something I've absolutely
23:49
noticed as well. When US East 1 goes down, you
23:52
just have to assume that everything else is also
23:54
impacted. Okay, so speaking about Cycle versus
23:58
Kubernetes, what do you think that Kubernetes
24:00
gets right? And where does it become self-inflicted
24:04
pain? So with Kubernetes versus Cycle, and when
24:10
one is kind of a better fit versus the other,
24:13
it really kind of comes down to the needs of
24:15
individual companies from a customization perspective.
24:18
For companies that are really, as I mentioned
24:21
earlier, kind of engineering heavy, that are
24:23
like, hey, we're mainly developers. We're very
24:27
microservice heavy. We might need object storage
24:32
or like the requirements that they have are pretty,
24:34
let's say, you know, commoditized, right? They
24:37
need some disk. They need some, you know, object
24:39
storage. They need kind of the primitives. That's
24:42
where Cycle really shines. But at the same time,
24:44
if you have companies that are like, hey, we
24:46
need people to run on very specific hardware
24:50
with very specific kernel drivers and, you know,
24:53
things like that. That's where Kubernetes is,
24:55
you know, a better fit from that perspective
24:57
because with Cycle, we ship a standardized OS
24:59
to everyone. We don't provide SSH access. That
25:03
OS is made to be as dumb and as tiny as possible.
25:06
You know, it's 40 megabytes in size. So that
25:08
means that naturally it's going to be kind of
25:10
limited in terms of what it can do. Like our
25:11
goal is to target 80%, right? But if you have
25:15
like really specific infrastructure you need
25:16
to run or, you know, supercomputing applications
25:18
or some of these big AI models that people are
25:22
buying a whole bunch of really sophisticated infrastructure
25:24
for these days. Cycle today is not built for
25:28
"bring anything." We are built for more of "bring
25:31
your typical x86 server, you know, with a few drives
25:34
in it, we'll get it up and running, we'll get you
25:36
what you need." So it all comes back to whether
25:38
companies need just basic primitives and they
25:41
don't want to be DevOps, or they have a very
25:44
specific list of requirements that, you know,
25:47
extend from network to hardware to maybe the
25:52
OS, the host OS that those nodes need to run. People
25:55
who are coming to Cycle are looking for an opinionated
25:57
answer. It's why most of the companies that are
25:59
on Cycle today are companies that left Kubernetes for Cycle.
26:01
Like, you know, they went that route, they tested
26:03
it. Many of them were on Kubernetes for years
26:06
before they decided. Like, I think that it's natural
26:09
for all of us techies to want to play with
26:12
new technologies, but at some point people are
26:14
like, you know what, I don't want to play
26:15
with all the bells and whistles anymore. I just
26:17
want it to work. Yeah. And it's something that, you
26:20
know, I kind of talk about often is, you know,
26:23
most people, when they get their first smartphone,
26:25
or at least, you know, for me and most of my friends
26:28
as we were growing up, your first smartphone is
26:30
an Android, right? And I know I'm probably about
26:32
to piss off a whole bunch of people, but,
26:34
you know, typically your first phone is an
26:36
Android phone because you want to
26:37
customize it, you want to, you know, play with it,
26:39
you want to make it yours. Yeah. Yeah, exactly.
26:42
But eventually people are like, I don't really
26:44
care about that anymore. I just want a phone that
26:46
works. I don't want to, you know, like, let me change
26:49
my background and I don't care. And then they
26:50
switch to an iPhone. And so that's kind of
26:53
what I've kind of always said about Cycle. Like,
26:56
companies, you know, they're going to go play
26:57
with Kubernetes. They're going to go play with
26:58
Rancher. They're going to go kind of test out,
27:00
you know, the latest and greatest. They want
27:03
something where they can change every variable.
27:05
But at some point, it no longer becomes about
27:07
changing variables and playing with the latest
27:10
and greatest of everything. They just want to
27:11
get back to what they want to build. And so that's
27:14
where we built Cycle for companies that
27:15
are like, yeah, just give me a standard opinionated
27:19
platform and I'll just work with that. And so,
27:22
yeah. Yeah, it makes sense. So, okay, wrapping
27:26
up, what's one thing you'd do first if you had
27:29
to modernize on-prem or hybrid without blowing
27:32
up the team? This is going to go back to a conversation
27:35
we had, you know, 10 minutes ago, but containerize,
27:38
right? Like, you know, I'm a big fan of containerizing
27:42
everything. It makes applications typically way
27:44
more portable. And if your goal is to chase portability,
27:48
so that way you can kind of move to whether it's,
27:51
you know, VMs or bare metal or on-prem, I mean,
27:55
I guess that's bare metal too. Or mixture. Being
27:57
able to standardize containers is, you know,
27:59
I think why I'm in such favor of them. Yeah,
28:02
I think that would be my first step. I think
28:05
that would absolutely be my first step from that
28:07
standpoint. Now, if the question is, what if
28:09
I'm already containerized and now I'm trying
28:11
to like go towards, you know, bare metal? I think
28:14
the next step there is, you know, look at your
28:17
dependencies. If you're in a hyperscaler and
28:20
you're using things like Lambda and S3 and things
28:23
like that, your next step is kind of to decide
28:25
what services I'm going to try to replicate on
28:27
top of bare metal. What am I going to be okay
28:29
having still in the cloud? And do the primitives
28:33
that I need allow me to be on bare metal? Because
28:35
again, if you are tightly integrated into Lambda
28:37
and some of these other things, bare metal might
28:39
not be the best fit for you. But if you're just
28:41
running a whole bunch of containers, then you're
28:43
kind of in a good spot. So assuming you're containerized,
28:46
I think then the next part of that is just
28:48
evaluating what third-party services you need
28:50
and whether you can bring those in-house or
28:52
whether you're okay with those staying in the
28:53
cloud. Have you noticed an increase in AI workloads
28:59
and people building containers around AI-specific
29:02
workloads using Cycle? So it's kind of interesting
29:06
for how much, you know, you log into LinkedIn
29:09
and Reddit and things and everything's about
29:11
AI. So many of the companies that are on Cycle
29:13
today, like, yes, you know, a number of them
29:16
do have some AI component in what they're doing.
29:19
But I think we only have one or two clients that
29:21
are running like true models with GPUs. Yeah.
29:25
I didn't mean like API calls to OpenAI, but
29:27
yeah, like actual modeling locally. Yeah. Yeah,
29:30
so Cycle out of the box supports all NVIDIA data
29:35
center class GPUs. And so that's one of the drivers
29:37
that we keep up to date. So we support that.
29:39
But we were hoping that we'd have more AI, you
29:43
know, focused applications on top of the platform
29:45
today. But I think that it's one of these kind
29:47
of interesting things where it's kind of like
29:48
with IoT and then crypto and now AI. Like, yes,
29:53
you know, there are hype waves. And I'm not saying
29:55
that AI is, you know, equivalent to
29:59
IoT or crypto or things like that, but as we see
30:02
these hype waves happen, we always see people
30:06
who kind of love bleeding-edge technology chase
30:10
them, right? Like, so many of my friends that were
30:12
like really gung-ho on, on crypto are now the
30:15
same people that are really deep into, you know,
30:18
AI, and then the next thing that comes out, they're
30:20
going to chase that as well. Like, it's what they
30:22
do. And those people typically are the same people
30:25
that like changing all the, you know, the dials
30:30
and they like playing and tweaking. And so that's
30:32
where they kind of, I think those people, they
30:34
don't want an opinionated platform like Cycle.
30:35
They want something where they can customize
30:37
everything. It's kind of like the book of, you
30:40
know, Crossing the Chasm, right? Cycle is there
30:43
for the majority in the middle. They're like,
30:45
hey, no, I just want to run stuff. I want to
30:46
build stuff. But then, you know, those super
30:49
early adopters that are always chasing new technology,
30:51
they want to be, they want more customization
30:54
than Cycle will give them. And I guess to an
30:57
earlier point, that's where Kubernetes probably
30:58
makes more sense for them. But when applications
31:00
kind of become more standardized and they just
31:02
want to run them, then that's where Cycle starts
31:04
to win. Makes sense. What is Cycle? Why would
31:07
someone choose Cycle? What would be a reason
31:10
for choosing Cycle? Yeah, so for an organization
31:13
that is really engineering heavy, a company that
31:17
wants to spend more time focused on building
31:19
versus maintaining. And when I say building versus
31:22
maintaining, I'm talking about building the actual
31:24
applications, the platforms, the services, etc.
31:26
Less on maintaining the host OS, the host kernel,
31:29
the underlying infrastructure. That's where Cycle
31:32
really shines for these organizations, especially
31:34
teams where they have really talented developers,
31:38
but these developers really don't have interest
31:40
in becoming DevOps engineers as part of that
31:42
process is kind of where Cycle shines. Makes
31:45
sense. Okay, so wrapping up, where can people
31:48
read more about you, find out about you, read
31:50
more about Cycle? Where should they go? Yeah,
31:52
so to learn more about Cycle, you can visit our
31:54
website, which is Cycle.io. We also have a Slack
31:57
community that we have a lot of developers and
31:59
DevOps engineers that hang out in. That's
32:02
slack.cycle.io. And then for people who want to maybe
32:05
learn more about me, I guess linkedin.com/in/jakewarner. Like it's kind of weird
32:11
to hand out my, or to use my, LinkedIn as that,
32:14
as the primary source. But I think I keep it
32:17
more up to date than anything else I do these
32:18
days. That makes sense. I'll also put links for
32:21
all of that in the show notes as well. So make
32:23
it easier. Sounds good. Awesome. Well, thank
32:27
you, Jake, so much for coming on and telling
32:29
me more about Cycle and containerization. Really
32:31
appreciate it. Yeah, really, really appreciate
32:34
you having me on, Brian. Really enjoyed the conversation.
32:37
Always fun to be able to, you know, as you mentioned
32:40
earlier about racking and stacking, meet people
32:42
of like mind in terms of how we got to where
32:44
we are today. Awesome. Thanks. All right. That
32:48
was my conversation with Jake Warner from Cycle.io.
32:50
My biggest takeaway from this one is that
32:53
the cloud versus bare metal debate is kind of
32:56
the least interesting version of the conversation.
32:59
The better question is, what are you actually
33:01
trying to optimize for? Because sometimes the
33:04
answer is cloud. Sometimes it is Kubernetes.
33:07
Sometimes it is managed services all the way
33:09
down because your team does not have the time,
33:12
people, or business reason to own more of the
33:15
stack. But sometimes the answer is different.
33:18
Sometimes the problem is cost. Sometimes it is
33:21
performance density. Sometimes it is data sovereignty.
33:24
Sometimes it is compliance. Sometimes it is latency.
33:27
Sometimes it is the fact that your developers
33:29
just want to ship software and your platform
33:32
team is slowly drowning in a pile of abstractions
33:35
that were supposed to make everything easier.
33:38
That is the part I think is worth paying attention
33:40
to. A lot of teams do not necessarily want to
33:43
go backwards. They want to go back to owning
33:46
the parts that matter without reintroducing all
33:49
of the old pain. And that is a much more useful
33:51
framing than pretending there is one correct
33:54
infrastructure model for everybody. I also like
33:57
Jake's point about opinionated platforms. Because
34:00
as engineers, we love flexibility. We love knobs.
34:04
We love knowing that technically, if we really
34:06
wanted to, we could customize every piece of
34:09
the stack. But there is a cost to that. Every
34:12
knob becomes a decision. Every decision becomes
34:14
something to document. Every exception becomes
34:18
something to support. And eventually, the platform
34:21
that was supposed to help teams move faster becomes
34:24
another system that needs its own platform team
34:27
just to keep it sane. That does not mean
34:29
opinionated platforms are always better. If you
34:32
need very specific kernel drivers, specialized
34:35
hardware, deep customization, or you are doing
34:39
weird edge case infrastructure work, then Kubernetes
34:42
or a more flexible platform may absolutely be
34:45
the better fit. But for a lot of teams, especially
34:48
teams that mostly need containers, VMs, networking,
34:52
storage, load balancers, and a sane way to deploy
34:55
applications, there is a real argument for fewer
34:59
choices and better defaults. And honestly, that
35:02
is probably where a lot of infrastructure conversations
35:04
are heading. Not everything should be cloud.
35:07
Not everything should be Kubernetes. Not everyone
35:09
should move back to bare metal. More like, what
35:12
complexity is actually helping us? And what complexity
35:16
are we just carrying because the industry told
35:19
us this is what modern infrastructure is supposed
35:22
to look like. I'll have links to Jake,
35:25
Cycle.io, and their Slack community in the show notes.
35:28
If you enjoyed this conversation, follow or subscribe to
35:31
Ship It Weekly wherever you listen to podcasts.
35:34
It helps the show, and it makes sure you get
35:36
both these conversation episodes and the weekly
35:39
DevOps, SRE, platform, cloud, and security news
35:42
recaps. You can also find the show notes and
35:45
links over on shipitweekly.fm. Thanks for listening,
35:48
and I'll see you later this week.
For this Conversations episode, the thing I kept coming back to is that bare metal is not really the story.
The story is that a lot of teams are tired of paying for complexity twice.
First in the bill. Then again in engineering time.
The cloud made infrastructure dramatically easier in a bunch of ways. I do not think anyone who has actually racked servers, waited on hardware, dealt with colo networking, or tried to manage a random pile of machines should pretend the old world was simple. It was not. There is a reason the cloud won. It gave teams APIs, managed services, fast provisioning, easier experimentation, and a way to stop treating every infrastructure change like a procurement project.
But now a lot of teams are far enough into the cloud era that they are seeing the second-order effects.
The bill is one part of it. Bandwidth, managed service pricing, data transfer, always-on environments, overprovisioned Kubernetes clusters, and a bunch of “we’ll clean that up later” infrastructure that somehow becomes permanent.
But the bigger part, at least to me, is the platform complexity tax.
You start with a simple goal. Run the app. Deploy it safely. Scale it. Keep it reliable. Keep cost under control.
Then a few years later, you have Kubernetes clusters, managed databases, object storage, IAM policies, Terraform modules, Helm charts, GitOps controllers, autoscalers, service meshes, CI/CD runners, observability agents, secrets systems, and a platform team trying to make all of that feel like one coherent developer experience.
And sometimes that is the right tradeoff.
But sometimes it becomes this weird situation where the cloud was supposed to make infrastructure easier, and now the team has built a private bureaucracy on top of managed services. Every abstraction has an owner. Every exception has a support path. Every “simple” request turns into a thread with five teams, three Terraform repos, and someone saying, “I think this is handled by the platform.”
That is why I liked this conversation with Jake.
He is not making the lazy “cloud is dead” argument. That argument is boring, and usually wrong. Cloud is not going anywhere. Kubernetes is not going anywhere. Managed services are not going anywhere. For a lot of teams, they are still the correct answer.
But the more useful question is: what are you actually optimizing for?
If the answer is speed of experimentation, global managed services, low operational ownership, and you have the budget for it, the cloud might be the right answer.
If the answer is deep customization, unusual workloads, custom kernel requirements, or very specific infrastructure control, Kubernetes might still be the right answer.
But if the answer is predictable cost, performance density, data sovereignty, simpler primitives, and giving developers a sane path to deploy without building a giant platform engineering machine around it, then private cloud and bare metal start getting a lot more interesting.
Not because bare metal is nostalgic.
Not because racking servers was secretly fun.
It was fun for about four hours. Then it became inventory, cabling, firmware, network weirdness, drive failures, and “who has the crash cart?”
What is interesting now is whether teams can get some of the economic and ownership benefits of bare metal without taking back all of the old operational pain.
That is where platforms like Cycle fit into the conversation. Not as “everyone should use this,” but as an example of a broader shift: teams want the infrastructure underneath them to be simpler, more predictable, and less dependent on a massive pile of glue code.
Jake’s point about opinionated platforms also stood out to me.
As engineers, we love optionality. We like knowing we can customize the thing. We like having access to every dial, every plugin, every escape hatch, every config field, every weird little setting that might matter someday.
But optionality has a cost.
Every knob is a decision. Every decision becomes tribal knowledge. Every deviation becomes something the platform team has to support. And eventually, the platform becomes less of a paved road and more of a choose-your-own-adventure book where half the endings page the on-call engineer.
That does not mean opinionated platforms are always better. Sometimes opinionated tools box you in. Sometimes they hide too much. Sometimes they are great until you hit the edge of the product and suddenly the workaround is worse than the original problem.
But there is a real argument for boring primitives and good defaults.
Especially for teams that mostly need to run containers, maybe some VMs, expose services, attach storage, manage networking, and deploy applications without turning every developer into a part-time infrastructure engineer.
The other part I thought was interesting was the gravity of cloud ecosystems. Even when teams want to move away from a hyperscaler, they may still need to stay near it. Jake mentioned companies wanting bare metal near us-east-1 because the services they depend on are still there. That feels very real. Infrastructure decisions are rarely clean. You can move the workload, but you may not move all the dependencies, all the latency requirements, all the third-party services, or all the operational habits built around the old platform.
That is the part that usually gets missed in the “should we leave the cloud?” conversation.
It is not just, can we run this somewhere else?
It is, what does this thing depend on?
What data does it need?
What services does it call?
What latency does it assume?
What operational model does the team already understand?
What managed services are we actually using, and which ones are just convenience glue we forgot became critical?
That is where the real work is.
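One way to make those questions concrete is to write the dependencies down and sort them. Here is a minimal sketch in Python. It is not something Jake described or something Cycle ships. The field names, the three buckets, the sorting rules, and the example entries are all assumptions for illustration, and a real audit would need far more nuance.

```python
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    """Hypothetical buckets for a dependency audit."""
    MOVE = "bring in-house"
    KEEP = "leave in the cloud"
    REVIEW = "needs a closer look"


@dataclass
class Dependency:
    name: str                     # illustrative label, not a real inventory
    has_self_hosted_option: bool  # is there an equivalent the team could run?
    team_can_operate: bool        # does the team have the skills and time?
    latency_sensitive: bool       # does the workload assume it is close by?


def plan_for(dep: Dependency) -> Plan:
    """Sort one dependency into a bucket using deliberately simple rules."""
    if dep.has_self_hosted_option and dep.team_can_operate:
        return Plan.MOVE
    if dep.latency_sensitive:
        # The service stays in the cloud but the workload needs to be near
        # it, so where the workload lands is still an open question.
        return Plan.REVIEW
    return Plan.KEEP


def summarize(deps: list[Dependency]) -> dict[Plan, list[str]]:
    """Group dependency names by the bucket they fall into."""
    buckets: dict[Plan, list[str]] = {plan: [] for plan in Plan}
    for dep in deps:
        buckets[plan_for(dep)].append(dep.name)
    return buckets


# Made-up example entries, only to show the shape of the exercise.
inventory = [
    Dependency("object storage", True, True, False),
    Dependency("serverless functions", False, False, True),
    Dependency("transactional email", False, False, False),
]

if __name__ == "__main__":
    for plan, names in summarize(inventory).items():
        print(f"{plan.value}: {', '.join(names) or 'none'}")
```

The point of the sketch is the middle bucket. Anything that cannot move but still needs low latency is the kind of dependency that keeps a workload parked near a specific cloud region, even after the rest of it has left.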
So my takeaway from this episode is not “move back to bare metal.”
It is more like: periodically re-check your assumptions.
The cloud decision you made five years ago might still be right. Or it might be right for some workloads and completely wrong for others. Kubernetes might still be the right foundation. Or it might be an expensive control plane for apps that only needed a much simpler runtime. Managed services might be saving your team. Or they might be quietly locking you into cost and operational patterns nobody has revisited in years.
Infrastructure choices age.
Team size changes. Compliance changes. Cost pressure changes. Latency requirements change. The talent on the team changes. The business changes.
And when that happens, “this is how we’ve always done it” is not an architecture strategy. It is just drift with better branding.
That is why I think this private cloud and bare metal conversation is coming back. Not because the industry wants to rewind the clock, but because teams are trying to find a better balance between control and convenience.
More ownership, without becoming hardware janitors.
Better cost predictability, without building everything from scratch.
More performance, without turning the platform into a science project.
More developer self-service, without pretending every team wants to become DevOps experts.
That is the useful middle ground.
And honestly, that is where a lot of the best infrastructure conversations live. Not in declaring one model dead and another one the future, but in being honest about the tradeoffs, the cost, the people, and the operational reality after the architecture diagram becomes production.