Why Mid-Market Companies Are Bringing AI Back In-House

Something unusual is happening in enterprise IT.

After a decade of relentless migration to the public cloud — after the conference keynotes, the lift-and-shift projects, the promises of infinite scalability — a growing number of mid-market companies are quietly moving their workloads back.

Not all of them. Not everything. But enough to make the trend impossible to ignore.

The numbers tell a striking story. Roughly 70% of organizations are either actively repatriating workloads to private infrastructure or planning to do so within the next year.

Among mid-market companies specifically, the figure is even more pronounced: 97% plan to shift select workloads away from public cloud environments.

These are not fringe players or cloud skeptics. They are companies that went all-in on public cloud, ran the experiment for three to five years, and arrived at the same uncomfortable conclusion: the economics do not work for every workload, and the tradeoffs are steeper than the sales pitch suggested.

This is not a rejection of cloud computing. It is a correction.

The Bill Comes Due

The original promise of public cloud was compelling, especially for mid-market companies without large IT departments. No capital expenditure. No hardware to maintain. Pay only for what you use.

For many workloads, that model still holds. Bursty applications, experimental projects, seasonal demand — these are genuinely good fits for elastic, consumption-based pricing.

But AI changed the equation. Inference workloads, model training on proprietary data, and always-on GPU clusters do not behave like web applications that scale up on Black Friday and wind down by Tuesday.

They are steady-state, compute-heavy, and expensive. An NVIDIA H100 instance on AWS runs around $3.90 per GPU per hour on-demand. Azure charges closer to $7.00.

For a mid-market company running even a modest AI workload around the clock, the annual bill can climb into seven figures before anyone in finance thinks to ask questions.
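As a rough check on that claim, the sketch below multiplies the on-demand rates quoted above by the hours in a year. The cluster sizes of 8 and 32 GPUs are illustrative assumptions, not figures from any vendor or survey, and the calculation ignores storage, networking, and any committed-use discounts.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored

# On-demand rates per GPU-hour as quoted in this article.
RATES = {"AWS": 3.90, "Azure": 7.00}


def annual_on_demand_cost(gpus: int, rate_per_gpu_hour: float) -> float:
    """Cost of running `gpus` GPUs around the clock for one year."""
    return gpus * rate_per_gpu_hour * HOURS_PER_YEAR


if __name__ == "__main__":
    # Cluster sizes are hypothetical examples, not data from the article.
    for gpus in (8, 32):
        for provider, rate in RATES.items():
            cost = annual_on_demand_cost(gpus, rate)
            print(f"{gpus:>2} GPUs on {provider}: ${cost:,.0f} per year")
```

Under these assumptions, a 32-GPU cluster crosses $1 million a year at the AWS rate and approaches $2 million at the Azure rate.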

And then AWS raised GPU prices 15% in January 2026, bumping certain instances from $34.61 to $39.80 per hour overnight — on a Saturday, no less.

Cloud costs have become the second-largest expense at midsize IT companies, trailing only labor. Managing that spend is now the primary challenge for 82% of cloud decision-makers.

Part of the problem is visibility: over 20% of organizations admit they have little to no idea how much different aspects of their business actually cost in the cloud.

The rest of the problem is waste. Studies estimate that 28% to 35% of cloud spending goes to idle resources, misconfigurations, and orphaned storage. An industry report from Harness projected $44.5 billion in cloud infrastructure waste for 2025 alone.

The poster child for this reckoning is 37signals, the company behind Basecamp and HEY. They spent $3.2 million a year on AWS, bought roughly $700,000 in Dell servers, and within a year had recouped the hardware investment entirely.

Their projected savings over five years exceed $10 million.

As their CTO David Heinemeier Hansson put it, the cloud premium only makes sense if your workloads are genuinely unpredictable. For stable operations, you are paying a tax on someone else's flexibility.

Mid-market companies are arriving at the same math, just with less public fanfare.
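The 37signals figures make that math easy to reproduce. The sketch below is a back-of-envelope estimate, not the company's own accounting: it assumes the projected five-year savings of $10 million accrue evenly and are already net of ongoing operating costs, which the figures above do not spell out.

```python
def payback_months(hardware_cost: float, annual_net_savings: float) -> float:
    """Months needed for savings to cover the up-front hardware purchase."""
    if annual_net_savings <= 0:
        raise ValueError("annual_net_savings must be positive")
    return 12 * hardware_cost / annual_net_savings


# Figures quoted above; spreading the savings evenly is an assumption.
HARDWARE_COST = 700_000
FIVE_YEAR_SAVINGS = 10_000_000
ANNUAL_SAVINGS = FIVE_YEAR_SAVINGS / 5  # $2.0M per year

if __name__ == "__main__":
    months = payback_months(HARDWARE_COST, ANNUAL_SAVINGS)
    print(f"Estimated payback: {months:.1f} months")
```

On those assumptions the hardware pays for itself in roughly four months, which is consistent with the company recouping its investment within a year.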

Sovereignty Is No Longer Optional

Cost would be reason enough. But the regulatory landscape has made the conversation urgent in ways that spreadsheet analysis alone could not.

The global patchwork of data sovereignty rules has become genuinely difficult to navigate. The EU's General Data Protection Regulation was the opening act.

Since then, China's Personal Information Protection Law, India's Digital Personal Data Protection Act, and a proliferation of U.S. state-level privacy laws — California's CCPA, Virginia's CDPA, Colorado's CPA, and others — have created a compliance maze that multiplies with every jurisdiction a company touches.

The EU's Digital Operational Resilience Act, which took effect in January 2025, now requires financial institutions to manage ICT risk more directly, including how and where they store data.

For a mid-market company in healthcare, financial services, or legal, the question is no longer theoretical. Where does your data physically reside? Who has access to it? Under which country's laws can it be subpoenaed?

When your AI models train on client records, patient data, or transaction histories, the answers to these questions become matters of regulatory compliance, not just IT preference.

Sixty-five percent of business leaders report that they have changed their cloud strategies in response to geopolitical pressures, and 75% express concern about the jurisdictional risks of storing data in global cloud environments.

Gartner has taken to calling this trend "geopatriation" rather than repatriation — a distinction that reflects the regulatory, rather than purely economic, motivations driving the shift.

Their analysts forecast that sovereign cloud IaaS spending will reach $80 billion in 2026, with roughly 20% of current workloads shifting from global to local providers.

At least 41% of organizations have already begun repatriating some data from public cloud to on-premises or local environments. For companies handling sensitive data in regulated industries, that number will only climb.

The AI Infrastructure Gap

There is a particular irony in the current moment. The technology driving the most demand for compute — artificial intelligence — is also the technology making public cloud economics least defensible for sustained workloads.

AI workloads are not like traditional cloud applications. They require dedicated GPU time, massive data throughput, and consistent availability.

They do not lend themselves to the spot-instance, burst-when-needed model that makes public cloud pricing attractive.

A company fine-tuning a large language model on its own data needs predictable, sustained access to GPU clusters.

What it gets instead, on the major clouds, is volatile pricing (AWS alone averaged 197 distinct price changes per month on its spot GPU instances in 2025) and, in some cases, outright unavailability.

The hyperscalers are aware of this tension. AWS cut H100 pricing by roughly 44% in mid-2025, a move that acknowledged the growing competitive pressure from GPU-focused providers offering 50% to 70% cost savings over the Big Three.

But a price cut on a consumption model still leaves companies exposed to the fundamental problem: you are renting, not owning, and the landlord sets the terms.

For mid-market companies, this creates a strategic vulnerability. They are large enough to have real AI workloads — customer support automation, document processing, internal knowledge systems, compliance monitoring — but not large enough to negotiate the kind of reserved-instance deals that Fortune 500 companies extract from cloud vendors.

They sit in a pricing no-man's land: too big for the cloud to be cheap, too small for the cloud to be negotiable.

Gartner projects that 50% of critical enterprise applications will reside outside centralized public cloud locations through 2027. That prediction, made in late 2023, looks increasingly conservative.

The 90% of organizations that Gartner expects to adopt hybrid cloud approaches by 2027 are not doing so because hybrid is fashionable.

They are doing it because certain workloads — particularly AI workloads processing sensitive data — simply do not belong on shared, multi-tenant infrastructure where pricing, performance, and jurisdiction are outside their control.

What Private Actually Means Now

The conversation about private infrastructure has matured considerably since the last cycle. A decade ago, "private cloud" often meant a company buying servers, racking them in a closet, and hoping someone on staff knew enough Linux to keep things running.

That model was fragile, expensive, and rightly gave way to the public cloud wave.

What is emerging now is different.

The new generation of private AI infrastructure is purpose-built: GPU-dense racks designed for AI workloads, immersion cooling systems that cut energy use by 40% or more while extending hardware life by roughly 30%, and facilities sited for reliable power and low-latency connectivity rather than proximity to a company's headquarters.

These are not vanity projects. They are engineering responses to specific technical and economic constraints.

The energy dimension deserves particular attention. Traditional air-cooled data centers are reaching their thermal limits as GPU density increases.

A single rack of modern AI accelerators can draw 40 to 100 kilowatts — far beyond what conventional cooling was designed to handle.

Immersion cooling, which submerges hardware in thermally conductive fluid, is not a novelty anymore. It is becoming a practical necessity for facilities running AI workloads at scale.

It carries real operational advantages: lower power bills, quieter facilities, and hardware that degrades more slowly because it runs cooler.
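To put those rack figures in perspective, here is a rough sketch of annual energy draw. It assumes the rack runs at its rated load all year, and the electricity price is a hypothetical placeholder rather than a quoted tariff. Cooling overhead is deliberately left out.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def annual_rack_energy_kwh(rack_kw: float, utilization: float = 1.0) -> float:
    """Energy drawn by the IT load of one rack over a year, in kWh."""
    return rack_kw * utilization * HOURS_PER_YEAR


def annual_energy_cost(rack_kw: float, price_per_kwh: float,
                       utilization: float = 1.0) -> float:
    """Electricity cost of that energy at a flat price per kWh."""
    return annual_rack_energy_kwh(rack_kw, utilization) * price_per_kwh


if __name__ == "__main__":
    HYPOTHETICAL_PRICE = 0.08  # USD per kWh; an assumption for illustration
    for rack_kw in (40, 100):  # range quoted above
        kwh = annual_rack_energy_kwh(rack_kw)
        cost = annual_energy_cost(rack_kw, HYPOTHETICAL_PRICE)
        print(f"{rack_kw} kW rack: {kwh:,.0f} kWh/year, about ${cost:,.0f}")
```

A 40 kW rack consumes about 350,000 kWh a year at full load and a 100 kW rack about 876,000 kWh, before any cooling overhead is counted.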

Geography matters too. The Texas data center market illustrates the supply-demand dynamics at play across the industry. Despite record-high absorption in the first half of 2025, supply continues to lag demand.

Vacancy rates have declined for two consecutive years, and 78% of planned construction capacity is pre-leased before the facilities come online.

For mid-market companies trying to secure reliable, U.S.-jurisdiction compute capacity, the window of easy availability is narrowing.

The Hybrid Reality

None of this means the public cloud is dying. Gartner forecasts $723 billion in worldwide public cloud spending for 2025, up 21% from the prior year. The cloud market will likely cross a trillion dollars by 2027.

Sid Nag, a Vice President Analyst at Gartner, has noted that cloud use cases continue to expand, with increasing focus on distributed, hybrid, and multi-cloud environments.

But growth in overall cloud spending and growth in repatriation of specific workloads are not contradictory trends. They are two sides of the same maturation process.

Companies are getting smarter about which workloads belong where. Development and testing environments, SaaS applications, and genuinely variable workloads will continue to thrive on public cloud.

But AI training on proprietary data, compliance-sensitive processing, and latency-critical inference — these are migrating to infrastructure where the company controls the hardware, the jurisdiction, and the bill.
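One way to make that split explicit is a simple placement rule. The sketch below is illustrative only: the Workload fields and the suggest_placement function are hypothetical names, and the rule encodes nothing more than the criteria described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    sensitive_data: bool     # proprietary, regulated, or client data
    latency_critical: bool   # e.g. real-time inference
    variable_demand: bool    # bursty, seasonal, or experimental usage


def suggest_placement(w: Workload) -> str:
    """Return "private", "public", or "review" for a workload."""
    if w.sensitive_data or w.latency_critical:
        return "private"
    if w.variable_demand:
        return "public"
    # Steady, non-sensitive workloads: compare costs case by case.
    return "review"


if __name__ == "__main__":
    examples = [
        Workload("LLM fine-tuning on client records", True, False, False),
        Workload("dev/test environment", False, False, True),
        Workload("nightly batch reports", False, False, False),
    ]
    for w in examples:
        print(f"{w.name}: {suggest_placement(w)}")
```

A real decision would also weigh cost, staffing, and contract terms. The point of the sketch is only that the criteria can be written down and applied consistently.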

For mid-market companies, the calculus is particularly clear. They cannot afford to waste 30% of their cloud budget on idle resources and misconfigurations. They cannot absorb unpredictable GPU pricing swings that throw off quarterly forecasts.

They cannot explain to regulators or clients why sensitive data sits on shared infrastructure in a jurisdiction they did not choose. And they cannot wait for the hyperscalers to solve these problems, because the hyperscalers' business model depends on those problems persisting.

The companies that are moving fastest are not the ones abandoning cloud entirely. They are the ones building a deliberate split: public cloud for what it does well, private infrastructure for what has to stay under their control.

That split is not a retreat. It is the first genuinely strategic infrastructure decision many mid-market companies have made since they rushed to the cloud in the first place.

The repatriation trend will not reverse. If anything, as AI workloads grow more central to operations and regulatory scrutiny intensifies, the share of compute running on private infrastructure will only increase.

The question for mid-market companies is not whether to bring workloads home. It is whether they will do it on their own terms, or wait until costs and compliance force the decision for them.
