On OpenRouter [1], one of the largest platforms for AI model access, the second most popular model right now is StepFun: Step 3.5 Flash (free) [2]. 196 billion parameters. Mixture-of-Experts architecture. 256,000-token context window. Price: $0.00 per million input tokens. $0.00 per million output tokens. The word “free” is right there in the name. Completely free.
So who is paying for it?
“If you are not paying for it, you’re not the customer; you’re the product being sold.”
- Andrew Lewis (2010) [3]
If someone is giving you inference on a 196B-parameter model for free, the real question is not whether your data is being monetized, but how. Every prompt you send to a free service has to be treated as potentially readable, whether for training, analytics, or simple resale.
And this goes beyond free models. The bigger question is this: who do you trust with your data while it is being processed?
When Credentials End Up on the Internet
This is not a theoretical question. It is measurable.
The OpenClaw Exposure Watchboard [4] currently documents more than 390,000 publicly reachable AI instances on the internet in real time. Many of them expose leaked credentials. They are hosted on Alibaba Cloud, Tencent Cloud, DigitalOcean, Hetzner, and dozens of other providers. The table reads like a who’s who of global cloud infrastructure, and at the same time like a catalogue of negligence.
Some examples from the dataset (as of March 2026):
| Location | Provider | Credentials leaked | Known threat actors |
|---|---|---|---|
| Germany | Hetzner | Yes | APT28, APT29, Lazarus Group, Salt Typhoon |
| Singapore | Tencent Cloud | Yes | APT37, Cobalt Group, Kimsuky |
| United States | DigitalOcean | Yes | APT28, APT41, Volt Typhoon |
| Hong Kong | Lucidacloud | Yes | APT15, APT41, Salt Typhoon, Sandworm |
| Finland | Hetzner | Yes | APT28, APT29, Lazarus Group, Turla |
These are not hypothetical scenarios. These are real, externally reachable systems with leaked access credentials and threat actor associations that include state-linked groups: APT28 (Russian military intelligence), Lazarus Group (North Korea), Volt Typhoon (China), Salt Typhoon (China).
With a properly isolated on-premise system, this simply cannot happen. There is no public endpoint to scan. No IP address that appears on a watchboard. No credentials sitting out on the internet.
Cloud-hosted AI instances are discoverable, scannable, and fully exposed the moment they are misconfigured. One firewall rule, one forgotten port, one default password, and your entire AI setup, including prompts, documents, and credentials, is out in the open. This is not an edge case. At 390,000 instances, it is a pattern.
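To make concrete how little effort “scanning” actually takes, here is a minimal sketch in Python. It assumes an Ollama-style server left on its default port 11434, whose /api/tags endpoint lists installed models without authentication; the host address is a documentation placeholder, not a real target, and a probe like this should only ever be pointed at systems you are authorized to test.

```python
import requests

# Placeholder address from the TEST-NET-3 documentation range;
# substitute a host you are explicitly authorized to test.
HOST = "203.0.113.10"

# Ollama listens on port 11434 by default. GET /api/tags lists the
# installed models and requires no authentication out of the box.
url = f"http://{HOST}:11434/api/tags"

try:
    resp = requests.get(url, timeout=3)
    if resp.ok:
        models = [m["name"] for m in resp.json().get("models", [])]
        print(f"Open instance at {HOST}, models exposed: {models}")
except requests.RequestException:
    print(f"{HOST}: not reachable (or properly firewalled)")
```

One forgotten firewall rule is the difference between the two branches of this script, which is exactly why watchboards can count exposed instances by the hundreds of thousands.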
Anyone who thinks this only happens in “experimental” cases like OpenClaw should take a look at McKinsey’s internal AI tool, “Lilli”: according to The Stack, security researchers in 2026 used unauthenticated endpoints and a SQL injection flaw to access millions of chat logs, hundreds of thousands of private files, and internal RAG documentation. McKinsey confirmed the vulnerability, while also stating that it was fixed within hours and that the firm found no evidence of unauthorized access to client data [19][20]. Once again, the issue was not “AI” in the abstract, but classic application-security failures around an AI-adjacent system.
Europe and Switzerland: Regulation as a Tailwind Without the Clouds
The European and Swiss regulatory landscape confirms what the technology already makes clear: data belongs where you can control it.
NIS2 and the EU AI Act
The NIS2 Directive [5] creates an EU-wide cybersecurity framework for critical sectors. It requires systematic risk management, supply-chain controls, and mandatory reporting for significant incidents. One point matters in particular: the directive explicitly emphasizes management accountability. Executives can be held personally responsible.
The EU AI Act [6][7] entered into force on August 1, 2024 and becomes broadly applicable on August 2, 2026. Governance obligations for general-purpose AI models have applied since August 2, 2025 (Art. 53). Anyone operating AI systems faces transparency and documentation duties, and those duties are much easier to meet when you know exactly where your data is and who can access it.
Switzerland: Reporting Duties and Fines
In Switzerland, cyberattacks on critical infrastructure must be reported to the BACS within 24 hours of discovery; the obligation has applied since April 1, 2025 [8]. Since October 1, 2025, failure to report can trigger fines of up to CHF 100,000 [9].
At the same time, the FDPIC [10] has clarified breach-notification duties under the revised Swiss Data Protection Act, including how to assess a “likely high risk” and how responsibilities are split between controllers and processors.
And this is the catch: even if a cloud provider promises “Swiss data residency” or an “EU region,” sub-processors, telemetry pipelines, CDN services, and support access from third countries can still create cross-border transfers. On-premise removes that complexity. Your data stays physically and legally inside your domain.
Switzerland is not currently pursuing a single comprehensive AI law. Instead, it is relying on sector-specific rules and plans to ratify the Council of Europe convention on AI. In practice, that means anyone operating AI in Switzerland still needs to keep both Swiss and, through extraterritorial effects, European requirements in view. On-premise simplifies both.
Why Anonymization Is Not Enough
Proxy services and browser extensions promise to “anonymize” prompts before they are sent to cloud AI. The problem is simple: you cannot anonymize semantics. In Opinion 28/2024, the EDPB made clear that AI models trained on personal data cannot automatically be treated as anonymous [11].
Imagine an employee typing this into an AI chat:
- “My client in Zurich has a tax back-payment of 2.3 million.”
- “The patient in room 412 is showing symptoms of a rare autoimmune disease.”
- “We are planning to acquire the competitor by Q3.”
No names appear, but anyone who knows the context can connect the dots. Pseudonymization protects against easy attribution, not against semantic reconstruction. OWASP ranks “Sensitive Information Disclosure” as the second most important LLM risk (LLM02:2025) [12]. Germany’s BSI explicitly warns about data exfiltration through model outputs and tool connectors [14]. At Samsung, proprietary semiconductor source code and meeting notes were fed into ChatGPT in 2023 within a span of just 20 days, with no way to take them back [13].
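How thin the protection of pattern-based redaction is can be shown in a few lines. The sketch below uses deliberately naive name and e-mail patterns as stand-ins for what many anonymization proxies do; the patterns and the example prompt are illustrative, not any real product’s pipeline.

```python
import re

# Naive redaction rules of the kind "anonymizing" proxies apply before
# a prompt leaves the network (illustrative, not a production pipeline).
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def anonymize(prompt: str) -> str:
    prompt = NAME_PATTERN.sub("[NAME]", prompt)
    return EMAIL_PATTERN.sub("[EMAIL]", prompt)

print(anonymize("My client in Zurich has a tax back-payment of 2.3 million."))
# Output: My client in Zurich has a tax back-payment of 2.3 million.
# There is no name and no e-mail address to redact, so the prompt passes
# through unchanged. Location, role, and the exact amount survive: the
# identifying information is semantic, not lexical.
```

No redaction rule fires, yet the prompt is still identifying to anyone who knows the context.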
If the prompt never leaves the building, it does not need to be anonymized in the first place.
Zero Trust, or: Trust No One but Yourself
“Security is a process, not a product.”
- Bruce Schneier (2000) [16]
The decisive question in AI security is this: who do you trust with the execution environment?
In security architecture, this is described as the Trusted Computing Base (TCB), the total set of components that must be trusted for the system to remain secure. The smaller the TCB, the safer the system.
| Dimension | Cloud | On-Premise |
|---|---|---|
| Trusted Computing Base | Large: your admins + provider layers + hypervisor + multi-tenancy | Small: your admins + your hypervisor + dedicated hardware |
| Multi-tenancy | Structural; isolation depends on hypervisor and control plane | None; dedicated hardware removes the risk |
| Data during processing | Plaintext in RAM/VRAM; TCB includes provider layers | Plaintext in RAM/VRAM; TCB stays within your own domain |
| Privileged access | Tenant admins + provider operations = larger insider surface | Only your own admins with JIT access and MFA |
| Incident forensics | Dependent on provider log access and contract boundaries | Full chain of custody under your control |
Neither cloud nor on-premise can fully protect “data in use” cryptographically today. During execution, data still exists in plaintext. So the strategic question is not “which encryption?,” but “who has access to the execution environment?”
In the cloud, the answer is: you, your admins, the cloud provider, its admins, its sub-processors, and every jurisdiction that can compel access. On-premise: you and your admins. Full stop.
BSI [15] and OWASP [12] both point to prompt injection, data exfiltration, and supply-chain compromise as the dominant LLM threats. All of them are structurally reduced by a smaller TCB and full network control, both of which are core strengths of an on-premise deployment.
In an Air-Gapped Environment, There Are No Clouds
In an isolated environment, there are no cloud providers. No sub-processors. No cross-border data transfers. No leaked credentials on watchboards. No free models paid for with your prompts. No jurisdiction questions. No “shared responsibility” model where, in the worst moment, nobody is actually responsible.
There is only your hardware, your network, your models, your data. Zero Trust in its purest form: no data leakage, no trust assumptions, no third parties.
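Whether an environment is actually isolated is something you can verify rather than assert. The following minimal sketch, run from inside the AI host, attempts outbound connections that a correctly air-gapped network must refuse; the probe targets are illustrative placeholders and should be adapted to your own egress policy.

```python
import socket

# Outbound probes that an air-gapped network must refuse.
# Targets are illustrative; adapt them to your own egress policy.
PROBES = [
    ("1.1.1.1", 443),  # arbitrary public HTTPS endpoint
    ("8.8.8.8", 53),   # public DNS resolver
]

def egress_blocked(host: str, port: int, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # connection succeeded: egress is NOT blocked
    except OSError:
        return True  # refused or timed out: the expected result

for host, port in PROBES:
    verdict = "blocked (good)" if egress_blocked(host, port) else "OPEN (policy violation)"
    print(f"{host}:{port} -> {verdict}")
```

A check like this belongs in the routine monitoring of an isolated deployment: the moment any probe comes back open, the air gap is an assumption, not a fact.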
References
[1] OpenRouter - model overview (sorted by popularity). https://openrouter.ai/models?order=most-popular
[2] StepFun: Step 3.5 Flash (free) on OpenRouter. https://openrouter.ai/stepfun/step-3.5-flash:free
[3] Andrew Lewis, MetaFilter (2010). Core idea traced back to Richard Serra and Carlota Fay Schoolman, Television Delivers People (1973). See also Quote Investigator: https://quoteinvestigator.com/2017/07/16/product/
[4] OpenClaw Exposure Watchboard. https://openclaw.allegro.earth/
[5] NIS2 Directive (EU) 2022/2555 of the European Parliament and of the Council. https://eur-lex.europa.eu/eli/dir/2022/2555
[6] EU AI Act - Art. 53: obligations for providers of general-purpose AI models. https://ai-act-service-desk.ec.europa.eu/en/ai-act/article-53
[7] EU AI Act - timeline and applicability. European Commission. https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
[8] BACS/NCSC - reporting obligations for cyberattacks on critical infrastructure (Switzerland). https://www.ncsc.admin.ch/ncsc/de/home/meldepflicht/meldepflicht-info.html
[9] BACS/NCSC - six months of mandatory reporting for cyberattacks: interim assessment. https://www.ncsc.admin.ch/ncsc/de/home/aktuell/im-fokus/2025/meldepflicht-6-monate.html
[10] FDPIC - notification of personal data breaches under the revised Swiss Data Protection Act. https://www.edoeb.admin.ch/
[11] European Data Protection Board (EDPB) - Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models (December 17, 2024). https://www.edpb.europa.eu/our-work-tools/our-documents/opinion-board-art-64/opinion-282024-certain-data-protection-aspects_en
[12] OWASP - Top 10 for LLM Applications 2025, LLM02: Sensitive Information Disclosure. https://genai.owasp.org/llmrisk/llm02-insecure-output-handling/
[13] Samsung ChatGPT data leak (2023). The Register. https://www.theregister.com/2023/04/06/samsung_reportedly_leaked_its_own/
[14] German Federal Office for Information Security (BSI) - Generative AI Models: Opportunities and Risks for Industry and Public Authorities, Version 2.0 (January 2025). https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/KI/Generative_KI-Modelle.pdf
[15] German Federal Office for Information Security (BSI) - Evasion Attacks on LLMs: Countermeasures. https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/KI/Evasion-Angriffe_auf_LLMs-Gegenmassnahmen.pdf
[16] Bruce Schneier - The Process of Security (2000). https://www.schneier.com/essays/archives/2000/04/the_process_of_secur.html
[17] ENISA - Cloud Cybersecurity Market Analysis. https://www.enisa.europa.eu/sites/default/files/publications/Cloud%20Cybersecurity%20Market%20Analysis.pdf
[18] GDPR - official text. EUR-Lex. https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32016R0679
[19] The Stack - Startup’s agent hacked McKinsey AI - exposing huge volumes of sensitive data (March 12, 2026). https://www.thestack.technology/mckinsey-ai-agent-hacked-lilli/
[20] McKinsey & Company - Statement on strengthening safeguards within the Lilli tool. https://www.mckinsey.com/about-us/media/statement-on-strengthening-safeguards-within-the-lilli-tool