01 · The Broken Migration
Imagine this scenario: Your security audit team has just handed down a new mandate. The security posture must be hardened immediately. No more public IP addresses on application servers. No more direct exposure to the public internet.
As a cloud infrastructure engineer or architect, you know exactly what to do. You spin up a new set of private subnets within your Oracle Cloud Infrastructure (OCI) Virtual Cloud Network (VCN). You migrate your compute instances from the old public subnets into these pristine, isolated environments. You tear down the Internet Gateway routes, strip away the public IPs, and look at your OCI console with a profound sense of accomplishment.
Your security team is thrilled. Your compliance officer signs off on the audit. Your workloads are perfectly insulated from external attackers.
There is only one problem: your applications are also completely broken.
- Automated yum and apt package updates fail with connection timeouts
- Datadog and OCI Management Agents stop reporting, and monitoring dashboards go blank
- Node.js and Python apps throw fatal errors pulling dependencies from public registries during autoscaling
- Database backups to OCI Object Storage pile up locally as connections time out
- OKE worker nodes cannot pull container images from external registries
The workloads are private, yes — but they are also entirely cut off from the critical services they need to function.
One of the most common OCI networking misunderstandings is assuming that "private" means "completely disconnected." In production cloud architecture, absolute isolation is almost always an illusion. The real architectural challenge isn't removing internet access entirely; it's providing targeted, secure, and controlled outbound access without exposing your workloads to inbound threats.
02 · If an Instance Is Private, Why Does It Need External Access?
When we design networks on paper, it is easy to fall into the trap of thinking our application servers exist in a self-contained vacuum. We imagine a neat, linear flow where a user talks to a public load balancer, the load balancer talks to an application server in a private subnet, and the application server talks to a backend database.
In the real world, modern application architectures are deeply interdependent ecosystems. A compute instance running a modern enterprise application rarely operates in complete isolation.
Systems Hygiene
Linux and Windows instances need regular security patches. Without access to public repository mirrors for Oracle Linux, Red Hat, or Ubuntu, unpatched vulnerabilities pile up quickly.
Application Dependencies
CI/CD pipelines and init scripts pull from npm, PyPI, Maven, or NuGet. Autoscaling instances that cannot reach these registries fail under load.
Operational Tooling
CrowdStrike, Fluentd, Datadog, New Relic, and similar SaaS platforms all require outbound pathways to send telemetry to their cloud endpoints.
External APIs
Stripe payments, SendGrid email, Twilio SMS, and financial data APIs all require your private instance to initiate outbound requests to the public internet.
03 · The Mistake Many Teams Make
The core mistake many engineering teams make is treating network security as a binary switch: either an instance is on the public internet and completely exposed, or it is in a private subnet and completely blind.
When you configure a network with zero outbound capabilities, you inadvertently trade an external security risk for a severe operational risk. You prevent external attackers from reaching your servers, but you also prevent your servers from updating themselves against those very same attackers.
When production workloads break due to this total isolation, frustrated teams often implement dangerous workarounds. Out of desperation to fix a breaking production deployment, engineers might move instances back into public subnets, assign temporary public IPs, or open wide, unmonitored holes in their firewalls.
The goal of a senior cloud architect is to avoid these reactionary fixes by designing a network that acknowledges external dependencies from day one — allowing instances to reach out to the world while ensuring the world can never reach back in uninvited.
04 · What Actually Happens When a Private Instance Tries to Reach the Internet?
To understand why a private instance cannot natively talk to the outside world, we need to trace the actual mechanics of a network packet beneath the OCI console.
Let's trace a packet from a compute instance in a private subnet (10.0.2.0/24) with private IP 10.0.2.15, attempting to reach a public package repository at 198.51.100.45.
Step 1 — Instance forwards to the VCN router
The instance sees the destination is external and forwards the packet to its default gateway — the virtual router provided by the OCI VCN.
Step 2 — Route table evaluation
The VCN evaluates the Route Table associated with that private subnet. Under a strict "no internet" design, the route table likely contains only a local route for 10.0.0.0/16. With no 0.0.0.0/0 entry, the virtual router drops the packet and the connection times out.
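The route evaluation in Step 2 can be sketched with Python's standard `ipaddress` module. This is a simplified model, not OCI's implementation: the `LOCAL_ONLY` list and the `has_route` helper are illustrative names, and the addresses are the ones used in the walkthrough above.

```python
import ipaddress

# Route table of the private subnet under a strict "no internet" design:
# only the local route for the VCN CIDR, as described above.
LOCAL_ONLY = ["10.0.0.0/16"]


def has_route(destination, route_cidrs):
    """Return True if any route in the table covers the destination IP."""
    dest = ipaddress.ip_address(destination)
    return any(dest in ipaddress.ip_network(cidr) for cidr in route_cidrs)


for destination in ("10.0.1.20", "198.51.100.45"):
    if has_route(destination, LOCAL_ONLY):
        verdict = "forwarded"
    else:
        verdict = "dropped (no matching route)"
    print(destination, "->", verdict)
```

The address inside the VCN matches the local route, while the package repository matches nothing, which is the timeout the instance observes.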
Step 3 — The Internet Gateway cheat (that still fails)
What if you add a 0.0.0.0/0 route pointing to an Internet Gateway (IGW)? The packet still dies — for a different reason.
An OCI Internet Gateway uses a one-to-one NAT model: it translates between a VNIC's private IP and the public IP assigned to it. Your instance at 10.0.2.15 has no public IP, so the gateway has no address to translate the source to and the packet goes nowhere. Even if the packet did leave with its private source address, the external target would have no way of routing the response back to a private internal network.
Why This Confuses So Many Teams
Teams conflate internet connectivity with public exposure. They assume that to talk to the internet, an endpoint must be part of the internet. They overlook that packets on a connection flow in both directions, while the right to initiate that connection can be strictly one-way. Public exposure means random scanners can initiate connections to your instance. Internet connectivity means your instance can initiate connections to external services.
To break this deadlock, you need a specialized component that bridges the private and public worlds — dynamically replacing your private source IP with a public IP on the way out, tracking the response, and routing it back to the originating instance — while blocking any connection that didn't start from inside your network.
05 · What Problem Does a NAT Gateway Solve?
This exact architectural challenge is why Oracle Cloud Infrastructure provides a managed NAT Gateway. NAT stands for Network Address Translation, and in the context of OCI, it acts as a secure, highly available, unidirectional gateway for your private subnets.
When you deploy an OCI NAT Gateway, Oracle automatically provisions a managed network appliance at the edge of your VCN, assigned a public IP address from Oracle's pool (or a reserved public IP you control).
Figure 1 · NAT Gateway Source NAT (SNAT) — private instance to public internet
Packet flow with NAT Gateway configured
Your private instance (10.0.2.15) attempts to reach the public package repository (198.51.100.45). The private subnet's Route Table is configured with a 0.0.0.0/0 rule that targets the NAT Gateway:
- The OCI virtual router matches 0.0.0.0/0 and forwards the packet to the NAT Gateway.
- The NAT Gateway performs Source NAT (SNAT): it strips private IP 10.0.2.15 and replaces it with public IP 203.0.113.88, recording the translation in its state table.
- The modified packet travels over the public internet. The repository sees a valid request from 203.0.113.88.
- When the response arrives, the NAT Gateway matches it against the active session, replaces the destination with 10.0.2.15, and routes it back through the VCN.
- To your compute instance, the transaction is completely transparent. Updates download successfully.
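The translation steps above can be modelled in a few lines. This is a toy sketch of stateful source NAT in general, not a description of how Oracle implements its gateway: the `NatGateway` class, the starting port 20000, and the ephemeral port 41000 are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NatGateway:
    public_ip: str
    next_port: int = 20000  # hypothetical start of the translated port range
    # State table: (public_port, remote_ip, remote_port) -> (private_ip, private_port)
    sessions: dict = field(default_factory=dict)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the private source to the public IP and record the session."""
        public_port = self.next_port
        self.next_port += 1
        self.sessions[(public_port, dst_ip, dst_port)] = (src_ip, src_port)
        return (self.public_ip, public_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """Map a reply back to the private instance, or None if unsolicited."""
        return self.sessions.get((dst_port, src_ip, src_port))


nat = NatGateway(public_ip="203.0.113.88")
sent = nat.outbound("10.0.2.15", 41000, "198.51.100.45", 443)
print(sent)                                        # what the repository sees
print(nat.inbound("198.51.100.45", 443, sent[1]))  # reply to the open session
print(nat.inbound("192.0.2.7", 443, sent[1]))      # unsolicited sender: no match
```

The last call returns `None` because no session was ever opened toward that sender, which is the property that keeps the private instance unreachable from outside.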
06 · The Security Benefit Most Teams Overlook
The profound security benefit of this architecture is its strictly outbound-only nature.
Because the NAT Gateway is stateful, it only allows packets into the VCN if they are a direct response to a connection explicitly initiated by an internal resource. If an external attacker types the NAT Gateway's public IP (203.0.113.88) into a port scanner and tries to launch an attack, the NAT Gateway drops the traffic immediately. There is no active session matching that incoming traffic in its state table, so the packet is rejected.
By routing your private instances through a NAT Gateway, you get a workable architectural compromise. Your instances gain the ability to pull patches, reach external APIs, and push metrics, yet nothing on the public internet can initiate a connection to them.
07 · Why Use the Internet for OCI Services When You Don't Have To?
Once teams discover the NAT Gateway, they often treat it as a universal hammer for every outbound networking nail. They configure a single NAT Gateway, point the 0.0.0.0/0 route of every private subnet toward it, and consider the job done.
This introduces a different architectural anti-pattern: routing traffic destined for Oracle's own native cloud services out over the public internet.
Your instances likely stream logs to OCI Logging, push metrics to OCI Monitoring, retrieve secrets from OCI Vault, and stream multi-gigabyte database backups to OCI Object Storage. By default, API endpoints for these services resolve to public IP addresses owned by Oracle. If your route table simply points all external traffic to a NAT Gateway, a 500 GB backup leaves through the NAT Gateway and reaches Object Storage through its public endpoint instead of a private path.
Choke Points
Routing terabytes through a single NAT Gateway creates unnecessary choke points, latency variability, and wasted throughput.
Compliance Risk
Sending sensitive backups and logs across public internet space is a red flag for HIPAA, PCI-DSS, and GDPR frameworks.
Data Sovereignty
In regulated industries, data must never traverse the public internet when an internal cloud path is available.
Routing OCI-to-OCI traffic over the internet is the cloud equivalent of leaving your office building, walking down the public street, and entering through the back door of the exact same building just to deliver a memo to the desk next to you.
08 · What Does a Service Gateway Actually Do?
To solve the inefficiency of routing OCI service traffic over the public internet, Oracle designed the Service Gateway — a secure, private, deterministic conduit between your private VCN subnets and the Oracle Services Network (OSN).
The OSN is a dedicated network space within each Oracle Cloud region that houses all of Oracle's platform and infrastructure services, including Object Storage, Autonomous Database, Key Management, Streaming, and Logging. Traffic routed through a Service Gateway never touches the public internet.
When your private instance needs to write a log or upload a backup, it resolves the public DNS name for that service. However, instead of matching a generic 0.0.0.0/0 route, the traffic matches a route rule you configure with a specific Service CIDR Label. The labels are regional, and take the form "All <region> Services in Oracle Services Network" or "OCI <region> Object Storage". Oracle dynamically maintains the underlying IP addresses behind the scenes.
The Service Gateway routes packets directly across Oracle's internal physical fabric into the requested service endpoint — maximum throughput, ultra-low deterministic latency, and total isolation from public transit networks.
For banking, healthcare, or government infrastructure, the Service Gateway is not a luxury — it is a hard compliance requirement. You can construct an entirely locked-down "dark network" VCN with zero paths to the public internet, yet still leverage Oracle's PaaS and storage capabilities. During an audit, you can demonstrate that data in Object Storage never traversed a public network wire during transit.
09 · NAT Gateway or Service Gateway — Which One Should You Use?
Choosing between a NAT Gateway and a Service Gateway is not an "either/or" proposition. In well-architected cloud environments, these two gateways are complementary components working side-by-side.
The decision comes down to one fundamental question: Who owns the destination endpoint?
If the destination is a service owned and operated by Oracle within your cloud region, it belongs on the Service Gateway. If the destination is an external third-party service or located on the broader public internet, it belongs on the NAT Gateway.
| Operational Requirement | Destination Owner | Gateway |
|---|---|---|
| OS package updates (yum, apt) | Public internet mirrors | NAT Gateway |
| External APIs (Stripe, SendGrid, Twilio) | Third-party SaaS | NAT Gateway |
| npm, PyPI, Maven, Docker Hub | Public registries | NAT Gateway |
| Monitoring agents (Datadog, New Relic) | Third-party SaaS | NAT Gateway |
| Database backups to Object Storage | Oracle (OCI) | Service Gateway |
| OCI Vault secrets & keys | Oracle (OCI) | Service Gateway |
| OCI Logging & Monitoring | Oracle (OCI) | Service Gateway |
| Autonomous Database exports | Oracle (OCI) | Service Gateway |
10 · Practical Coexistence Scenario
In a production environment, a typical application tier subnet requires both gateways simultaneously. To implement this correctly, you must construct a deliberate, multi-line Route Table assigned to your private subnet.
Figure 2 · Layered route table — Service Gateway for OCI services, NAT Gateway for everything else
Because routers operate on the principle of Longest Prefix Match (matching the most specific rule first), any packet destined for an OCI service matches the precise Service Label rule and heads through the Service Gateway. Any packet destined for a third-party API or OS patch mirror falls through to the more general 0.0.0.0/0 catch-all rule and exits through the NAT Gateway.
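A minimal sketch of Longest Prefix Match over such a layered table, using Python's standard `ipaddress` module. The `SERVICE_RANGE` value is a documentation-range placeholder standing in for whatever addresses sit behind a Service CIDR Label; it is an assumption for illustration, not a real Oracle range, and `next_hop` is a hypothetical helper.

```python
import ipaddress

# Hypothetical stand-in for the ranges behind a Service CIDR Label.
# Real ranges are maintained by Oracle and never typed into a route table.
SERVICE_RANGE = "192.0.2.0/24"

ROUTES = [
    (SERVICE_RANGE, "Service Gateway"),
    ("0.0.0.0/0", "NAT Gateway"),
]


def next_hop(destination, routes):
    """Pick the matching route with the longest prefix (most specific)."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None  # no route at all: the packet is dropped
    _, target = max(candidates, key=lambda c: c[0].prefixlen)
    return target


print(next_hop("192.0.2.10", ROUTES))     # inside the stand-in service range
print(next_hop("198.51.100.45", ROUTES))  # public package repository
```

Both destinations match 0.0.0.0/0, but only the first also matches the narrower /24, so it is the only one sent to the Service Gateway. The order of the rules in the list does not matter.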
By layering your routes this way, you optimize performance, eliminate unnecessary internet exposure, and ensure your application always has the exact pathway it needs.
11 · What Are the Most Common Connectivity Mistakes?
Over years of auditing production OCI environments, certain predictable architectural failures appear repeatedly:
- The "Lazy Public Subnet" Anti-Pattern: Placing app servers or databases in public subnets because it's easier, then relying on Security Lists alone. One accidental firewall change can expose an internal database port to the entire internet.
- The Blanket 0.0.0.0/0 NAT Gateway Route: Routing all external traffic through NAT while ignoring the Service Gateway. Large backups and log streams saturate the NAT Gateway, causing intermittent drops for critical business API calls.
- Shared Route Tables Across Diverse Subnets: A single route table for public, private, and database tiers destroys segmentation. Public subnets need IGW routes and private subnets need NAT, so sharing breaks one or the other.
- Forgetting the Return Path in Hybrid Networks: Pointing private subnet routes to a DRG for corporate ranges but forgetting NAT for internet-bound traffic creates asymmetrical routing loops that are notoriously painful to diagnose.
12 · What Does Oracle Actually Recommend?
To build a highly secure, resilient, production-grade network in OCI, adhere to Oracle's verified networking blueprints:
1. Enforce "Private by Default"
Every backend workload — application servers, microservices, databases, caching layers — must reside exclusively in private subnets. Public subnets are strictly reserved for edge devices: external Load Balancers, API Gateways, and secured Bastion hosts.
2. Segment Your Route Tables Intentionally
Never share route tables between infrastructure tiers:
| Subnet Tier | Route Configuration |
|---|---|
| Public Subnet | 0.0.0.0/0 → Internet Gateway (IGW) |
| Private Application Subnet | All OCI Services → Service Gateway (SGW); 0.0.0.0/0 → NAT Gateway (NATGW) |
| Isolated Database Subnet | All OCI Services → Service Gateway (SGW); zero internet routes |
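One way to keep this segmentation from drifting is to lint each tier's route rules against the table above. The sketch below is an assumption-heavy illustration: the tier names, the target abbreviations, and the `all-oci-services` label are placeholders chosen here, not OCI identifiers, and `check_tier` is a hypothetical helper.

```python
INTERNET = "0.0.0.0/0"

# Allowed target for the default route in each tier, following the table above.
# None means the tier must not carry a default route at all.
ALLOWED_DEFAULT_TARGET = {
    "public": "IGW",
    "private-app": "NATGW",
    "database": None,
}


def check_tier(tier, rules):
    """Return policy violations for one tier's (destination, target) rules."""
    problems = []
    allowed = ALLOWED_DEFAULT_TARGET[tier]
    for destination, target in rules:
        if destination != INTERNET:
            continue  # only the default route is checked in this sketch
        if allowed is None:
            problems.append(f"{tier}: default route to {target} is not allowed")
        elif target != allowed:
            problems.append(
                f"{tier}: default route must target {allowed}, found {target}"
            )
    return problems


# A single route table reused across tiers, as in the anti-pattern.
shared = [("all-oci-services", "SGW"), (INTERNET, "NATGW")]
print(check_tier("private-app", shared))  # no violations
print(check_tier("database", shared))     # the database tier gains an internet path
```

Reusing the application tier's table on the database tier is flagged immediately, which is the failure mode that sharing route tables tends to hide.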
3. Embrace Network Security Groups (NSGs)
NSGs apply granular firewall policies directly to individual VNICs, grouping instances by operational role rather than subnet location — preventing accidental exposure when subnets scale.
4. Enable VCN Flow Logs and Traffic Monitoring
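Enable VCN Flow Logs on the private subnets whose traffic leaves through your NAT and Service Gateways (flow logs attach to resources such as subnets and VNICs, not to the gateways themselves). This gives security teams an audit trail of accepted and rejected connections, allowing them to spot anomalous exfiltration or compromised instances communicating with malicious IPs.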
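As a rough illustration of the review that flow logs enable, the sketch below flags outbound destinations that fall outside an expected allowlist. The record layout (dicts with `src` and `dst` keys) is a simplified, hypothetical shape rather than the actual OCI flow log schema, and the allowlist and addresses are documentation-range placeholders.

```python
import ipaddress

VCN = ipaddress.ip_network("10.0.0.0/16")

# Hypothetical allowlist of external ranges the application should reach.
EXPECTED = [ipaddress.ip_network("198.51.100.0/24")]

# Hypothetical, already-parsed flow records.
RECORDS = [
    {"src": "10.0.2.15", "dst": "198.51.100.45"},  # package repository
    {"src": "10.0.2.15", "dst": "10.0.3.8"},       # traffic inside the VCN
    {"src": "10.0.2.15", "dst": "203.0.113.200"},  # not on the allowlist
]


def unexpected_destinations(records, expected, vcn=VCN):
    """Return the external destination IPs that are not on the allowlist."""
    flagged = set()
    for record in records:
        dest = ipaddress.ip_address(record["dst"])
        if dest in vcn:
            continue  # internal traffic is out of scope here
        if not any(dest in net for net in expected):
            flagged.add(record["dst"])
    return sorted(flagged)


print(unexpected_destinations(RECORDS, EXPECTED))
```

In practice the allowlist would come from the dependency inventory for the workload, and anything flagged would be a prompt for investigation rather than proof of compromise.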
13 · The Three Traffic Types Every OCI Architect Should Identify
Before you touch the OCI console or write a single line of Terraform, categorize every network interaction into three distinct vectors. Every packet leaving a compute instance falls into one of these buckets:
Figure 3 · Three traffic vectors — NAT Gateway, Service Gateway, and DRG
Type 1 · Public Internet
Third-party APIs, OS patch repositories, SaaS logging providers. Unpredictable external destinations.
Tool: NAT Gateway
Type 2 · OCI-Native Services
Object Storage, Vault, Monitoring, PaaS tools. Predictable, region-locked Oracle resources.
Tool: Service Gateway
Type 3 · On-Premises
Corporate data centers, legacy databases, local identity servers. Hybrid cloud connections.
Tool: DRG (VPN / FastConnect)
Force Type 2 traffic through a Type 1 architecture and you introduce latency and compliance gaps. Force Type 1 traffic through a Type 3 architecture and you clog corporate WAN connections with generic internet downloads. Clear identification of these three pillars is what separates amateur network designs from battle-tested enterprise architectures.
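The categorization exercise can start as a simple inventory grouped by destination owner. The sketch below assumes a hand-written dependency list; every entry in `DEPENDENCIES` and the owner labels are hypothetical examples, to be replaced with what the application actually calls.

```python
from collections import defaultdict

# Destination owner -> gateway, following the three traffic types above.
GATEWAY_FOR = {
    "public-internet": "NAT Gateway",
    "oci-service": "Service Gateway",
    "on-premises": "DRG",
}

# Hypothetical inventory of one application's outbound dependencies.
DEPENDENCIES = [
    ("yum mirror", "public-internet"),
    ("Stripe API", "public-internet"),
    ("Object Storage backups", "oci-service"),
    ("OCI Vault", "oci-service"),
    ("corporate LDAP", "on-premises"),
]


def plan(dependencies):
    """Group dependency names under the gateway that should carry them."""
    grouped = defaultdict(list)
    for name, owner in dependencies:
        grouped[GATEWAY_FOR[owner]].append(name)
    return dict(grouped)


for gateway, names in plan(DEPENDENCIES).items():
    print(f"{gateway}: {', '.join(names)}")
```

A dependency whose owner is missing from `GATEWAY_FOR` raises a `KeyError`, which is deliberate here: an uncategorized flow is a design question to answer before any route table is written.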
14 · The Short Version — 7 Connectivity Decisions
If you are looking for immediate architectural takeaways to apply to your production OCI environments today:
- Private does not mean completely disconnected. Designing an absolute network vacuum creates severe operational liabilities that inevitably break production workloads.
- Use a NAT Gateway for public internet egress. It grants private instances the ability to initiate vital outbound connections while remaining invisible to inbound scanners.
- Use a Service Gateway for OCI-native services. Never send internal logs, metrics, or database backups through the public internet via NAT when Oracle's internal fabric is available.
- Never assign public IPs to application or database servers. Keep compute workloads in private subnets; reserve public subnets exclusively for edge entry points like Load Balancers.
- Enforce Longest Prefix Match in your routing. Layer route tables so specific Oracle service traffic splits cleanly from general public internet traffic.
- Segregate route tables by functional tier. Public, private application, and database tiers should never share a single generic route table.
- Plan outbound connectivity vectors on day one. Do not wait for a production scale-up event or a failed security audit to figure out how private workloads will download dependencies.
The architectural compromise that actually works
Hardening your OCI security posture by moving workloads into private subnets is the right decision. But "private" is not a synonym for "offline." Modern enterprise applications depend on patches, external APIs, SaaS telemetry, and OCI-native services — and each dependency requires a deliberate, secure outbound path.
The NAT Gateway and Service Gateway are not competing options. They are complementary tools that, when layered correctly in your route tables, give you the best of both worlds: workloads that are invisible to the public internet yet fully operational in production.