Oracle 19c RAC: Real Application Clusters In Depth
Oracle Real Application Clusters (RAC) is the gold standard for active-active database high availability. After deploying and managing 19c RAC clusters for banking, pharma, and software-product customers across multiple countries, I can say with confidence: when configured correctly, RAC delivers a near-zero-downtime architecture that few other database technologies match. This guide is the comprehensive technical walkthrough I wish had existed when I first started with RAC.
1. What Is Oracle RAC and Why Use It?
Oracle RAC allows multiple Oracle instances on different physical/virtual servers to access a single shared database. Each instance — running on its own node — has its own memory (SGA), background processes, and resources, but they all read and write to the same set of datafiles stored on shared storage (ASM disk groups).
Why deploy RAC?
- High Availability: If one node fails, surviving nodes continue serving the database transparently
- Scalability: Add more nodes to handle increased load
- Load Balancing: Workload distributed automatically via SCAN listeners and services
- Zero-downtime patching: Rolling patches across nodes one at a time
- Hardware utilization: All nodes are active — no idle standby waste
2. RAC Architecture: The Big Picture
A typical 2-node Oracle 19c RAC has these components:
- Two or more nodes (physical servers or VMs) running the same OS
- Shared Storage — SAN or NAS visible from all nodes
- Private Interconnect — dedicated network (10 GbE+) between nodes for cache fusion
- Public Network — for client connections and SCAN
- Oracle Grid Infrastructure (GI) — Clusterware + ASM
- Oracle Database 19c — RAC option installed
3. Oracle Grid Infrastructure: The Foundation
Before you can install RAC, you must install Oracle Grid Infrastructure 19c. GI provides two critical services:
3.1 Oracle Clusterware (CRS)
The clusterware is the software layer that lets multiple servers act as a single cluster. Key components:
- Cluster Ready Services (CRS): The master daemon coordinating cluster resources
- Cluster Synchronization Services (CSS): Monitors node membership using voting disks
- Event Manager (EVM): Publishes cluster events
- Oracle Notification Service (ONS): Distributes alerts to clients
- ohasd: The first daemon to start; bootstraps everything else
Cluster status is checked with:
crsctl check cluster -all
crsctl status resource -t
crsctl status server
olsnodes -n -i -s
3.2 Automatic Storage Management (ASM)
ASM is Oracle's volume manager and file system for database files. It pools shared disks into disk groups with automatic striping, mirroring, and rebalancing. Critical disk groups in a typical RAC:
- +OCR/VOTING: Cluster registry and voting disks (NORMAL redundancy = 3 disks minimum)
- +DATA: Datafiles, controlfiles, online redo logs
- +RECO: Fast Recovery Area — archive logs, RMAN backups, flashback logs
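As a sketch, the DATA and RECO groups above could be created from the ASM instance like this. The disk paths, group attributes, and redundancy level here are illustrative assumptions, not values from a real cluster; EXTERNAL redundancy presumes the SAN already mirrors, so use NORMAL or HIGH if ASM should do the mirroring:

```sql
-- Hypothetical disk paths; on a real cluster, list candidates first with:
--   SELECT path FROM v$asm_disk WHERE header_status = 'CANDIDATE';
CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
  DISK '/dev/oracleasm/disks/DATA01', '/dev/oracleasm/disks/DATA02'
  ATTRIBUTE 'compatible.asm' = '19.0', 'compatible.rdbms' = '19.0';

CREATE DISKGROUP RECO EXTERNAL REDUNDANCY
  DISK '/dev/oracleasm/disks/RECO01'
  ATTRIBUTE 'compatible.asm' = '19.0', 'compatible.rdbms' = '19.0';
```

Run as SYSASM against the ASM instance (sqlplus / as sysasm).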
4. SCAN: The Single Client Access Name
SCAN is one of the most elegant features of Oracle RAC. Instead of clients knowing every node's IP, they connect to a single DNS name (e.g., prod-scan.company.com) that resolves to 3 SCAN IPs round-robin. Three SCAN listeners run across the cluster nodes, accepting client connections and routing them to the appropriate node based on service registration and load.
Benefits: clients don't need configuration changes when nodes are added/removed. Add a node? It registers with SCAN automatically.
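For illustration, a client-side tnsnames.ora entry pointing at the SCAN might look like the following. The alias and service name (PROD, OLTP) are assumptions; only the SCAN host matches the example above. DNS round-robin across the three SCAN IPs and listener-level load balancing happen behind this single entry:

```
PROD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = prod-scan.company.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = OLTP)
    )
  )
```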
5. Voting Disks & OCR
- Voting Disks: Used for cluster membership decisions. If a node loses access to the majority of voting disks, it's evicted (rebooted) to prevent split-brain. Minimum 3 voting disks for NORMAL redundancy.
- Oracle Cluster Registry (OCR): Stores cluster configuration metadata — node list, services, resources. Backed up automatically every 4 hours.
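To inspect both from a live cluster node (a sketch to run from the Grid home; ocrcheck needs root for the full integrity check):

```shell
crsctl query css votedisk    # list voting files and the ASM disks backing them
ocrcheck                     # verify OCR integrity and show its location
ocrconfig -showbackup        # list the automatic 4-hourly OCR backups
```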
6. Cache Fusion: How RAC Maintains Consistency
This is the magic of RAC. When Node A modifies a block, Node B needs the latest version. Instead of writing to disk then reading, RAC ships the block directly between instances over the private interconnect using Cache Fusion. The Global Cache Service (GCS) coordinates ownership; the Global Enqueue Service (GES) manages locks across nodes.
This is why low-latency private interconnect is non-negotiable. 10 GbE minimum, ideally 25 GbE or InfiniBand for high-throughput systems.
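One way to gauge Cache Fusion health from SQL, as a sketch (healthy thresholds vary by workload and interconnect), is to compare average global-cache wait times across instances:

```sql
-- Average wait per gc event in microseconds, across all instances
SELECT inst_id, event, total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0)) AS avg_wait_us
FROM   gv$system_event
WHERE  event LIKE 'gc %'
ORDER  BY time_waited_micro DESC;
```

Consistently high averages on the block-transfer events are a signal to look at interconnect latency before anything else.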
7. Installing Oracle 19c RAC — High-Level Steps
- Prerequisites: OS (Oracle Linux 7/8), shared storage configured, time sync (chrony/NTP), DNS for SCAN, swap, hugepages
- Configure SSH user equivalence between nodes for the oracle and grid users
- Run cluster verification: ./runcluvfy.sh stage -pre crsinst -n node1,node2 -fixup
- Install Grid Infrastructure 19c via OUI (gridSetup.sh), choosing "Configure Oracle Grid Infrastructure for a New Cluster"
- Create ASM disk groups for OCR/VOTING, DATA, RECO
- Install Oracle Database 19c software only, as the oracle user
- Create the RAC database using DBCA, choosing the RAC database template
- Verify cluster health with crsctl status resource -t
- Apply the latest Release Update patch (always run the latest RU)
8. Essential RAC Management Commands
Master these — they're your daily companions:
# Cluster health
crsctl check cluster -all
crsctl status resource -t
# Database management
srvctl status database -d MYDB
srvctl start database -d MYDB
srvctl stop database -d MYDB -o immediate
srvctl config database -d MYDB
# Instance management
srvctl start instance -d MYDB -i MYDB1
srvctl stop instance -d MYDB -i MYDB2 -o transactional
# Services
srvctl add service -d MYDB -s OLTP -preferred MYDB1,MYDB2
srvctl start service -d MYDB -s OLTP
srvctl status service -d MYDB
# SCAN and listeners
srvctl status scan
srvctl status scan_listener
lsnrctl status LISTENER_SCAN1
# ASM
asmcmd lsdg
sqlplus / as sysasm
SELECT name, state, total_mb, free_mb FROM v$asm_diskgroup;
9. Patching RAC: Rolling Updates
One of RAC's biggest wins — apply patches one node at a time with zero downtime:
- Drain connections from node 1 (relocate services)
- Stop services and instance on node 1
- Apply OPatch / OPatchAuto on node 1
- Start node 1, relocate services back
- Repeat for node 2 (and others)
Always test patches in non-prod RAC first. Quarterly Release Updates (RU) are recommended.
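The per-node sequence above, sketched as commands. The Grid home path, patch staging directory, and database/service names are placeholders, not real values; note that opatchauto itself stops and restarts the full stack on the node it is patching:

```shell
# On node 1, as root (paths are placeholders)
export PATH=$PATH:/u01/app/19.0.0/grid/OPatch

# Drain: move the service to the surviving instance first
srvctl relocate service -d MYDB -s OLTP -oldinst MYDB1 -newinst MYDB2

# opatchauto stops the stack on this node, patches GI and DB homes, restarts
opatchauto apply /stage/ru_patch

# Verify before moving on to node 2
crsctl check cluster
srvctl status database -d MYDB
```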
10. RAC Best Practices from 18 Years of Production
- Always use ASM — not raw devices or NFS for datafiles. ASM Filter Driver (AFD) is preferred over ASMLib in 19c.
- Separate networks: Public + Private (interconnect) on different switches
- Use private interconnect bonding for redundancy (HAIP gives you this)
- Hugepages: Configure for the entire SGA — improves performance significantly
- Time sync: Mandatory. Drift causes weird evictions. Use chrony.
- Use services: Never connect to instances directly. Services give you transparent failover (TAF/FAN/AC).
- Monitor cluster events: Set up Enterprise Manager or custom alerting on crsctl events and cluster_interconnects health.
- Document your topology: Node names, IPs, VIPs, SCAN, disk groups, services; keep current diagrams.
- Practice failover drills quarterly. Confidence comes from rehearsal.
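For the hugepages item above, a minimal sizing sketch. It assumes the default 2 MB hugepage size on x86_64 (verify with grep Hugepagesize /proc/meminfo) and a hypothetical 32 GB SGA; adjust sga_mb to the real total SGA on the node:

```shell
# Assumed values: total SGA on this node (MB) and the 2 MB default hugepage size
sga_mb=32768
hugepage_mb=2
nr_hugepages=$(( sga_mb / hugepage_mb + 1 ))   # one extra page of headroom
echo "vm.nr_hugepages = $nr_hugepages"
# Persist the value in /etc/sysctl.conf, apply with sysctl -p, and consider
# setting use_large_pages=ONLY so the instance fails fast if pages run short.
```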
11. Common RAC Issues & How to Fix Them
- Node Eviction: Usually a network or storage issue. Check /var/log/messages, ocssd.log, and network ping times.
- Slow Cache Fusion: Check interconnect latency (oradebug ipc). Look for "gc cr block 2-way" wait events.
- Voting Disk Loss: If quorum is lost, the cluster shuts down. Restore from OCR backup with crsctl replace votedisk.
- ASM Imbalanced: Run ALTER DISKGROUP DATA REBALANCE POWER 8; during a low-load window.
- OPatch failures: Take an opatch lsinventory snapshot and back up ORACLE_HOME before patching.
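A running ASM rebalance can be watched from SQL; a query like the following (run against any instance) shows progress and ASM's remaining-time estimate:

```sql
-- An empty result means no rebalance is currently running
SELECT group_number, operation, state, power, sofar, est_work, est_minutes
FROM   gv$asm_operation;
```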
12. When NOT to Use RAC
RAC is powerful but not always the right answer:
- Single-app workloads that don't need HA — overkill
- Limited budget — RAC requires shared storage, more licenses, complex setup
- Without a skilled DBA team: RAC needs expertise to operate safely
- For DR-only — use Data Guard instead
Best fit: mission-critical OLTP systems where downtime costs $$$ per minute — banks, telcos, airlines, ERP backends.
Final Thoughts
Oracle 19c RAC is enterprise-grade software for enterprise-grade workloads. The complexity is real — but so is the payoff. A properly-architected, well-monitored RAC cluster delivers years of uninterrupted service. The mistakes happen when teams treat RAC like "just two databases" — it isn't. It's a single distributed system with its own behaviors, failure modes, and recovery patterns.
If your organization is planning a RAC deployment, migration, or troubleshooting an existing cluster, let's talk. I've architected and operated production RAC clusters across multiple industries — banking, pharma, software, manufacturing — and bring the rare combination of practical scars and architectural clarity that RAC demands.
🔗 Need RAC Setup or Support?
Oracle RAC configuration, troubleshooting, performance tuning, and migration. Free 30-minute consultation.