
Oracle 19c Data Guard: The Complete Disaster Recovery Guide

When my pharma client's primary data center suffered a catastrophic flooding incident in 2023, the entire business stayed online — because Data Guard had been doing its quiet job for three years before disaster struck. Oracle Data Guard is the difference between business continuity and business collapse. In this guide, I'll share everything you need to architect, deploy, and operate Oracle 19c Data Guard for real-world disaster recovery.

1. What Is Oracle Data Guard?

Oracle Data Guard is a feature of Oracle Database Enterprise Edition that maintains one or more standby databases as transactionally-consistent copies of a primary database. The standby is continuously updated by shipping redo data from the primary across the network. If the primary fails, the standby is activated — minimizing data loss and downtime.

The big idea: RAC protects against node failures. Data Guard protects against site/data-center failures.

2. Why You Absolutely Need Data Guard

  • Disaster Recovery: Survive data center fires, floods, ransomware, regional outages
  • Data Protection: Redo is validated before it is applied, so block corruption on the primary does not silently propagate to the standby the way it can creep into backups
  • High Availability: Sub-minute failover capability
  • Reporting offload: Active Data Guard enables read-only queries on standby — relieves primary
  • Migration: Move databases across platforms/data centers with minimal downtime
  • Rolling Upgrades: Transient logical standby enables near-zero-downtime version upgrades

3. Types of Standby Databases

3.1 Physical Standby

An exact block-by-block copy of the primary. Uses Redo Apply (Managed Recovery Process - MRP) to apply redo data. Most common type. Identical schema, identical data, ready for failover.

3.2 Logical Standby

Uses SQL Apply (LogMiner-based) to convert redo into SQL statements and execute them. Allows additional indexes, materialized views, and even some schema changes. Open for read/write on objects not maintained by Data Guard. Used for reporting + selective replication.

3.3 Snapshot Standby

A physical standby temporarily converted to read/write for testing. Receives redo but doesn't apply it. Convert back to physical standby anytime — all test changes discarded. Excellent for QA/UAT against production-fresh data.
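
If the broker (section 7) is in place, the round trip is two commands. A quick sketch, assuming the standby is named 'PROD_DR' as in the later examples:

DGMGRL> CONVERT DATABASE 'PROD_DR' TO SNAPSHOT STANDBY;
DGMGRL> CONVERT DATABASE 'PROD_DR' TO PHYSICAL STANDBY;

Converting back flashes the standby to its guaranteed restore point, discards the test changes, and applies the redo accumulated in the meantime.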

3.4 Active Data Guard (Paid Option)

A physical standby that is open in read-only mode while still receiving and applying redo. Game-changer for offloading reporting workloads. Also enables: standby block change tracking, real-time query, automatic block repair, Far Sync instances.
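
A minimal sketch of turning an existing physical standby into Active Data Guard (run on the standby; requires the Active Data Guard license):

-- On the standby: stop apply, open read-only, restart apply
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

With redo apply running against an open database, queries on the standby see near-real-time data.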

4. Data Guard Protection Modes

Choose based on your tolerance for data loss vs. performance:

4.1 Maximum Performance (default)

  • Redo shipped asynchronously (ASYNC)
  • Primary commits before standby acknowledgment
  • Potential for small data loss (seconds) in disaster
  • Negligible performance impact on the primary
  • Best for: Most production workloads

4.2 Maximum Availability

  • Redo shipped synchronously (SYNC)
  • Primary waits for standby acknowledgment before commit
  • Zero data loss when standby is in sync
  • If the standby is unreachable, the primary keeps running and temporarily behaves as in Maximum Performance until the standby resynchronizes
  • Best for: Financial/regulated workloads needing zero data loss

4.3 Maximum Protection

  • Synchronous + strict
  • If standby unreachable, primary SHUTS DOWN to prevent any data loss
  • Rarely used — operational risk too high
  • Best for: Defense/regulated environments where data loss is unacceptable at any cost
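
Switching modes is a broker operation (section 7). A sketch of moving to Maximum Availability, assuming the standby is 'PROD_DR'; redo transport must be SYNC before the mode can be raised:

DGMGRL> EDIT DATABASE 'PROD_DR' SET PROPERTY 'LogXptMode'='SYNC';
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;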

5. Data Guard Architecture

Key processes:

  • LGWR (Primary): Writes redo to local redo logs
  • LNS (Network Server): Ships redo to standby (SYNC or ASYNC)
  • RFS (Remote File Server): Receives redo on standby
  • Standby Redo Logs (SRL): Where RFS writes incoming redo
  • MRP (Managed Recovery Process): Applies redo on physical standby
  • FSFO Observer: External monitor for automatic failover
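
To watch these processes at work on the standby, a quick check (v$managed_standby is still available in 19c, though v$dataguard_process is the newer view):

SELECT process, status, thread#, sequence#, block#
FROM v$managed_standby
WHERE process IN ('RFS', 'MRP0');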

6. Setting Up Data Guard — Step-by-Step

High-level procedure to add a physical standby:

6.1 Preparation

  1. Enable archivelog mode on primary: ALTER DATABASE ARCHIVELOG;
  2. Enable forced logging: ALTER DATABASE FORCE LOGGING;
  3. Create standby redo logs (one more than online redo log groups)
  4. Set primary parameters: DB_UNIQUE_NAME, LOG_ARCHIVE_CONFIG, LOG_ARCHIVE_DEST_2, FAL_SERVER, STANDBY_FILE_MANAGEMENT (example after this list)
  5. Configure TNS entries on both servers
  6. Copy password file to standby
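
A minimal sketch of steps 3 and 4 on the primary. The PROD/PROD_DR names, the PROD_DR TNS alias, and the 1G log size are assumptions for illustration:

-- Step 3: standby redo logs, one group more than the online groups, same size
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 SIZE 1G;

-- Step 4: primary parameters (DB_UNIQUE_NAME defaults to DB_NAME on the primary)
ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(PROD,PROD_DR)';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=PROD_DR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=PROD_DR';
ALTER SYSTEM SET FAL_SERVER='PROD_DR';
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO;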

6.2 Create Physical Standby (RMAN Duplicate)

rman target sys/password@PROD auxiliary sys/password@PROD_DR

DUPLICATE TARGET DATABASE FOR STANDBY
  FROM ACTIVE DATABASE
  DORECOVER
  SPFILE
  NOFILENAMECHECK;

6.3 Start Managed Recovery

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

6.4 Verify Sync

-- On primary
SELECT thread#, max(sequence#) FROM v$archived_log GROUP BY thread#;

-- On standby
SELECT thread#, max(sequence#) FROM v$archived_log
WHERE applied='YES' GROUP BY thread#;

-- Transport and apply lag (query on the standby)
SELECT name, value, time_computed FROM v$dataguard_stats;

7. Data Guard Broker — Use It!

The Data Guard broker is the management framework; DGMGRL is its command-line interface. Always configure it. It simplifies switchover, monitors health, manages protection modes, and enables fast-start failover. After initial setup:

dgmgrl sys/password
DGMGRL> CREATE CONFIGURATION 'DG_CONFIG'
        AS PRIMARY DATABASE IS 'PROD'
        CONNECT IDENTIFIER IS 'PROD';
DGMGRL> ADD DATABASE 'PROD_DR' AS
        CONNECT IDENTIFIER IS 'PROD_DR' MAINTAINED AS PHYSICAL;
DGMGRL> ENABLE CONFIGURATION;
DGMGRL> SHOW CONFIGURATION;

8. Switchover vs Failover — Know the Difference

Switchover (Planned)

  • Graceful role reversal: primary becomes standby, standby becomes primary
  • Zero data loss
  • Used for planned maintenance, DR testing, data center moves
  • Command: DGMGRL> SWITCHOVER TO 'PROD_DR';

Failover (Unplanned)

  • Primary lost — promote standby to primary
  • Potential data loss depending on protection mode
  • Old primary must be reinstated as the new standby (Flashback Database makes this fast)
  • Command: DGMGRL> FAILOVER TO 'PROD_DR';

Fast-Start Failover (FSFO)

Automatic failover triggered by external Observer process. Tunable threshold (default 30 seconds). Best practice: run Observer on a third site, not on primary or standby.
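
A sketch of turning FSFO on, assuming the broker configuration from section 7 (the 30-second threshold shown is just the default made explicit):

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 30;
DGMGRL> ENABLE FAST_START FAILOVER;
DGMGRL> START OBSERVER;

Run START OBSERVER from the third-site host, in a session or service that stays up. Flashback Database should already be enabled (see best practices) so the broker can reinstate the old primary after an automatic failover.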

9. Active Data Guard Use Cases

  • Reporting offload: Run analytics on standby, freeing primary for OLTP
  • Real-time query: ETL/dashboards read fresh data from standby
  • Automatic block repair: Corrupted block on primary auto-fetched from standby
  • RMAN backups from standby: Reduce I/O load on primary
  • DML redirection (19c+): Limited DML on standby auto-forwarded to primary
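
A hedged sketch of the 19c DML redirection item above, enabled per session on the Active Data Guard standby:

-- On the standby, in the session that needs occasional writes
ALTER SESSION ENABLE ADG_REDIRECT_DML;
-- DML in this session is transparently forwarded to the primary;
-- the session sees the result once the generated redo is applied back on the standby

This is meant for occasional writes from reporting tools, not for OLTP throughput.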

10. Monitoring Data Guard Health

Critical queries to run daily:

-- Transport lag
SELECT name, value FROM v$dataguard_stats
WHERE name IN ('transport lag', 'apply lag', 'apply finish time');

-- MRP status
SELECT process, status, sequence# FROM v$managed_standby
WHERE process='MRP0';

-- Gap check
SELECT * FROM v$archive_gap;

-- Broker status
DGMGRL> SHOW CONFIGURATION VERBOSE;
DGMGRL> SHOW DATABASE 'PROD_DR' 'StatusReport';

11. Data Guard Best Practices from Real Production

  • Use Data Guard Broker. Manually managing parameters is error-prone.
  • Configure Standby Redo Logs (SRL) — one more group than primary, same size.
  • Set DB_UNIQUE_NAME differently on each member (e.g., PROD, PROD_DR).
  • Use SCAN/service names rather than specific host IPs in TNS entries.
  • Enable Flashback Database — critical for re-instantiation after failover.
  • Test DR drills quarterly. Failover that's never practiced is failover that doesn't work.
  • Monitor lag continuously — alert if >5 minutes apply lag.
  • Use compression for redo over WAN: COMPRESSION=ENABLE in LOG_ARCHIVE_DEST_2 (requires the Advanced Compression Option; sketch after this list).
  • Document recovery procedures — runbook with exact commands for failover/switchover.
  • Far Sync instance for zero-data-loss across long distances (lightweight redo receiver).
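
The flashback and compression items above in command form, as a sketch; it assumes a fast recovery area is already configured and that PROD_DR is the standby service:

-- Flashback Database (on both primary and standby), needed for quick reinstatement
ALTER DATABASE FLASHBACK ON;

-- Redo compression over the WAN (requires the Advanced Compression Option)
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=PROD_DR ASYNC COMPRESSION=ENABLE VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=PROD_DR';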

12. Common Issues & Solutions

  • Apply lag growing: Check standby I/O, MRP processes, network bandwidth. Increase parallel apply slaves.
  • Archive gap: Verify FAL_SERVER configured. Use ALTER DATABASE REGISTER LOGFILE for manual recovery.
  • ORA-16853 (apply lag has exceeded specified threshold): Network or standby performance issue.
  • Standby out of sync after primary restart: Resync via incremental backup-from-SCN, an RMAN roll-forward, or a full duplicate (sketch after this list).
  • FSFO not triggering: Check Observer is running, thresholds configured, FastStartFailoverThreshold set correctly.
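
For the "standby out of sync" case, 18c and later make the resync much simpler than the classic backup-from-SCN dance. A hedged sketch, assuming PROD is a TNS alias that reaches the primary:

-- On the standby: stop redo apply, roll forward directly from the primary, restart apply
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
RMAN> RECOVER STANDBY DATABASE FROM SERVICE PROD;
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;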

13. Data Guard + RAC = Maximum Availability Architecture

For mission-critical systems, deploy RAC + Data Guard together. Primary site: 2-node RAC for HA. DR site: 2-node RAC standby (or single instance). This is what Oracle calls MAA — Maximum Availability Architecture. It survives node failures, site failures, and even simultaneous failures.

Final Thoughts

Data Guard isn't optional infrastructure — it's survival infrastructure. The right time to deploy Data Guard is before you need it. Every organization I've worked with that had Data Guard in place when disaster struck looks back at it as the single best technology investment they made. Every organization that didn't... has stories I won't repeat.

If you need help designing, deploying, or testing an Oracle 19c Data Guard configuration — whether single-instance, RAC, or full MAA topology — I'm here to help. I've architected Data Guard for banks, pharma, software firms, and trading houses, and have walked clients through both planned switchovers and real-world failures. Don't wait for a disaster to take Data Guard seriously.

🛡️ Need Data Guard / DR Setup?

Disaster recovery planning, Data Guard configuration, switchover testing, and failover drills. Free consultation.
