Oracle Database Health Check: What to Inspect, What to Fix, and What It Costs to Ignore
Most Oracle databases in production have at least one serious problem their DBA does not know about. Not because the DBA is incompetent — but because without a structured health check, the problems that grow slowly never trigger an alert until they become a crisis. A backup that has been silently failing for three weeks. A tablespace at 94% capacity. An index that was dropped during a "quick fix" six months ago and never rebuilt, causing every related query to do a full table scan. A user account that belonged to an employee who left the company a year ago, still active with DBA privileges. This guide walks through every category a proper Oracle health check must cover — with the SQL queries to run it yourself.
1. When Do You Need a Health Check?
A health check is not just for when something is obviously wrong. The organizations that benefit most from health checks are the ones where everything appears to be running fine — because that is exactly when hidden problems are accumulating undetected.
Specific triggers that should prompt an immediate health check:
- New DBA taking over a database they did not build
- Upcoming compliance audit or regulatory inspection
- Application going live on an Oracle database
- Performance degrading gradually over weeks or months
- Before a major Oracle version upgrade
- After a hardware change, storage migration, or server replacement
- Before signing a new IT service contract — understand what you are inheriting
- Annually as routine preventive maintenance — even when everything seems fine
2. Category 1 — Performance
Performance problems rarely appear suddenly. They grow gradually — a query that took 2 seconds a year ago now takes 12 seconds, but nobody noticed because the change was slow. A health check finds these before users start complaining.
2.1 Top Wait Events
Wait events tell you what Oracle is waiting for most. The top 5 wait events explain most performance problems.
-- Top 10 wait events (last 7 days from AWR)
SELECT event, total_waits, time_waited_micro/1000000 AS time_waited_sec,
ROUND(time_waited_micro / SUM(time_waited_micro) OVER() * 100, 1) AS pct_total
FROM (
SELECT event, SUM(total_waits) AS total_waits,
SUM(time_waited_micro) AS time_waited_micro
FROM DBA_HIST_SYSTEM_EVENT
WHERE snap_id IN (
SELECT snap_id FROM DBA_HIST_SNAPSHOT
WHERE begin_interval_time > SYSDATE - 7
)
AND wait_class != 'Idle'
GROUP BY event
)
ORDER BY time_waited_micro DESC
FETCH FIRST 10 ROWS ONLY;
2.2 Top SQL by Elapsed Time
The top 10 SQL statements by elapsed time are responsible for most of the database's workload. Optimizing even one of these can have a dramatic impact on overall performance.
-- Top 10 SQL statements by elapsed time (AWR, last 7 days)
SELECT sql_id,
ROUND(elapsed_time_total/1000000,1) AS elapsed_sec,
executions_total,
ROUND(elapsed_time_total/NULLIF(executions_total,0)/1000000,3) AS avg_sec,
SUBSTR(sql_text,1,80) AS sql_preview
FROM DBA_HIST_SQLSTAT s
JOIN DBA_HIST_SQLTEXT t USING (sql_id)
WHERE snap_id IN (
SELECT snap_id FROM DBA_HIST_SNAPSHOT
WHERE begin_interval_time > SYSDATE - 7
)
ORDER BY elapsed_time_total DESC
FETCH FIRST 10 ROWS ONLY;
2.3 Missing Indexes (Full Table Scans on Large Tables)
-- Large tables with high full table scan counts
SELECT object_name, value AS full_scans
FROM V$SEGMENT_STATISTICS
WHERE statistic_name = 'table scans (long tables)'
AND value > 100
ORDER BY value DESC
FETCH FIRST 20 ROWS ONLY;
2.4 Buffer Cache Hit Ratio
-- Buffer cache hit ratio (should be > 95% for OLTP)
SELECT ROUND(
(1 - (physical_reads / (db_block_gets + consistent_gets))) * 100, 2
) AS buffer_hit_pct
FROM (
SELECT SUM(DECODE(name,'physical reads',value)) AS physical_reads,
SUM(DECODE(name,'db block gets',value)) AS db_block_gets,
SUM(DECODE(name,'consistent gets',value)) AS consistent_gets
FROM V$SYSSTAT
WHERE name IN ('physical reads','db block gets','consistent gets')
);
3. Category 2 — Backup & Recovery Readiness
This is the most critical category. A database with a broken or untested backup is one incident away from data loss that cannot be recovered.
3.1 Last Successful Backup
-- When was the last successful RMAN backup?
SELECT INPUT_TYPE, STATUS,
TO_CHAR(START_TIME,'DD-MON-YYYY HH24:MI') AS started,
TO_CHAR(END_TIME,'DD-MON-YYYY HH24:MI') AS ended,
ROUND(OUTPUT_BYTES/1024/1024/1024,2) AS output_gb
FROM V$RMAN_BACKUP_JOB_DETAILS
WHERE STATUS = 'COMPLETED'
ORDER BY START_TIME DESC
FETCH FIRST 5 ROWS ONLY;
3.2 Archive Log Gap Check
-- Are there archive log sequence gaps? (gaps = unrecoverable periods)
SELECT thread#, MIN(sequence#) AS min_seq, MAX(sequence#) AS max_seq,
MAX(sequence#) - MIN(sequence#) + 1 AS expected,
COUNT(*) AS actual,
CASE WHEN MAX(sequence#) - MIN(sequence#) + 1 = COUNT(*)
THEN 'NO GAPS' ELSE 'GAPS FOUND' END AS status
FROM V$ARCHIVED_LOG
WHERE first_time > SYSDATE - 7
AND standby_dest = 'NO'
GROUP BY thread#;
3.3 FRA Usage
-- FRA usage (> 80% is a warning, > 90% is critical)
SELECT NAME,
ROUND(SPACE_LIMIT/1024/1024/1024,1) AS limit_gb,
ROUND(SPACE_USED/1024/1024/1024,1) AS used_gb,
ROUND(SPACE_USED/SPACE_LIMIT*100,1) AS pct_used,
CASE WHEN SPACE_USED/SPACE_LIMIT > 0.9 THEN 'CRITICAL'
WHEN SPACE_USED/SPACE_LIMIT > 0.8 THEN 'WARNING'
ELSE 'OK' END AS status
FROM V$RECOVERY_FILE_DEST;
4. Category 3 — Storage & Space Management
Tablespace full is one of the most common causes of unplanned Oracle outages — and one of the most preventable. An automated space check should run daily.
4.1 Tablespace Space Usage
-- Tablespace usage with autoextend status
SELECT t.tablespace_name,
ROUND(t.bytes/1024/1024/1024,2) AS total_gb,
ROUND((t.bytes - f.free)/1024/1024/1024,2) AS used_gb,
ROUND(f.free/1024/1024/1024,2) AS free_gb,
ROUND((t.bytes - f.free)/t.bytes*100,1) AS pct_used,
CASE WHEN (t.bytes - f.free)/t.bytes > 0.90 THEN 'CRITICAL'
WHEN (t.bytes - f.free)/t.bytes > 0.80 THEN 'WARNING'
ELSE 'OK' END AS status
FROM (SELECT tablespace_name, SUM(bytes) AS bytes
FROM DBA_DATA_FILES GROUP BY tablespace_name) t
JOIN (SELECT tablespace_name, SUM(bytes) AS free
FROM DBA_FREE_SPACE GROUP BY tablespace_name) f
ON t.tablespace_name = f.tablespace_name
ORDER BY pct_used DESC;
4.2 Top 10 Largest Segments
-- Top 10 largest objects consuming space
SELECT owner, segment_name, segment_type,
ROUND(bytes/1024/1024/1024,3) AS size_gb
FROM DBA_SEGMENTS
ORDER BY bytes DESC
FETCH FIRST 10 ROWS ONLY;
4.3 Redo Log Sizing
-- Redo log switch frequency (should be < 4 switches/hour for OLTP)
SELECT TO_CHAR(first_time,'YYYY-MM-DD HH24') AS hour_block,
COUNT(*) AS log_switches
FROM V$LOG_HISTORY
WHERE first_time > SYSDATE - 3
GROUP BY TO_CHAR(first_time,'YYYY-MM-DD HH24')
ORDER BY 1 DESC;
More than 4 log switches per hour usually means redo logs are undersized — Oracle is writing to new log files too frequently, causing I/O pressure and checkpoint activity.
5. Category 4 — Security & Access Control
Security findings are consistently among the most serious in any Oracle health check. Unauthorized access, excessive privileges, and dormant accounts are all common — and all carry significant risk.
5.1 Users with DBA Role
-- Who has DBA privileges? (should be a very short list)
SELECT grantee, granted_role, admin_option, default_role
FROM DBA_ROLE_PRIVS
WHERE granted_role = 'DBA'
ORDER BY grantee;
5.2 Users with SYSDBA / SYSOPER Privilege
-- Users with SYSDBA or SYSOPER (the highest Oracle privileges)
SELECT * FROM V$PWFILE_USERS
ORDER BY username;
5.3 Dormant Accounts (Not Logged In for 90+ Days)
-- User accounts not used in 90+ days — high risk, should be reviewed
SELECT username, account_status,
TO_CHAR(last_login,'DD-MON-YYYY') AS last_login,
ROUND(SYSDATE - CAST(last_login AS DATE)) AS days_inactive,
profile
FROM DBA_USERS
WHERE account_status = 'OPEN'
AND (last_login IS NULL OR last_login < SYSDATE - 90)
AND username NOT IN ('SYS','SYSTEM','DBSNMP','APPQOSSYS',
'AUDSYS','GSMADMIN_INTERNAL')
ORDER BY last_login NULLS FIRST;
5.4 Default Passwords on Standard Oracle Accounts
-- Check if any Oracle default accounts still have default passwords
SELECT username, account_status
FROM DBA_USERS_WITH_DEFPWD
WHERE account_status = 'OPEN'
ORDER BY username;
5.5 Audit Trail Status
-- Is auditing enabled? What is being audited?
SELECT name, value FROM V$PARAMETER
WHERE name IN ('audit_trail','unified_auditing','audit_sys_operations');
-- How many audit records accumulated (large number = needs housekeeping)
SELECT COUNT(*) AS audit_record_count FROM UNIFIED_AUDIT_TRAIL
WHERE EVENT_TIMESTAMP > SYSDATE - 30;
6. Category 5 — Invalid Objects
Invalid objects — stored procedures, functions, triggers, views, packages — cause application errors and performance problems. They are frequently left behind after schema changes and patch applications.
-- Count invalid objects by owner and type
SELECT owner, object_type, COUNT(*) AS invalid_count
FROM DBA_OBJECTS
WHERE status = 'INVALID'
AND owner NOT IN ('SYS','SYSTEM','AUDSYS','DBSNMP','APPQOSSYS')
GROUP BY owner, object_type
ORDER BY invalid_count DESC;
-- List all invalid objects (for remediation)
SELECT owner, object_name, object_type,
TO_CHAR(last_ddl_time,'DD-MON-YYYY HH24:MI') AS last_changed
FROM DBA_OBJECTS
WHERE status = 'INVALID'
AND owner NOT IN ('SYS','SYSTEM','AUDSYS','DBSNMP','APPQOSSYS')
ORDER BY owner, object_type, object_name;
After a patch or schema change, recompile invalid objects:
-- Recompile all invalid objects (run as DBA)
EXEC UTL_RECOMP.RECOMP_PARALLEL(4);
-- Or recompile a specific schema
EXEC UTL_RECOMP.RECOMP_SERIAL('ERP_OWNER');
7. Category 6 — Configuration & Parameter Review
Oracle initialization parameters set at installation are rarely revisited. As the database grows and workload changes, parameters that were appropriate at go-live can become bottlenecks years later.
7.1 Key Parameters to Review
-- Key parameters affecting performance and stability
SELECT name, value, description
FROM V$PARAMETER
WHERE name IN (
'sga_target', -- Total SGA size
'pga_aggregate_target', -- Total PGA size
'db_cache_size', -- Buffer cache
'shared_pool_size', -- Shared pool
'db_block_size', -- Block size (fixed at creation)
'processes', -- Max concurrent processes
'sessions', -- Max concurrent sessions
'open_cursors', -- Max open cursors per session
'undo_retention', -- Undo retention period (seconds)
'log_buffer', -- Redo log buffer size
'parallel_max_servers', -- Max parallel query servers
'db_file_multiblock_read_count', -- Full scan efficiency
'optimizer_mode', -- ALL_ROWS, FIRST_ROWS, etc.
'cursor_sharing' -- EXACT, FORCE, SIMILAR
)
ORDER BY name;
7.2 Memory Advisor Recommendations
-- SGA component sizing recommendations from Memory Advisor
SELECT component, current_size/1024/1024 AS current_mb,
min_size/1024/1024 AS min_mb,
max_size/1024/1024 AS max_mb
FROM V$SGA_DYNAMIC_COMPONENTS
WHERE current_size > 0
ORDER BY current_size DESC;
7.3 Non-Default Parameters (Customized Settings)
-- Parameters changed from Oracle defaults (review each one)
SELECT name, value, default_value, description
FROM V$PARAMETER
WHERE isdefault = 'FALSE'
AND name NOT IN ('db_name','db_unique_name','control_files',
'db_recovery_file_dest','db_recovery_file_dest_size')
ORDER BY name;
8. Category 7 — High Availability Readiness
Many organizations believe they have HA because they have RMAN backups. A backup is not HA — it is recovery. This category checks whether the database is configured for actual availability.
8.1 Database Mode and Archiving
-- Database must be in ARCHIVELOG mode for online backups and DR
SELECT name, log_mode, open_mode, protection_mode,
database_role, db_unique_name
FROM V$DATABASE;
If LOG_MODE = NOARCHIVELOG — this is a critical finding. The database cannot be backed up online and cannot support Data Guard standby.
8.2 Data Guard Status (If Configured)
-- Check Data Guard configuration status
SELECT database_role, protection_mode, protection_level,
switchover_status, dataguard_broker
FROM V$DATABASE;
-- Check standby apply lag
SELECT name, value, datum_time
FROM V$DATAGUARD_STATS
WHERE name IN ('transport lag','apply lag','estimated startup time');
8.3 Flash Recovery Area Health
-- Check all FRA files and their status
SELECT file_type, percent_space_used, percent_space_reclaimable,
number_of_files
FROM V$RECOVERY_AREA_USAGE
ORDER BY percent_space_used DESC;
9. What a Professional Health Check Delivers
Running the queries above gives you the raw data. A professional health check turns that data into a prioritized action plan. The deliverable from my Oracle health check engagement includes:
- Executive Summary: Overall database health rating (Green / Amber / Red) per category — readable by management without technical background
- Critical Findings: Issues that require immediate action before they cause an incident — ranked by risk
- Performance Analysis: Top wait events, worst-performing SQL, and specific tuning recommendations with expected impact
- Backup Assessment: Honest evaluation of backup completeness and recovery capability — including actual recovery time estimate
- Security Report: All privilege violations, dormant accounts, and default passwords — with remediation steps
- Space Forecast: Tablespace growth projection — how many months until each tablespace hits critical capacity
- Configuration Recommendations: Parameter changes with expected performance impact
- Invalid Objects List: Complete list of invalid schema objects and remediation scripts
- 2-Hour Follow-Up Call: Walk through findings, answer questions, prioritize remediation with your team
10. Common Findings in Bangladesh Oracle Environments
After performing health checks across manufacturing, pharma, banking, and software organizations in Bangladesh, these are the findings I encounter most frequently:
- NOARCHIVELOG mode: Database cannot be backed up online or support DR — immediate risk
- Backup to same storage as database: Single point of failure for both data and backup
- Multiple users sharing one Oracle account: Audit trail is useless — cannot attribute changes to individuals
- Tablespace autoextend unlimited: Can consume all disk space without warning — filesystem fills, server crashes
- SGA undersized for workload: Excessive physical reads — database working much harder than it needs to
- Dozens of invalid objects: Application procedures failing silently or running degraded
- No AWR baselines: No baseline to compare against — impossible to know whether performance has degraded
- FGA / auditing disabled: No audit trail for regulated data — compliance risk in pharma and banking environments
🩺 Get a Professional Oracle Health Check
Full 7-category database audit — performance, backup, storage, security, objects, configuration, and HA readiness. Written report with prioritized findings and a 2-hour follow-up call. Starting from BDT 23,000.
Final Thoughts
An Oracle health check is the DBA equivalent of a medical check-up. Most of the time, nothing dramatic is found. But the value is in the times when something serious is caught early — before the backup fails during a real recovery, before the tablespace fills at midnight, before the dormant DBA account is discovered during a compliance audit.
Every database I have checked has had at least one finding that surprised the existing DBA. Often it is something that has been silently accumulating for months. The cost of catching it in a health check — a few hours of DBA time — is always significantly less than the cost of discovering it during an incident.
If your Oracle database has not had a structured health check in the last 12 months, schedule one. Run the queries in this article as a starting point. And if the results concern you, or if you would rather have an experienced DBA review the findings and give you a prioritized action plan — that is exactly what I offer.
References & Further Reading
- 📄 Oracle Database Performance Tuning Guide — AWR and ADDM (19c)
- 📄 Oracle Dynamic Performance Views Reference (V$ Views)
- 📄 Oracle Database Security Guide — Access Control and Auditing
- 📄 Oracle Database Administrator's Guide — Managing Tablespace Space
The health check queries in this article are drawn from 18+ years of Oracle DBA production practice and Oracle's official diagnostic views documentation.
