How to Build an AI Assistant on Your ERP Data Using Oracle 26ai — RAG Pattern & Vector Search Guide
Your ERP holds years of your business's most important data — purchase orders, inventory movements, production batches, sales history, supplier records, financial transactions. But getting answers from it today means writing a SQL report, waiting for IT, or running a slow manual query. Oracle Database 26ai changes this. With in-database vector search and the RAG (Retrieval-Augmented Generation) pattern, you can build an AI assistant that lets your team ask questions in plain English — "What was our top-selling product in Dhaka last quarter?" — and get accurate, sourced answers from your real ERP data in seconds. No data ever leaves your Oracle database.
1. The Problem: ERP Data Is Locked Behind Queries
Every organization I work with has the same frustration. The data is there — in the Oracle tables — but accessing it requires technical skills most users don't have. A purchasing manager wants to know which supplier has the best on-time delivery rate over the past year. A finance director wants to see which product lines are dragging down margin. A production head wants to know the average downtime per shift over the last quarter.
Each of these questions requires a custom SQL report. That means a ticket to IT, a waiting queue, and a report that is already stale by the time it arrives. And if the question changes slightly — "same analysis but only for Chittagong warehouse" — the cycle starts again.
The alternative — giving users access to write their own SQL — creates security and data integrity risks no organization wants. The real solution is an AI layer that understands natural language questions and translates them into accurate, governed database queries, using only the data the user is authorized to see.
2. What Is RAG (Retrieval-Augmented Generation)?
RAG is the architecture pattern that makes AI assistants accurate instead of just fluent. Without RAG, a Large Language Model (LLM) answers questions from its training data — which is general world knowledge, not your company's ERP records. With RAG, the AI first retrieves relevant data from your database, then uses that retrieved context to generate an answer grounded in your actual business data.
The RAG pattern for ERP data works in three steps:
- User asks a question in plain English: "Which suppliers delivered late more than 3 times this year?"
- The system retrieves the relevant rows from your ERP tables — using vector search to find semantically relevant data, or SQL generation to query structured tables directly
- The AI generates an answer using the retrieved data as context — not from general knowledge, but from your real records
The result: answers that are factually grounded in your data, traceable to specific records, and scoped to only the data the user is authorized to see.
3. Why Oracle 26ai Is the Right Platform for This
You could build a RAG system using external tools — a Python application, a vector database like Pinecone, an external LLM API. Many organizations do. But for ERP data, this approach has serious problems:
- Data leaves your organization: To use an external LLM API, your data must travel over the internet to a third-party server. For manufacturing data, financial records, or pharma batch data, this is a security and compliance risk
- Data duplication: You maintain a separate vector database alongside your Oracle ERP — synchronization, consistency, and additional infrastructure cost
- Access control complexity: Your Oracle row-level security policies don't automatically apply to an external vector database
Oracle 26ai solves all three problems by bringing AI inside the database:
- VECTOR datatype: Store vector embeddings directly in Oracle tables — same row, same table as your ERP data
- AI Vector Index: Purpose-built index for approximate nearest-neighbor vector search — fast, scalable, in-database
- DBMS_VECTOR package: Generate embeddings inside Oracle using built-in or locally-connected models
- Select AI: Natural language to SQL translation built into Oracle — users type questions, Oracle generates and executes the SQL
- In-database LLM inference: Run inference inside Oracle using ONNX models — no external API call, no data movement
Your Oracle security model — VPD, row-level security, object privileges — applies automatically to all AI queries. The user's AI assistant can only access the data their Oracle account can access.
4. The Architecture: ERP + Oracle 26ai AI Assistant
Here is the full architecture for an ERP AI assistant built on Oracle 26ai. Each component runs inside or alongside your existing Oracle database — no new infrastructure required beyond the AI model files.
User Question (plain English)
↓
[Select AI / NL-to-SQL Layer] ← Oracle 26ai built-in
↓
[Vector Search + SQL Query] ← AI Vector Index + standard Oracle SQL
↓
[ERP Tables in Oracle] ← Your existing data, never moves
↓
[In-Database LLM Inference] ← ONNX model inside Oracle
↓
Answer + Source References ← Grounded in your actual records
The user interacts through a simple web interface — a chat window connected to the Oracle database via a lightweight application layer. Every query is logged. Every answer is traceable to the specific ERP rows that generated it.
5. Step 1 — Enable the VECTOR Datatype and AI Vector Index
Oracle 26ai introduces the native VECTOR datatype for storing embedding vectors directly in Oracle tables. To use it for ERP data, you add a vector column to your existing tables (or a companion shadow table if you prefer not to modify the ERP schema).
-- Add vector embedding column to ERP product table (shadow table approach)
CREATE TABLE erp_product_embeddings (
product_id NUMBER NOT NULL,
product_text VARCHAR2(4000), -- The text that was embedded
embedding VECTOR(1536, FLOAT32), -- 1536-dimension OpenAI-compatible vector
embedded_at DATE DEFAULT SYSDATE,
CONSTRAINT fk_product FOREIGN KEY (product_id)
REFERENCES erp.products(product_id)
);
-- Create AI Vector Index for fast similarity search
CREATE VECTOR INDEX product_vec_idx
ON erp_product_embeddings(embedding)
ORGANIZATION NEIGHBOR PARTITIONS
WITH DISTANCE COSINE
WITH TARGET ACCURACY 95;
6. Step 2 — Generate Embeddings from ERP Data
Embeddings are numerical representations of text that capture semantic meaning. To make ERP data searchable by meaning rather than just by keyword, you convert each record's key fields into an embedding vector using DBMS_VECTOR.
-- Generate embeddings for all products using in-database model
BEGIN
FOR rec IN (
SELECT product_id,
product_name || ' ' || category || ' ' ||
NVL(description, '') || ' ' ||
NVL(unit_of_measure, '') AS product_text
FROM erp.products
WHERE active_flag = 'Y'
) LOOP
INSERT INTO erp_product_embeddings
(product_id, product_text, embedding)
VALUES (
rec.product_id,
rec.product_text,
DBMS_VECTOR.UTL_TO_EMBEDDING(
rec.product_text,
JSON('{"provider":"database","model":"my_onnx_model"}')
)
);
END LOOP;
COMMIT;
END;
/
This process runs entirely inside Oracle — no data is sent to any external service. The ONNX embedding model is loaded into the Oracle database once during setup and runs locally on your Oracle server.
7. Step 3 — Configure Select AI for Natural Language SQL
Select AI is Oracle 26ai's built-in natural language to SQL feature. Once configured with your ERP schema metadata, it translates user questions into executable SQL — automatically, inside the database.
-- Create a Select AI profile for the ERP schema
BEGIN
DBMS_CLOUD_AI.CREATE_PROFILE(
profile_name => 'ERP_AI_PROFILE',
attributes => '{
"provider" : "database",
"model" : "my_local_llm_model",
"object_list" : [
{"owner": "ERP", "name": "PRODUCTS"},
{"owner": "ERP", "name": "ORDERS"},
{"owner": "ERP", "name": "ORDER_LINES"},
{"owner": "ERP", "name": "SUPPLIERS"},
{"owner": "ERP", "name": "INVENTORY"},
{"owner": "ERP", "name": "PURCHASE_ORDERS"}
]
}'
);
END;
/
-- Test: ask a natural language question
SELECT DBMS_CLOUD_AI.GENERATE(
prompt => 'Which 5 products had the highest sales value last month?',
profile_name => 'ERP_AI_PROFILE',
action => 'narrate'
) AS answer
FROM DUAL;
Oracle translates the question into a SQL query against your ERP tables, executes it, and returns the answer as natural language — all inside the database engine.
8. Step 4 — Vector Search for Unstructured ERP Content
Not all ERP data is neatly structured in columns. Supplier notes, quality observations, production remarks, and customer complaint records are often stored as free text. Vector search retrieves relevant text passages even when the user's question uses different words.
-- Find quality observations semantically similar to a user's question
SELECT
qo.observation_id,
qo.batch_number,
qo.observation_text,
VECTOR_DISTANCE(qoe.embedding, :user_query_vector, COSINE) AS similarity
FROM
quality_observations qo
JOIN quality_obs_embeddings qoe ON qo.observation_id = qoe.observation_id
WHERE
VECTOR_DISTANCE(qoe.embedding, :user_query_vector, COSINE) < 0.3
ORDER BY
similarity ASC
FETCH FIRST 10 ROWS ONLY;
The :user_query_vector parameter is the embedding of the user's question — generated at query time using the same model. This finds the 10 most semantically relevant quality observations, regardless of exact wording.
9. Step 5 — The Governed Access Model
Security is where in-database AI wins decisively over external AI tools. Your existing Oracle security model applies automatically:
9.1 Row-Level Security (VPD)
If a warehouse manager can only query their own warehouse's inventory in the ERP application, the same restriction applies to their AI assistant queries. Oracle's Virtual Private Database policies append WHERE clauses to every query — including AI-generated queries — transparently and automatically.
-- VPD policy: restrict inventory queries to user's assigned warehouse
CREATE OR REPLACE FUNCTION inventory_policy(
schema_name IN VARCHAR2,
table_name IN VARCHAR2
) RETURN VARCHAR2 AS
BEGIN
RETURN 'warehouse_id = SYS_CONTEXT(''USERENV'', ''CLIENT_INFO'')';
END;
/
DBMS_RLS.ADD_POLICY(
object_schema => 'ERP',
object_name => 'INVENTORY',
policy_name => 'WAREHOUSE_FILTER',
function_schema=> 'ERP',
policy_function=> 'INVENTORY_POLICY'
);
9.2 Column-Level Masking
Sensitive columns — employee salaries, supplier pricing, customer credit limits — can be masked or hidden from specific user roles using Oracle Data Masking or column-level privileges. The AI assistant respects these controls: if a user cannot SELECT a column directly, the AI cannot return it in an answer either.
9.3 Query Audit Trail
Every AI query is logged: who asked what, when, what SQL was generated, and what rows were accessed. This audit trail supports compliance requirements and security review — especially important in pharma, banking, and regulated manufacturing environments.
10. Use Cases by ERP Module
10.1 Inventory & Warehouse
- "What items are below reorder level in the Gazipur warehouse right now?"
- "Which 10 products have had no movement in the last 60 days?"
- "What is the total stock value by category?"
10.2 Procurement & Suppliers
- "Which suppliers delivered late more than twice this quarter?"
- "What is the average lead time for API raw materials?"
- "Show me all purchase orders over BDT 5 lakh pending approval"
10.3 Sales & Revenue
- "What were the top 5 products by revenue in Chittagong last month?"
- "Which customers haven't placed an order in 90 days?"
- "Compare this month's sales to the same month last year by product line"
10.4 Manufacturing & Production
- "What is the average batch yield for Product X over the last 6 months?"
- "Which production lines had the most downtime this week?"
- "Are there any batches still in QC hold from last month?"
10.5 Finance & Accounts
- "What is the total outstanding receivables from distributors?"
- "Which cost centers exceeded budget this quarter?"
- "Show me the trend of raw material costs over the last 12 months"
11. How This Compares to External AI Tools
Many organizations are tempted to use ChatGPT, Gemini, or other external AI tools for business data analysis. Here is why that approach fails for ERP data:
| Factor | External AI (ChatGPT etc.) | Oracle 26ai In-Database |
|---|---|---|
| Data privacy | Data sent to external server | Data never leaves Oracle |
| Accuracy | Hallucinations on your data | Answers from actual records |
| Access control | No Oracle security policies | VPD + roles apply automatically |
| Data freshness | Stale export / CSV upload | Live production data |
| Audit trail | No query logging | Full Oracle audit trail |
| Compliance | Fails pharma / banking rules | Satisfies regulated industry needs |
12. Implementation Approach
A practical Oracle 26ai ERP AI assistant implementation typically runs in three phases:
Phase 1 — Foundation (Weeks 1–3)
- Oracle 26ai installation or upgrade from existing version
- ONNX embedding model load into Oracle
- Schema analysis — identify the 5–8 most-queried ERP tables
- Vector embedding generation for key data
- Select AI profile configuration for structured queries
- Basic security model review
Phase 2 — AI Assistant Interface (Weeks 4–6)
- Lightweight web interface — chat window connected to Oracle
- User authentication tied to Oracle accounts
- Query routing: structured questions → Select AI, semantic search → vector index
- Answer formatting and source citation
- Query audit logging
Phase 3 — Rollout & Tuning (Weeks 7–8)
- Pilot with 5–10 key users (one per department)
- Embedding refresh schedule for live data
- Fine-tuning query templates for common question patterns
- User training (typically 1–2 hours)
- Handover documentation
13. What You Need to Get Started
The minimum requirements for an Oracle 26ai ERP AI assistant:
- Oracle Database 26ai (or upgrade from 19c / 21c — see our upgrade guide)
- Enterprise Edition license — required for AI Vector Index and VPD
- Server resources: Additional RAM for in-database model inference (typically 16–32 GB extra, depending on model size)
- ONNX embedding model: An open-source model (e.g., all-MiniLM-L6-v2 at 23MB) or a larger model for higher accuracy
- ERP schema documentation: Table and column descriptions improve Select AI accuracy significantly
🤖 Ready to Build an AI Assistant on Your ERP Data?
I design and implement Oracle 26ai AI assistants for ERP environments in Bangladesh and worldwide — from vector setup to governed access model to user interface. Your data stays in your Oracle database throughout.
Final Thoughts
The promise of AI for business data has always been: ask a question, get an answer. For years, that promise broke on the reality of data privacy, security, and accuracy. External AI tools could answer questions about the world but not reliably about your ERP — and certainly not without your data leaving your control.
Oracle 26ai's in-database AI architecture changes the equation fundamentally. Vectors live in your Oracle tables. Inference runs on your Oracle server. Security policies apply automatically. The result is an AI assistant that is simultaneously powerful, private, accurate, and compliant — without a second infrastructure stack to manage.
For manufacturing organizations, pharma companies, banks, and trading businesses sitting on years of Oracle ERP data, this is not a future technology. It is available today, and the implementation path is straightforward for any organization running a modern Oracle Database.
References & Further Reading
- 📄 Oracle AI Vector Search — Overview and Architecture (Oracle Docs)
- 📄 Oracle Select AI — Natural Language to SQL (Oracle Docs)
- 📄 Oracle AI Vector Search — Retrieval Augmented Generation (RAG)
- 📄 Oracle Database Concepts — Virtual Private Database (VPD)
This article is based on hands-on Oracle 26ai implementation experience and Oracle's official AI vector search and Select AI documentation.
