This study guide covers all official exam topics for Oracle 1Z0-184-25, with in-depth explanations, code examples (SQL, PL/SQL, Python), and practice questions. The guide is organized according to the main subject areas of the exam and emphasizes Oracle’s latest 23c/23ai features and documentation. Use the structured headings and lists to navigate each topic, and refer to the cited Oracle documentation for authoritative details.
1. Understanding Vector Fundamentals (20%)
Oracle Database 23c introduces native vector data capabilities to store and query high-dimensional vector embeddings for AI/ML applications. This section covers the basics of Oracle's vector data type, how to use it in tables, how to measure similarity with distance functions, and how to manipulate vector data with SQL and PL/SQL (DML and DDL operations).
Vector Data Type in Oracle 23c
- What is the VECTOR data type? It's a new data type in Oracle 23c designed for storing vector embeddings: arrays of numbers representing unstructured data (text, images, etc.) in a semantic vector space. Each vector's position encodes meaning, so semantically similar content yields nearby vectors (i.e., small distances in the vector space).
- Defining vector columns: You can declare a table column as `VECTOR`, optionally specifying the number of dimensions and the numeric format, e.g. `VECTOR(768, FLOAT32)`. With that declaration, every vector stored must have 768 dimensions, each a FLOAT32. You may also omit the dimensions/format (using `VECTOR` or `VECTOR(*, *)`), which makes the column flexible (it can store vectors of varying sizes and formats). However, mixing dimensions in one column is not useful for similarity search, because different models' embeddings aren't comparable.
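The two declaration styles can be sketched as follows (table and column names are illustrative):

```sql
-- Fixed shape: every vector must have 768 FLOAT32 dimensions
CREATE TABLE docs_fixed (
  id        NUMBER PRIMARY KEY,
  embedding VECTOR(768, FLOAT32)
);

-- Flexible: any dimension count and any format accepted
CREATE TABLE docs_flex (
  id        NUMBER PRIMARY KEY,
  embedding VECTOR(*, *)
);
```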
- Dimension and format constraints: Vectors must have at least 1 dimension (up to 65,535 for non-binary formats). Supported element formats are `INT8` (8-bit integer), `FLOAT32` (32-bit float), `FLOAT64` (64-bit float), and `BINARY` (bit-packed binary). For binary vectors, the dimension must be a multiple of 8, since bits are packed into bytes.
- Dense vs. sparse storage: By default, vectors are stored in dense form (all dimensions stored). Oracle also supports sparse vectors, optimized for data with many zeros. Declaring `VECTOR(dims, type, SPARSE)` stores only the non-zero values, saving space when appropriate. (Sparse and dense vectors cannot be mixed in the same column, sparse cannot be used with the BINARY format, and sparse vectors currently cannot be declared as PL/SQL variables.)
- Internal storage: Both dense and sparse vectors are internally stored as SecureFile BLOBs. Typical embedding sizes (hundreds of dimensions) result in a few KB per vector. Oracle provides formulas to estimate storage size for dense and sparse vectors for capacity planning.
- Example – creating and inserting vectors: In SQL, a vector can be written textually as a comma-separated list of values in square brackets; Oracle automatically casts this text to the VECTOR type. For example, inserting the literal `'[10, 20, 30]'` into a `VECTOR(3, INT8)` column stores a 3-dimensional vector of INT8 values.
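A minimal sketch of that example (the table name is illustrative):

```sql
CREATE TABLE vec_demo (
  id NUMBER PRIMARY KEY,
  v  VECTOR(3, INT8)
);

-- The bracketed text literal is cast to VECTOR automatically
INSERT INTO vec_demo (id, v) VALUES (1, '[10, 20, 30]');

SELECT v FROM vec_demo WHERE id = 1;
```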
Note: If a vector column is defined with a fixed dimension, inserting a vector of a different length raises an error. For instance, a `VECTOR(128, FLOAT32)` column cannot accept a 256-dimension vector; all vectors in that column must have the same dimensionality. If you define a flexible dimension (using `*`), you can mix dimensions, but remember that you cannot create a single index on mixed-dimension data, because distance computations require consistent dimensions.
Vector Distance Functions and Similarity Metrics
Once data is stored as vectors, the core operation is measuring the distance (or similarity) between two vectors. Oracle provides the `VECTOR_DISTANCE(vector1, vector2, metric)` SQL function to compute this. The function supports several distance metrics (also called similarity metrics) to suit different embedding models:
- Supported metrics: `EUCLIDEAN` (a.k.a. L2 distance), `EUCLIDEAN_SQUARED` (L2 squared), `COSINE` (cosine distance), `DOT` (dot product), `MANHATTAN` (L1 distance), and `HAMMING` (for binary vectors). If no metric is specified, the default is COSINE.
  - Cosine measures the angle between vectors, ignoring magnitude, and is common for text embeddings. Oracle returns 1 − cos(angle) as the distance, so a smaller score means more similar.
  - Euclidean is the straight-line distance in the vector space.
  - Dot uses the dot product (often used when embeddings are normalized, where a higher dot product means greater similarity); Oracle treats it as a distance metric, in effect ranking larger dot products as "closer."
  - Manhattan is the sum of absolute differences, and Hamming counts differing bits (for binary vectors).
- Choosing the right metric: Use the distance measure recommended by your embedding model. For example, OpenAI's text embeddings are typically compared with cosine similarity, while some computer-vision models may prefer Euclidean or dot product. The distance function in queries must match the metric the index was created with (see the Vector Index section), or the index won't be used.
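To build intuition for how these metrics differ, here is a small, self-contained Python sketch (illustrative only, not Oracle code). Cosine is shown as 1 − cos(angle), and dot as a negated dot product, matching the "smaller score = more similar" convention described above:

```python
import math

def cosine_distance(a, b):
    # 1 - cos(angle): 0.0 for identical directions, up to 2.0 for opposite
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

def euclidean_distance(a, b):
    # L2: straight-line distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    # L1: sum of absolute differences
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming_distance(a, b):
    # count of differing positions (binary vectors)
    return sum(1 for x, y in zip(a, b) if x != y)

def dot_distance(a, b):
    # negated dot product: a larger dot product yields a smaller "distance"
    return -sum(x * y for x, y in zip(a, b))

u, v = [1.0, 0.0], [0.0, 1.0]
print(cosine_distance(u, v))     # orthogonal vectors -> 1.0
print(manhattan_distance(u, v))  # 2.0
print(hamming_distance([1, 0, 1], [1, 1, 1]))  # 1
```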
- Example – using VECTOR_DISTANCE: Suppose a table `docs(embedding VECTOR(768, FLOAT32))` and a query vector `q_vec` (perhaps representing a search query in the same 768-dim space). You can find the top 5 most similar documents by ordering on cosine distance and limiting the result with `FETCH FIRST 5 ROWS ONLY`. This computes the cosine distance between `embedding` and the query vector for each row, then returns the 5 rows with the smallest distances (most similar). Without an index, this is an exact k-nearest-neighbors search that scans the data.
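A sketch of that query (column names other than `embedding` are illustrative):

```sql
SELECT id, title,
       VECTOR_DISTANCE(embedding, :q_vec, COSINE) AS dist
FROM   docs
ORDER  BY dist
FETCH FIRST 5 ROWS ONLY;
```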
Performance: Without an index,
VECTOR_DISTANCE
in a query will compute distances row-by-row (full scan or range scan). This is fine for smaller tables or initial testing. However, for millions of vectors, consider using vector indexes (approximate search indexes) to speed up queries (discussed in the next section). -
- Distance calculations in PL/SQL: You can call `VECTOR_DISTANCE` in PL/SQL contexts as well (e.g., selecting into a variable). Oracle also provides a PL/SQL package `DBMS_VECTOR` with a utility function `DBMS_VECTOR.QUERY` to perform similarity search, and even `DBMS_VECTOR.RERANK` for refining results. For the fundamentals, though, understanding the SQL function is sufficient.

DML Operations on Vectors (INSERT, UPDATE, DELETE)
- Inserting vectors: As shown earlier, you can insert vector values using the textual `[x,y,...]` representation or by using the `TO_VECTOR` function to cast from text/JSON. `TO_VECTOR` parses a JSON-style array or the bracketed representation into the VECTOR type. If the column has a defined dimension and the input doesn't match it, an error occurs, so ensure your data has the correct length.
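A sketch of both insert styles (the `docs` table and its 3-dimension column are illustrative):

```sql
-- Implicit cast from a bracketed text literal
INSERT INTO docs (id, embedding) VALUES (1, '[0.12, 0.34, 0.56]');

-- Explicit cast with TO_VECTOR(text, dimensions, format)
INSERT INTO docs (id, embedding)
VALUES (2, TO_VECTOR('[0.11, 0.22, 0.33]', 3, FLOAT32));
```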
- Updating vectors: Vector columns can be updated like any other column, e.g. `UPDATE docs SET embedding = '[0.1, 0.2, 0.3]' WHERE id = 1;` (names and values illustrative) replaces the old vector with a new one. Under the hood, updating a vector marks associated vector indexes (if any) for maintenance or accuracy checks. Note that HNSW indexes are static once built (no DML allowed; see the Vector Index section), and IVF indexes can degrade in accuracy with updates. Thus, if you update vector data frequently, you may need to rebuild indexes periodically.
- Deleting vectors: Deleting rows that contain vectors is straightforward with `DELETE` statements. If a vector index exists on the table, Oracle handles removing the index entries. For IVF, large deletions or insertions may reduce index accuracy over time. You can monitor index quality and rebuild if necessary (using `DBMS_VECTOR.INDEX_ACCURACY_QUERY` and `DBMS_VECTOR.REBUILD_INDEX`).
- Transaction considerations: Vector data participates in transactions like normal data; all regular ACID properties apply. One thing to be mindful of is consistency of dimensions: if you allowed a mix of dimensions by declaring `VECTOR(*, *)`, ensure at the application level that you don't insert inconsistent embeddings by mistake. It's often best to enforce one dimension per table (either by declaring it or by using only one embedding model for that table).
DDL Operations on Vectors (Table and Index Definitions)
- Creating tables with VECTOR columns: We have seen examples of `CREATE TABLE` with a vector column. You can also use `ALTER TABLE` to add a vector column, for instance `ALTER TABLE docs ADD (summary_vec VECTOR(512, FLOAT32));` (table and column names illustrative).
- Defining constraints: Vector columns can be nullable or NOT NULL. They can't be primary keys (they are complex types), but you can put the primary key on an ID column and store the vector as a regular column. A uniqueness constraint on a high-dimensional vector has no practical meaning (and is not typically needed).
- Index DDL: Creating vector indexes is a specialized DDL operation covered in the next section (it is an exam topic on its own). In short, you use the `CREATE VECTOR INDEX ...` syntax. There are currently no function-based indexes on expressions involving vectors (you index the vector column itself). Also, only one vector index can exist per vector column.
- Metadata and dictionary views: After defining vector columns or indexes, you can find them in Oracle's data dictionary:
  - USER/ALL/DBA_TAB_COLUMNS shows the data type as `VECTOR`.
  - USER/ALL/DBA_INDEXES lists vector indexes with `INDEX_TYPE='VECTOR'` and `INDEX_SUBTYPE` indicating `INMEMORY_NEIGHBOR_GRAPH_HNSW` or `NEIGHBOR_PARTITIONS_IVF`.
  - There are also special views like `VECSYS.VECTOR$INDEX` that store index parameters and status information.
- Compatibility: To use the vector data type and indexes, the database COMPATIBLE parameter must be 23.4.0 or higher (for Oracle 23c). Ensure your system is properly upgraded, or, if using Autonomous Database, use a release that includes these 23ai features.
Practice Questions (Vector Fundamentals):
- You create a table with a `VECTOR(128, FLOAT32)` column. What happens if you attempt to insert a 256-dimensional vector into this column?
  - Answer: The INSERT fails with an error (ORA-51801), because the vector's dimension count doesn't match the column's defined 128 dimensions. Oracle requires all vectors in a fixed-dimension vector column to have the same number of dimensions.
- Which of the following distance metrics is the default for `VECTOR_DISTANCE` if none is specified: (A) EUCLIDEAN, (B) COSINE, (C) DOT, (D) MANHATTAN?
  - Answer: (B) COSINE is the default distance metric if none is specified. (Oracle supports Euclidean, Cosine, Dot, Manhattan, Hamming, etc., but Cosine is used by default for vector searches.)
- True or False: Oracle's VECTOR data type can store vectors of different numeric formats (e.g., INT8 and FLOAT32) in the same column.
  - Answer: True. If you declare a column as `VECTOR(*, *)` (no fixed format), Oracle allows any format and dimension. However, mixing formats is usually discouraged unless necessary, and mixing dimensions prevents indexing on that column. In practice, you'd pick one format for consistency.
2. Using Vector Indexes (15%)
For large datasets, vector indexes dramatically speed up similarity searches by avoiding scanning every vector. Oracle 23c supports Approximate Nearest Neighbor (ANN) indexing methods that trade a small amount of accuracy for large gains in query performance. There are two vector index types in Oracle: HNSW (Hierarchical Navigable Small World) and IVF (Inverted File). This section explains how to create and use these indexes.
Overview of Oracle Vector Indexes
- Why vector indexes? A brute-force search computes the distance to every vector, which is O(N) per query. Vector indexes use clever data structures to prune the search space, finding the nearest neighbors much faster (in sub-linear time). Oracle's implementations are based on popular ANN algorithms:
  - HNSW builds an in-memory graph of proximity (a navigable "small world" network of vectors).
  - IVF partitions vectors into clusters and searches only the most likely clusters for each query.
- Index types: Oracle categorizes vector indexes into two families:
  - In-Memory Neighbor Graph index – this is HNSW. It primarily uses memory for the index graph structure, yielding extremely fast (CPU-bound) lookups at query time.
  - Neighbor Partition index – this is IVF (specifically IVF Flat). It's a disk-based index that partitions vectors into "neighbor" groups on storage.
  Oracle's user guide confirms HNSW is the only supported in-memory graph index and IVF (flat) is the only supported partition index. In summary, HNSW and IVF are the two index mechanisms provided.
- General syntax: To create a vector index, use `CREATE VECTOR INDEX` with the target table and vector column, and specify the type via the `ORGANIZATION` clause: `ORGANIZATION INMEMORY NEIGHBOR GRAPH` for HNSW, or `ORGANIZATION NEIGHBOR PARTITIONS` for IVF. The minimal requirement is to specify the column and one of the two organizations (which implies the index type). Additional parameters can tune the index (discussed below).
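Minimal sketches of both forms (index and table names illustrative; only one vector index is allowed per column, so you would create one or the other):

```sql
-- HNSW: in-memory neighbor graph
CREATE VECTOR INDEX docs_hnsw_idx
  ON docs (embedding)
  ORGANIZATION INMEMORY NEIGHBOR GRAPH;

-- IVF: neighbor partitions (disk-based)
CREATE VECTOR INDEX docs_ivf_idx
  ON docs (embedding)
  ORGANIZATION NEIGHBOR PARTITIONS;
```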
- Distance metric in indexes: You can (and should) specify the distance metric the index will optimize for (Cosine, Euclidean, etc.). If not specified, it defaults to COSINE. The metric must match what you'll use in queries; otherwise, the index won't be utilized. For example, if your embeddings are best compared with dot product, create the index with `DISTANCE DOT` and use `VECTOR_DISTANCE(..., DOT)` in queries.
- Target accuracy: When building an ANN index, you can set a target accuracy percentage. This is a trade-off knob: 100% means try to find the true nearest neighbors (at the cost of more processing), while lower values allow more approximation for faster queries. By default, Oracle uses 95% if not specified. You set it with the `WITH TARGET ACCURACY <percent>` clause during index creation. The actual achieved accuracy can be measured after creation (with `INDEX_ACCURACY_QUERY`), and you can override the target accuracy per query if needed (see the Similarity Search section).
- Parallel and partitioning: You can create vector indexes in parallel by adding a `PARALLEL n` clause, as with other indexes (useful for indexing large datasets). A vector index on a partitioned table is global: it spans all table partitions. You cannot yet create a local partitioned vector index tied to each table partition.
- Restrictions: Vector indexes cannot be created on certain special table types (external tables, clustered tables, global temporary tables, etc.). Also, only one vector index per column is allowed (you choose either HNSW or IVF); attempting to create a second index on the same vector column raises an error.
Next, we delve into each index type:
HNSW Index (Hierarchical Navigable Small World Graph)
The HNSW index is an in-memory graph structure for approximate nearest neighbor search:
- Characteristics: HNSW indexes are kept in the vector memory pool (in the SGA) and primarily use RAM for query-time speed. They organize vectors in layers of a graph where each vector (node) links to its nearest neighbors, forming a "small world" network that can be navigated quickly. The algorithm is known for high recall (accuracy) even at low query latencies, making it great for real-time search.
- Memory allocation: You must allocate adequate memory for HNSW indexes via the `VECTOR_MEMORY_SIZE` parameter (similar in spirit to Oracle's In-Memory column store, but separate). For example, run `ALTER SYSTEM SET vector_memory_size = 2G SCOPE=SPFILE;` and restart to reserve 2 GB for vector indexes. Oracle provides the view `V$VECTOR_MEMORY_POOL` to check memory usage and the function `DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR` to estimate the memory needed for an index. Insufficient memory will limit the size of index you can build.
- HNSW parameters: When creating an HNSW index, you can optionally provide parameters:
  - `neighbors <M>` – the maximum number of neighbors each node links to (called M in HNSW literature). A higher M means more connections (more memory and build time, potentially better accuracy).
  - `efconstruction <EF>` – the number of candidate neighbors considered during index construction (controls quality vs. build speed). A higher EF gives better graph quality (and thus search accuracy) but a slower index build.
  - If not provided, Oracle uses defaults (the exact values are in the documentation). You must specify `type HNSW` in the PARAMETERS clause if you include these, e.g. `PARAMETERS (type HNSW, neighbors 32, efconstruction 300)`.
- Creating an HNSW index – example: Consider an HNSW index on the `documents.embedding` vector column, using Cosine distance and aiming for 95% accuracy, with HNSW-specific parameters: each vector connects to up to 40 neighbors in the graph, and index construction uses an `efconstruction` of 500 for quality. (These are relatively high values, favoring accuracy; the defaults may be lower.)
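The DDL for that example can be sketched as (index name illustrative):

```sql
CREATE VECTOR INDEX documents_hnsw_idx
  ON documents (embedding)
  ORGANIZATION INMEMORY NEIGHBOR GRAPH
  DISTANCE COSINE
  WITH TARGET ACCURACY 95
  PARAMETERS (type HNSW, neighbors 40, efconstruction 500);
```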
- Usage: After creation, Oracle keeps the HNSW graph in memory. Querying with `... ORDER BY VECTOR_DISTANCE(column, :query_vec, COSINE) FETCH APPROX ...` traverses the graph to find nearest neighbors (see the next section for query syntax). Because it's approximate, some nearest neighbors might be missed, but at a 95% target accuracy almost all top results are correct. You can increase accuracy per query (at the cost of checking more nodes) by specifying a higher `efsearch` in the fetch clause.
- Limitations: HNSW indexes are static once built: no DML (insert/update/delete) is allowed on the base table while an HNSW index exists. Oracle's documentation indicates you should not perform DML after creating an HNSW index; you have to drop and rebuild the index if the data changes, because maintaining the graph incrementally is not supported in this release. HNSW is therefore best for largely read-only datasets, or scenarios where you can batch updates and rebuild the index.
- RAC considerations: HNSW indexes are not available on RAC (Real Application Clusters) in this release; they are local to one instance's memory. If you need multiple instances, you may have to use IVF, or use a sharded approach with a separate index on each shard.
IVF Index (Inverted File Flat Index)
The IVF index takes a clustering approach:
- Characteristics: IVF stands for Inverted File (Flat) index. It partitions the vector space into a number of clusters (typically via k-means or similar). Each cluster has a centroid, and each vector belongs to one cluster. The index is essentially a list of cluster centroids plus an assignment of each vector to a cluster. At query time, the search examines the cluster(s) closest to the query vector and then scans only the vectors in those clusters.
- On-disk structure: IVF indexes are stored on disk (and use temporary space heavily during creation). They can handle larger datasets that don't fit entirely in memory, at the cost that querying may involve disk I/O (though Oracle can cache frequently accessed parts). They are well suited for very large vector collections.
- IVF parameters: Key tuning parameters for IVF:
  - `neighbor partitions <P>` – the number of clusters (partitions) to build into the index. For example, `neighbor partitions 100` creates 100 clusters of vectors. More partitions means each cluster is smaller (faster to scan), but you may need to probe more clusters to find all the true neighbors.
  - In examples, the index is created with `ORGANIZATION NEIGHBOR PARTITIONS ... PARAMETERS (type IVF, neighbor partitions 10)`; if the parameter is omitted, Oracle chooses a number of clusters internally.
  - How many clusters (centroids) are probed at query time is controlled indirectly via `WITH TARGET ACCURACY`, either at index creation or per query. (In some ANN libraries you specify this directly, e.g. search the top 5 clusters out of 100.)
- Creating an IVF index – example: Consider an IVF index on `documents.embedding` using Cosine distance, a target accuracy of 90% (faster but slightly less accurate than the default), and `neighbor partitions 10` (10 clusters). With only 10 clusters, each cluster holds many vectors, but a query searches only the few clusters nearest to the query vector.
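The DDL for that example can be sketched as (index name illustrative):

```sql
CREATE VECTOR INDEX documents_ivf_idx
  ON documents (embedding)
  ORGANIZATION NEIGHBOR PARTITIONS
  DISTANCE COSINE
  WITH TARGET ACCURACY 90
  PARAMETERS (type IVF, neighbor partitions 10);
```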
- DML on IVF: Unlike HNSW, IVF indexes do allow DML on the base table, with caveats. Inserted vectors are assigned to existing clusters on the fly; deletes and updates are reflected in the index. However, as DML accumulates, index quality can degrade – for example, new vectors may pile up in a few clusters until the data is re-clustered. Oracle provides `DBMS_VECTOR.INDEX_ACCURACY_QUERY` to check whether the index is still performing well; if accuracy drops, rebuild the index (`ALTER INDEX ... REBUILD`, or via DBMS_VECTOR) to re-cluster the data. Also note: truncating the table invalidates an IVF index (marks it unusable), so you must rebuild it after a truncate.
- Persistence: Because IVF is on-disk, it persists across database restarts (unlike HNSW, which needs to be rebuilt or reloaded into memory after each restart). This makes IVF suitable when you can't afford to rebuild indexes at every startup, or when the data is truly large.
- Index size and storage: Oracle recommends a large TEMP tablespace when building IVF indexes on big datasets, since the clustering process uses significant temporary space. The final index is stored in the database. You can inspect `USER_INDEXES` or `VECSYS.VECTOR$INDEX` to see the index subtype (`NEIGHBOR_PARTITIONS_IVF`) and parameters such as the number of partitions.
- Query performance: IVF can scale to millions of vectors. A query typically: computes the nearest cluster(s) to the query vector, fetches those clusters' members (which may involve disk reads if not cached), and computes exact distances for those candidates. More partitions means fewer vectors per cluster (faster scans), but searching too few clusters risks missing results. Oracle's target accuracy setting controls how many clusters are considered – e.g., 90% accuracy searches fewer clusters than 99%. You can adjust this per query if needed (see the next section).
Using Vector Indexes in Queries
Creating the index is half the battle – you must also query in a way that uses the index:
- Approximate search syntax: Oracle uses an extension of the SQL `FETCH` clause to trigger index usage: append `FETCH APPROXIMATE FIRST <N> ROWS ONLY` (or the shorthand `FETCH APPROX`) to your query. This tells the optimizer to consider the ANN index for retrieving the top N similar items. `FETCH APPROXIMATE FIRST 3 ROWS ONLY`, for example, signals an ANN search for the 3 nearest neighbors by the given distance metric, using any available vector index on the column. If an index with a matching metric exists, it is used to satisfy the query quickly; if there is no index, Oracle ignores `APPROXIMATE` and does a full sort (backward compatible, but slower). Important: the `APPROXIMATE` (or `APPROX`) keyword is required to get the performance benefit. Without it, Oracle performs an exact sort of all distances; its presence tells the optimizer it can do a "top-N approximate" using the index.
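A sketch of an approximate top-3 query (names illustrative):

```sql
SELECT id, title
FROM   documents
ORDER  BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH APPROXIMATE FIRST 3 ROWS ONLY;
```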
- Ensuring index usage: In summary, to use a vector index:
  - Include `APPROX`/`APPROXIMATE` in the fetch clause.
  - Use the same distance function as the index metric (a mismatch means no index usage).
  - Do not wrap `VECTOR_DISTANCE` in another function or arithmetic – the ORDER BY must be a simple ordering on the distance.
  - Avoid SQL features that inhibit index usage (e.g., certain partitioned row filters, or mixing exact and approximate logic in a complex way).
  If these conditions are met, the explain plan should show a VECTOR INDEX scan operation.
- Overriding accuracy: To override the index's default accuracy or search parameters at query time, specify `WITH TARGET ACCURACY <p>` or even `WITH TARGET ACCURACY PARAMETERS (...)` in the fetch clause. A target accuracy of 90, for instance, may search fewer neighbors or clusters for that query. In HNSW, `efsearch` is a query-time parameter (how many candidates to evaluate), and Oracle allows specifying it this way; increasing `efsearch` improves recall at the cost of more work per query.
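Sketches of both per-query overrides (names illustrative):

```sql
-- Lower the target accuracy for this query only
SELECT id FROM documents
ORDER  BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH APPROXIMATE FIRST 10 ROWS ONLY
WITH TARGET ACCURACY 90;

-- Pass an HNSW query-time parameter instead
SELECT id FROM documents
ORDER  BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH APPROXIMATE FIRST 10 ROWS ONLY
WITH TARGET ACCURACY PARAMETERS (efsearch 500);
```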
can improve recall at cost of more work per query. -
Multi-threaded search: It’s not explicitly documented, but since you can create the index with parallel degree, Oracle might also search in parallel threads if you hint parallelism or if it internally does for large top-K. This might not be needed often, as the index search is very fast single-threaded, but keep in mind for huge data.
- Maintaining index quality: Over time, if you've done DML (especially on IVF indexes), periodically verify the index accuracy with `DBMS_VECTOR.INDEX_ACCURACY_QUERY`. It outputs a report of the achieved accuracy (comparing an exact top-K against the indexed top-K). If the result is significantly below the target accuracy, consider rebuilding the index with `ALTER INDEX ... REBUILD` or `DBMS_VECTOR.REBUILD_INDEX`. Rebuilding refreshes the clustering without needing to drop and recreate the index.
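A PL/SQL sketch of the accuracy check (owner, index name, and the query-vector bind are illustrative; parameter names follow the DBMS_VECTOR package documentation):

```sql
DECLARE
  report VARCHAR2(4000);
BEGIN
  -- Compare indexed top-10 results against exact top-10 for a sample query vector
  report := DBMS_VECTOR.INDEX_ACCURACY_QUERY(
              owner_name      => 'VECTOR_USER',
              index_name      => 'DOCUMENTS_IVF_IDX',
              qv              => :query_vec,
              top_k           => 10,
              target_accuracy => 90);
  DBMS_OUTPUT.PUT_LINE(report);
END;
/
```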
- Resource usage: Keep an eye on memory usage for HNSW. Use `V$VECTOR_MEMORY_POOL` to see how much of the vector pool is in use. If you have multiple HNSW indexes and limited memory, size the pool accordingly (or consider IVF if memory is a constraint).
- One index per query: Oracle currently does not combine multiple vector indexes in a single query. If you have two different vector columns, query them separately; there is no "index merge" for vector search (it wouldn't make sense for a single similarity metric). Each query uses at most one vector index, so plan accordingly (usually you index one main embedding column per table).
Practice Questions (Vector Indexes):
- You created an HNSW vector index on a table. What must you include in your SELECT query to ensure the index is used for faster search?
  - Answer: You must include the `APPROXIMATE` (or `APPROX`) keyword in the `FETCH FIRST ... ROWS ONLY` clause, e.g. `... ORDER BY VECTOR_DISTANCE(...) FETCH APPROXIMATE FIRST N ROWS ONLY`. This signals the optimizer to use the HNSW index for an approximate search instead of doing a full sort.
- Which statement is TRUE about HNSW vs. IVF indexes in Oracle?
  A. HNSW indexes are stored on disk and allow efficient inserts/updates.
  B. IVF indexes partition vectors into clusters and can degrade in accuracy after many updates.
  C. HNSW indexes can be used on RAC instances for distributed in-memory search.
  D. IVF indexes cannot be rebuilt without dropping and recreating the index.
  - Answer: B is true. IVF (Inverted File) indexes cluster the data, and their accuracy can degrade with lots of DML (thus requiring periodic rebuilds). A is false (HNSW is in-memory, not on disk, and does not support DML without a rebuild). C is false (HNSW is not supported on RAC in the current release). D is false (IVF indexes can be rebuilt in place with `ALTER INDEX ... REBUILD` to refresh the clustering).
- After creating an IVF index with `TARGET ACCURACY 95`, you want a particular query to run faster at, say, 80% accuracy. How can you achieve that?
  - Answer: Specify a lower accuracy in the query's FETCH clause, e.g. `FETCH APPROXIMATE FIRST 5 ROWS ONLY WITH TARGET ACCURACY 80`. This overrides the index's default accuracy for that query, reducing the number of clusters or vectors searched and thereby increasing speed (at the risk of slightly lower accuracy).
3. Performing Similarity Search (15%)
This section covers how to execute similarity searches in Oracle – both exact searches (for full accuracy) and approximate searches using the indexes – as well as performing multi-vector similarity search (searching with multiple query vectors or ensuring results cover multiple categories/documents).
Exact vs. Approximate Similarity Search
- Exact similarity search: An exact search finds the true nearest neighbors by computing all distances. In Oracle, you do this by not using the `APPROXIMATE` keyword: the query sorts the entire table by distance to `:qvec` and returns the top N, yielding exact results (the true N closest vectors). Under the hood, Oracle does a full-table scan with a top-N sort. Exact search is acceptable for smaller tables, or when 100% precision is needed and the cost is manageable, but it doesn't scale well to very large datasets (millions of embeddings), which is why approximate indexes exist.
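A sketch of an exact top-10 search (names illustrative):

```sql
-- No APPROXIMATE keyword: full scan, exact results
SELECT id, title
FROM   documents
ORDER  BY VECTOR_DISTANCE(embedding, :qvec, COSINE)
FETCH FIRST 10 ROWS ONLY;
```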
Approximate similarity search: As discussed, to do an ANN search, you use the
FETCH APPROXIMATE
clause with an index in place. For example:This will use the vector index (HNSW or IVF) to quickly retrieve 10 neighborsdocs.oracle.com. The results should be the 10 nearest or very close to it, but there is a small chance some neighbors are not the absolute closest if the index is not at 100% accuracy. In practice, at high target accuracy (90-95%), the returned set is usually identical to the exact set. The benefit is speed – these queries are much faster for large data (sub-second even if the table has millions of rows, depending on hardware and index parameters).
- When to use which: If your data volume is low (say, thousands of vectors) or absolute precision is required (and time is not critical), run exact searches. If you have large data (hundreds of thousands of vectors or more) or need very fast responses (real-time applications), use approximate search with indexes. You can also mix approaches – for instance, use an approximate search to get the top 100 and then re-rank them exactly (Oracle provides `DBMS_VECTOR.RERANK` for this kind of refinement).
Quality of ANN results: You can measure how good the approximate results are by checking recall (the percentage of true neighbors found). Oracle’s
INDEX_ACCURACY_QUERY
essentially does this by comparing against a brute-force result docs.oracle.com. For mission-critical use (for example, where a slightly wrong nearest neighbor could cause an issue), test with sample queries to ensure the approximation is acceptable. Often a slight drop (e.g., missing 1 of the top 10) is tolerable in AI applications. -
Combining with filters: You can combine similarity search with other predicates. For example:
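One possible shape for such a filtered query (names are illustrative; DOT requests dot-product distance):

```sql
SELECT id, name
FROM   products
WHERE  category = 'Electronics'
ORDER  BY VECTOR_DISTANCE(embedding, :qvec, DOT)
FETCH APPROXIMATE FIRST 5 ROWS ONLY;
```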
This will first filter to Electronics, then do an approximate top-5 by dot product. The vector index can still be used when a filter is present (typically the index is applied after the filter during execution). Oracle's optimizer might apply the category filter first (using a normal index on
category
if available), then use the vector index to find nearest neighbors among that subset. If the subset is still large, the vector index helps; if the subset is tiny, an exact search may be just as fast. -
Hybrid searches: A hybrid query uses both vector similarity and standard conditions, possibly balancing them in a combined score. Oracle 23c also introduced "hybrid search", where you can combine vector distance with keyword text search. For example, you might want items that are semantically similar but that also contain a certain keyword. Oracle's documentation describes Hybrid Search as a concept (combining vector search with relational filters or text) docs.oracle.com. While not explicitly in the exam topics, know that you can incorporate vector search into SQL queries like any other function: any normal WHERE clause can coexist with the ORDER BY VECTOR_DISTANCE.
Multi-Vector Similarity Search
Multi-vector search refers to querying with multiple query vectors or ensuring the results are diversified across different query criteria. In Oracle’s context, one prominent use-case is multi-document search – for example, in a document retrieval scenario, you might want the top relevant piece from each document rather than all top pieces from the same document.
-
Use-case: Imagine you have a table of document chunks with a column
doc_id
(identifying which document the chunk came from) and
chunk_embedding
. A normal similarity search might return multiple chunks from the single most similar document. But if the goal is to find the most relevant documents (not just chunks), you'd want only the top chunk per document, and then take the best among those. This is where multi-vector (or multi-group) search comes in. -
Oracle’s approach: Oracle allows a
FETCH ... PARTITIONS BY ...
syntax to facilitate this. It is essentially the SQL standard "top-N per group" idea with an extension for approximate search. In such a query,
PARTITIONS BY doc_id, 1 ROWS ONLY
means: partition the data by
doc_id
and, within each partition (per document), find the 1 closest row. Then, out of those partition winners, return the first 2 partitions (so 2 documents). This effectively gives the top chunk from the top 2 documents relevant to the query dineshbandelkar.com. Oracle's documentation calls this multi-vector similarity search docs.oracle.com, but it's basically a SQL way to enforce diversity (one result per group). The term "multi-vector" might also be interpreted as using multiple query vectors at once (e.g., multiple criteria vectors where you want results close to all of them), but the exam description specifically says "for multi-document search", which aligns with the partitioned fetch approach.
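A sketch of such a partitioned query, assuming a docs_chunks table with doc_id, chunk_id, content, and chunk_embedding columns:

```sql
SELECT doc_id, chunk_id, content
FROM   docs_chunks
ORDER  BY VECTOR_DISTANCE(chunk_embedding, :qvec, COSINE)
FETCH FIRST 2 PARTITIONS BY doc_id, 1 ROWS ONLY;  -- best chunk of the top 2 docs
```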
-
How it works: The
FETCH FIRST N PARTITIONS BY col, M ROWS ONLY
clause is a powerful extension. It finds the top M rows for each distinct value of
col
(here
doc_id
), then returns the top N of those groups according to the overall ORDER BY. In the example above, M=1 (one chunk per document) and we take N=2 partitions (so 2 documents). This ensures no document is represented more than once in the results, and we get the two most relevant documents. -
Example scenario: Suppose Document A has chunks that rank 1st, 2nd, 5th closest to the query, and Document B has its closest chunk ranking 3rd, Document C’s best is 4th. A normal top-5 would give: A1, A2, B1, C1, A3. But using the partition approach for top 3 documents (N=3, M=1) would give: A1, B1, C1 – the best chunk of A, best of B, best of C. This way, you cover more documents in the top results. This is extremely useful for RAG (Retrieval Augmented Generation), because you often want to feed the LLM information from different sources, not five nearly-duplicate chunks from one source.
-
Multi-query vectors: Another interpretation: If you literally have multiple query vectors (say you want items similar to either of two different vectors), you could query them separately and merge results. Oracle doesn’t directly support multiple vector inputs in one
VECTOR_DISTANCE
call (it compares pairwise). But you could order by the least of the two distances, taking the minimum distance as the score. This finds items that are close to either query1 or query2. It is a form of multi-vector query too, though not as commonly needed. It might not use indexes effectively (and it is not an official feature), but conceptually it's possible if needed.
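An illustrative (unofficial) form of such a merged query, scoring each row by its smaller distance to either query vector:

```sql
SELECT id, name
FROM   products
ORDER  BY LEAST(VECTOR_DISTANCE(embedding, :qvec1, COSINE),
                VECTOR_DISTANCE(embedding, :qvec2, COSINE))
FETCH FIRST 10 ROWS ONLY;
```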
-
Summary: Multi-vector similarity search, in the context of the exam, is about retrieving results across multiple categories (such as documents) by using the SQL partition-by technique dineshbandelkar.com. It's a feature that not all databases have natively, and it's very handy for diversifying results.
-
Performance considerations: The
PARTITIONS BY ...
in the FETCH clause likely means Oracle will use the vector index to find nearest chunks while tracking the best row within each document: it can iterate through nearest neighbors and keep one per doc_id until it has collected N documents. This adds slight overhead over a plain top-N search, but it is still far faster than running a separate search per document; it remains a single query handled by the engine.
Practice Questions (Similarity Search):
-
You have 10 million vectors and want to find the top 5 most similar to a query vector. What is the most efficient way in Oracle to do this?
-
Answer: Create a vector index (HNSW or IVF) on the data and perform an approximate similarity search using
ORDER BY VECTOR_DISTANCE(...) FETCH APPROXIMATE FIRST 5 ROWS ONLY
docs.oracle.com. This will use the index to quickly find the 5 nearest neighbors, which is far more efficient than scanning all 10 million vectors.
-
-
In a document retrieval scenario, how can you ensure that your query returns at most one result per document, even if one document has many top relevant chunks?
-
Answer: Use a multi-vector (partitioned) similarity search with the
FETCH FIRST ... PARTITIONS BY document_id ...
clause dineshbandelkar.com. For example,
FETCH FIRST 3 PARTITIONS BY doc_id, 1 ROWS ONLY
will give the top 1 chunk from each of the top 3 documents, ensuring diversity of documents in the results.
-
-
True or False: Adding
FETCH APPROXIMATE
to a similarity query will always return the exact same top results as an exact search. -
Answer: False.
FETCH APPROXIMATE
uses ANN indexes and might return slightly different results, since it sacrifices some accuracy for speed. At a high target accuracy the results are usually identical, but an exact match is not guaranteed unless target accuracy is 100% (and even then there are edge cases). The purpose is to allow a slight difference in exchange for much faster performance blog.marvik.ai.
-
4. Using Vector Embeddings (15%)
This section addresses how to generate and manage the vector embeddings themselves – both outside the database (using external ML models or services) and inside the database (using Oracle’s built-in capabilities). We also cover best practices for storing these embeddings in Oracle and tools for loading them.
Generating Vector Embeddings Outside Oracle
Often, embeddings are produced by machine learning models outside of the database. For example, you might use Python libraries or AI services to convert text or images into vectors:
-
Common libraries/services: You can use frameworks like PyTorch or TensorFlow (with pretrained models), libraries like Hugging Face Transformers, or cloud APIs (OpenAI, Azure Cognitive Services, etc.) to generate embeddings. For text, OpenAI’s GPT or Ada models can produce 1536-dim embeddings; for images, models like CLIP or ResNet can produce vectors; for recommendations, you might have custom models.
-
Process: Typically:
-
Extract or prepare the data you want to embed (e.g. sentences, product descriptions, images).
-
Feed them into the ML model to get embedding vectors (numpy arrays or Python lists).
-
Load those vectors into the Oracle database (via inserts or bulk loading).
-
-
Example (Python with HuggingFace):
import torch
import oracledb  # python-oracledb, the renamed cx_Oracle driver
from transformers import AutoTokenizer, AutoModel

# Load a pretrained sentence-embedding model (model name is illustrative)
model_id = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

text = "Oracle Database 23c introduces vector search."
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    # Mean-pool the token embeddings into one sentence embedding
    embeddings = model(**inputs).last_hidden_state.mean(dim=1).numpy()
vec = embeddings[0]  # numpy array

# Connect to Oracle and insert
conn = oracledb.connect(user="usr", password="pwd", dsn="db_host/db_service")
cur = conn.cursor()
# Convert the numpy array to a list, then to a bracketed string for insertion:
vector_list = vec.tolist()
cur.execute("INSERT INTO my_vectors(id, embedding) VALUES (:1, to_vector(:2))",
            [101, str(vector_list)])  # str(list) yields "[0.123, 0.456, ...]"
conn.commit()
In this example, we generated an embedding for a sentence using a transformer model and inserted it into Oracle. We used to_vector(:2)
so that Oracle casts the string representation of the list to the vector type (the :2
parameter should be a string like '[0.123,0.456,...]'
). Another approach is to pass the vector as a binary blob if using Oracle's binary vector format, but textual is easier.
-
Batch loading: If you have many embeddings (say, you vectorized an entire corpus), individual inserts will be slow. Instead, you can use SQL*Loader or Oracle external tables to bulk load. Oracle's SQL*Loader (sqlldr) in 23c supports reading vectors from text files or even the binary
.fvec
format (a simple binary float array format) blog.marvik.ai. This allows high-speed loading of large vector datasets. We cover SQL*Loader more in the related tools section. -
Data formats: When exporting from Python, you might create a CSV where one column is the vector. You could store it as a JSON string like
"[0.1,0.2,...]"
in the file. SQL*Loader can parse that into a VECTOR if configured. Alternatively, use the binary
.fvec
format (which Oracle can directly ingest as a BLOB for the vector). The choice depends on volume and convenience: text is easier to inspect, binary is more efficient. -
Third-party vector databases: As an aside, sometimes vectors might already reside in a vector database or another source. Oracle’s strategy (and this exam’s perspective) is you can consolidate by bringing those vectors into Oracle, avoiding the need for a separate vector DB. GoldenGate could even replicate from another source if needed. But the simplest: export vectors and IDs to CSV/JSON, then load into Oracle.
Generating Vector Embeddings Inside Oracle
Oracle Database 23c provides the ability to generate embeddings within the database, which is a standout feature. This is done via the Oracle AI Vector Search Vector Utilities:
-
ONNX model support: Oracle can import pre-trained machine learning models in ONNX (Open Neural Network Exchange) format into the database. Once a model is loaded, you can call it on data stored in Oracle to produce embeddings docs.oracle.com. This works for text, images, or other data, as long as you have an ONNX model that takes that data type and outputs a vector.
-
Loading models: Use
DBMS_VECTOR.LOAD_ONNX_MODEL
to load a model file into the database docs.oracle.com. This will store the model in an internal table (likely as a BLOB) and register it with a name. Once loaded, the model can be used in SQL functions.
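A hedged example of the call (the directory object, file name, and model name are illustrative):

```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'MODEL_DIR',               -- database directory object holding the file
    'all_MiniLM_L12_v2.onnx',  -- ONNX file name
    'AIDEMO_DOC_MODEL');       -- name under which the model is registered
END;
/
```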
-
VECTOR_EMBEDDING SQL function: Oracle provides a SQL function
VECTOR_EMBEDDING(model_name USING data_column AS data)
that applies a loaded embedding model to input data dineshbandelkar.com. For instance, if you have a table
texts(content CLOB)
and loaded a text embedding model named
AIDEMO_DOC_MODEL
, you can select the embedding of each row's content directly. This returns a VECTOR for each content value, which you can insert into your vector table or use on the fly in queries. In the example from the Oracle demo, they accepted a text input and applied the same function to it to get a query vector inside PL/SQL dineshbandelkar.com.
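Two illustrative usages, following the table and model names above:

```sql
-- Embed every row's content
SELECT VECTOR_EMBEDDING(AIDEMO_DOC_MODEL USING content AS data) AS embedding
FROM   texts;

-- Inside PL/SQL: embed one input to get a query vector
SELECT VECTOR_EMBEDDING(AIDEMO_DOC_MODEL USING :user_text AS data)
INTO   q_vec
FROM   dual;
```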
-
PL/SQL APIs: The
DBMS_VECTOR
package also has
UTL_TO_EMBEDDING
and
UTL_TO_EMBEDDINGS
, procedures that generate embeddings from data and can handle multiple pieces of data (like chunking text, then embedding each chunk) in one go docs.oracle.com. The demo snippet we saw used
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS
and
UTL_TO_EMBEDDINGS
to chunk a large document and get embeddings for each chunk in a pipeline dineshbandelkar.com. Essentially, Oracle can orchestrate the whole pipeline of splitting text, embedding, and storing results via these utilities. -
In-database vs. outside: Generating inside Oracle is convenient (no need to move data out) and ensures the data never leaves the database environment, which can be good for security. However, you need an ONNX model – which might not always be readily available for latest architectures (you might have to convert a PyTorch/TensorFlow model to ONNX). Oracle’s approach allows you to also call out to third-party APIs if needed:
DBMS_VECTOR.UTL_TO_EMBEDDING
can call REST APIs for embedding if you provide an endpoint and credentials docs.oracle.com. This means that even if you use OpenAI or another provider, you could orchestrate it from PL/SQL (though that might be slower due to network calls). -
Performance: In-DB embedding generation can use Oracle’s CPU resources. It might be fine for moderate volumes, but if you need to embed millions of items, you’ll likely do it offline with a dedicated ML pipeline which may use GPUs. The ideal approach could be: use external GPU-based generation for initial bulk, then maybe use Oracle’s internal models for smaller on-the-fly tasks or updates.
-
Choose the approach: The exam expects you to know both ways. Outside Oracle: use Python/ML tools, then store the vectors in Oracle. Inside Oracle: use Oracle Machine Learning (OML) and Vector Search features to generate embeddings directly in SQL/PLSQL docs.oracle.com. A savvy approach combines them: generate the initial corpus embeddings outside (for speed), load them into Oracle, then use Oracle's built-in model (perhaps a smaller one) to embed new incoming data or user queries on the fly.
Storing and Managing Embeddings in Oracle
Once you have embeddings, you need to store and maintain them efficiently in Oracle:
-
Schema design: Usually you’ll have a table with an ID, maybe some metadata, and a VECTOR column for the embedding. Example:
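A possible schema of this shape (names and dimensions are illustrative):

```sql
CREATE TABLE docs_chunks (
  doc_id    NUMBER,
  chunk_id  NUMBER,
  content   CLOB,
  embedding VECTOR(768, FLOAT32),
  CONSTRAINT docs_chunks_pk PRIMARY KEY (doc_id, chunk_id)
);
```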
Sometimes you might separate the raw data from the embeddings, but having them side by side is convenient and ensures consistency (you can update embeddings whenever you update the source text, for instance).
-
Compression: Since vectors are stored as SecureFile BLOBs internally, Oracle’s compression features for SecureFile might apply. If vectors are large and many, enabling SecureFile compression on the tablespace or using
COMPRESS HIGH
could save space (with some CPU cost). However, if you plan to frequently use them in similarity searches, leaving them uncompressed (or using MEMCOMPRESS for In-Memory if loaded into memory for HNSW) might be better for performance. This is an advanced tuning consideration – not likely on the exam, but good to know. -
Indexing: We covered vector indexes. Typically, right after loading embeddings, you’ll create an index (unless your use-case only does exact search). If you have multiple different types of embeddings (e.g., one for product description text, one for product image), you might have two separate tables or two vector columns in one table. In that case, you could index each column separately. Keep track of which index is on what.
-
Updating embeddings: If the source data changes (e.g., you edited a document), you should regenerate its embedding and be sure to update the vector in the table. IVF indexes will reflect the change (with potential slight loss of index optimality until a rebuild). For HNSW, you'd drop and rebuild the index after a batch of such changes (or choose a design that can tolerate rebuilding). -
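If an in-database model is loaded, refreshing a changed document's embeddings can be a single UPDATE (the model and table names are illustrative):

```sql
UPDATE docs_chunks
SET    embedding = VECTOR_EMBEDDING(AIDEMO_DOC_MODEL USING content AS data)
WHERE  doc_id = :changed_doc_id;
```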
-
Loading tools:
-
SQL*Loader: As mentioned, it now supports the vector data type. You can specify the vector column in the control file so that a bracketed list in the data file is parsed into a VECTOR, or use
LOBFILE
if reading binary. This allows high-speed direct path loads of vectors, which is ideal for the initial population of millions of vectors. -
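A rough control-file sketch for a CSV whose last field is a bracketed vector literal; the exact vector-column clause should be checked against the SQL*Loader documentation, as this is an assumption:

```text
LOAD DATA
INFILE 'reviews_vec.csv'
INTO TABLE my_vectors
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
  id,
  embedding CHAR(64000)  -- text like "[0.1,0.2,...]" converted to VECTOR on load
)
```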
Oracle Data Pump: Data Pump (expdp/impdp) fully supports the VECTOR type. You can export a table with vectors and import it into another database. Data Pump treats the vector as it would any LOB or user-defined type. This matters for backups and for moving data from development to production. So yes, you can use Data Pump to unload and load vector data just like any table mylearn.oracle.com (the exam topic explicitly mentions Data Pump).
-
-
Vector as part of ML pipeline: Oracle’s convergence means you can also use the stored embeddings in its machine learning algorithms. For example, Oracle 23c can use the VECTOR column in SQL analytic functions or in OML algorithms (like maybe OML4SQL could do clustering on vectors, etc.). One can even imagine writing a custom PL/SQL to do k-means on these vectors, but that might be reinventing what the index already does. Still, keep in mind the vector data is accessible for any purpose, not just search.
-
Security: Treat vector data as sensitive if it encodes sensitive info. For example, face embeddings might be considered personal data. Use Oracle’s security features (TDE encryption, redaction, etc.) accordingly. By default, SecureFiles LOBs can be encrypted transparently if you enable it. This might be beyond the exam scope, but relevant in real deployments.
Practice Questions (Vector Embeddings):
-
Name two ways to generate vector embeddings for use in Oracle Database.
-
Answer: One way is to generate embeddings outside the database using Python or ML services (for example, using a HuggingFace Transformer or OpenAI API, then insert the vectors into Oracle). Another way is to generate embeddings inside the database by loading an ONNX model and using Oracle’s
VECTOR_EMBEDDING
function or
DBMS_VECTOR
procedures to compute embeddings in SQL/PLSQL docs.oracle.com dineshbandelkar.com.
-
-
You have a CSV file of customer reviews with pre-computed embedding vectors (as 1536 comma-separated numbers in quotes). Which Oracle tool can you use to bulk-load these into a table with a VECTOR column?
-
Answer: Use SQL*Loader. Oracle’s SQL*Loader supports loading vectors from text files blog.marvik.ai. You can define the VECTOR column in the control file format so that the comma-separated list is parsed into the VECTOR data type. This will efficiently batch load all embeddings.
-
-
How can Oracle Database generate a vector embedding for a piece of text without any external services?
-
Answer: By using a pre-trained embedding model loaded into the database. You would load an ONNX format model via
DBMS_VECTOR.LOAD_ONNX_MODEL
, then call the
VECTOR_EMBEDDING(model_name USING text_column AS data)
function in a query dineshbandelkar.com. Oracle will run the model on the text to produce an embedding vector internally.
-
5. Building a Retrieval-Augmented Generation (RAG) Application (25%)
Retrieval-Augmented Generation (RAG) is an architectural pattern where a Large Language Model (LLM) is supplemented with relevant data retrieved from a knowledge base. In our context, Oracle Database (with vector search) serves as the knowledge store that provides relevant pieces of data to feed into an LLM prompt. This section breaks down RAG concepts and how to implement RAG using Oracle – via PL/SQL and via Python.
What is Retrieval-Augmented Generation (RAG)?
-
RAG concept: RAG combines retrieval (searching for relevant context) with generation (using an LLM to produce an answer or content). For example, imagine a chatbot that can answer questions about your company’s documents. Rather than relying solely on the LLM’s trained knowledge (which might be outdated or incomplete), the system will retrieve the most relevant documents from a database using the query, and then augment the LLM’s input with those documents so that it can generate a factual, up-to-date answer. This reduces hallucination and improves accuracy by grounding the LLM in real data oneoracledeveloper.comdocs.oracle.com.
-
Components: A typical RAG pipeline has:
-
Question processing – take the user’s question or query.
-
Embedding the query – convert the question into a vector embedding (using the same model as the knowledge base embeddings).
-
Retrieval – perform a similarity search in the vector database (Oracle) to get top relevant documents or chunks.
-
Augmentation – compile those retrieved pieces into a prompt or context.
-
Generation – feed prompt + retrieved context into an LLM (like GPT-4, etc.) to get an answer.
-
Return answer – possibly with references or confidence.
-
-
Oracle’s role: Oracle Database can handle steps 2 and 3 extremely well: it can generate the query embedding (via
VECTOR_EMBEDDING
or an external call) and store/retrieve documents using AI Vector Search. Steps 4-5 (augmentation and generation) typically involve the LLM, which might be an external service (OpenAI, etc.) or possibly an on-prem model. Oracle even integrates with this via features like Select AI, which can orchestrate prompts to LLMs directly from the database docs.oracle.com, but we'll talk about that in the next section. -
RAG in exam scope: The exam expects you to understand RAG concepts oneoracledeveloper.com and how to implement a simple RAG using PL/SQL and Python. So we will outline those implementations next.
RAG with Oracle Database and PL/SQL
Building a RAG workflow in PL/SQL means the database itself will orchestrate the retrieval and possibly even call an LLM API to get the answer:
-
Data preparation: Suppose we have a table
docs_chunks(doc_id, chunk_id, content, embedding VECTOR(768, FLOAT32))
which contains chunks of text from documents and their embeddings. We also have an ONNX model loaded for generating query embeddings (or we use an external API via PL/SQL). -
Retrieval in PL/SQL: We can write a stored procedure that takes a user question and returns an answer. The steps inside:
-
Generate the query’s embedding vector. If a model is loaded, use
VECTOR_EMBEDDING
. For example, in pseudo-code:
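A hedged PL/SQL sketch of this procedure, reconstructed from the step descriptions in this section (the model and table names are illustrative, and call_llm_api is the hypothetical wrapper the text describes):

```sql
CREATE OR REPLACE FUNCTION answer_question(p_question IN VARCHAR2)
RETURN CLOB IS
  q_vec     VECTOR;
  l_context CLOB;
BEGIN
  -- 1) Embed the question with an in-database ONNX model
  SELECT VECTOR_EMBEDDING(AIDEMO_DOC_MODEL USING p_question AS data)
  INTO   q_vec
  FROM   dual;

  -- 2) Retrieve the top 3 most similar chunks (exact search for simplicity)
  FOR r IN (SELECT content
            FROM   docs_chunks
            ORDER  BY VECTOR_DISTANCE(embedding, q_vec, COSINE)
            FETCH FIRST 3 ROWS ONLY)
  LOOP
    l_context := l_context || r.content || CHR(10);
  END LOOP;

  -- 3) Send question + context to an LLM (call_llm_api wraps UTL_HTTP
  --    or an AI service)
  RETURN call_llm_api(p_question, l_context);
END;
/
```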
-
We get the query embedding
q_vec
. (If no internal model is loaded, we could instead call an API like OpenAI to get the embedding here using UTL_HTTP, or use
DBMS_VECTOR.UTL_TO_EMBEDDING
to call a third-party embedding service if configured docs.oracle.com.) -
We then use a cursor to fetch the top 3 chunks by similarity. This uses exact search for simplicity; in practice, we might want
APPROXIMATE
if performance is needed and an index exists. But since it's a top-3 fetch and an index is presumably present, Oracle will likely use it anyway (the optimizer can use the index with or without APPROXIMATE; to be sure, we could include
FETCH APPROXIMATE
explicitly). -
We concatenate those chunks into a single CLOB
l_context
. This is the retrieved context to feed the LLM. -
Finally, we call some
call_llm_api
function (which would use something like
UTL_HTTP
to send
p_question
and
l_context
to an AI service, or potentially use Oracle's Select AI if on Autonomous Database).
The
call_llm_api
might do an HTTP POST to an OpenAI endpoint with a prompt constructed as: "Answer the question based on the context: {context} \n Q: {question} \n A:" and read back the response. Oracle's
UTL_HTTP
can be used to call external REST services; credentials can be stored via
DBMS_CLOUD.CREATE_CREDENTIAL
(or
DBMS_VECTOR.CREATE_CREDENTIAL
for vector-specific calls) docs.oracle.com.
-
Return the answer back to the caller (which could be a user or an application).
-
-
In-DB Generation (Optional): Oracle’s
DBMS_VECTOR
package has a
UTL_TO_GENERATE_TEXT
function docs.oracle.com, which suggests the ability to generate text from a prompt (likely calling an LLM under the hood via an API, or an internal model if one is loaded, e.g., an ONNX LLM or a cloud AI service). In theory, one could use
UTL_TO_GENERATE_TEXT
to avoid manual UTL_HTTP calls, e.g.
DBMS_VECTOR.UTL_TO_GENERATE_TEXT( json('{ "prompt": "some prompt"}') )
, but the documentation is sparse. Likely, it leverages an AI service configured in Select AI or OCI. If the exam expects knowledge of this, just note that it exists; implementing RAG via straightforward API calls is fine. -
Putting it together: The entire RAG pipeline can run inside Oracle: from the question to the answer. This is powerful because you could then expose it via an Oracle REST Data Service or PL/SQL API to applications, keeping all data and logic on the DB side.
-
PL/SQL vs external: The advantage of PL/SQL approach is centralized logic and possibly security (the database can control the access to both data and the AI API). The downside is that it might be harder to integrate certain complex LLM logic or parse results in PL/SQL. However, Oracle’s integration features (Select AI) are making this easier by allowing natural language queries and RAG directly in SQL docs.oracle.com.
RAG with Oracle Database and Python
In many cases, application developers will implement RAG in an app layer using Python. Oracle still plays the role of the retriever (knowledge store). Python will handle orchestrating the query to Oracle and calling the LLM.
-
Architecture: A typical Python RAG app might use libraries like LangChain to simplify this. LangChain has an integration for Oracle vector stores python.langchain.com, meaning you can use Oracle as a vector database in a LangChain
VectorStore
with minimal custom code. Even without LangChain, you can do it manually. -
Steps in Python:
-
Embed the query: Use the same embedding model that was used for the documents. This could be a local model or an embedding API. For instance, with OpenAI:
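A sketch, assuming the OpenAI Python client (openai>=1.0) is in use; the model name is illustrative, and the helper that renders the vector in Oracle's bracketed text form is our own addition:

```python
def embed_query_openai(client, question):
    # client is an openai.OpenAI() instance; returns a plain list of floats
    resp = client.embeddings.create(model="text-embedding-ada-002",
                                    input=question)
    return resp.data[0].embedding

def vector_to_sql_literal(vec):
    # Oracle's to_vector() accepts a bracketed text form like '[0.1,0.2]'
    return "[" + ",".join(repr(x) for x in vec) + "]"
```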
-
Query Oracle for similar vectors: Connect to Oracle using
oracledb
(the Python driver, formerly cx_Oracle), then execute a parameterized query. Here we pass the query vector as a string like "[0.123,0.456,...]" and use
to_vector(:qv)
in the SQL to cast it dineshbandelkar.com. The database will compute distances and return the top 3 chunks (with their doc IDs and content). Alternatively, if an index is present and we want to ensure it is used, we could include
APPROXIMATE
in the SQL: the same query with
FETCH APPROXIMATE FIRST 3 ROWS ONLY
would leverage the index. Oracle's client driver supports it just fine. -
Compose the prompt: Take the retrieved chunk texts (say we got 3) and construct a prompt for the LLM. For example:
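A simple prompt builder of the kind described (the wording of the template is our own):

```python
def build_prompt(question, chunks):
    # Join the retrieved chunks into one context block, then append the question
    context = "\n\n".join(chunks)
    return ("Answer the question based on the context below.\n\n"
            "Context:\n" + context + "\n\nQ: " + question + "\nA:")
```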
-
Call LLM for answer: Use an LLM API (OpenAI, etc.) to get completion:
or if using a chat model:
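A chat-model sketch, again assuming the OpenAI Python client (openai>=1.0); the model name is an assumption:

```python
def build_messages(question, context):
    # Ground the model in the retrieved context via the system message
    return [
        {"role": "system",
         "content": "Answer using only the provided context:\n" + context},
        {"role": "user", "content": question},
    ]

def ask_llm(client, question, context):
    # client is an openai.OpenAI() instance
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=build_messages(question, context))
    return resp.choices[0].message.content
```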
-
Return or display answer.
-
-
LangChain approach: If using LangChain, you’d do something like:
Then use a
RetrievalQA
chain with an LLM from LangChain, etc. Under the hood, it performs similar steps. -
Python vs PL/SQL RAG: Python gives more flexibility in interacting with various AI services and doing complex logic with ease of coding. PL/SQL keeps everything in the DB and could be triggered by DB events or used by internal applications (like APEX or ORDS). Some may use a hybrid: Oracle does retrieval and returns data to a middle-tier which does generation.
-
Performance considerations: Doing the vector search in Oracle is extremely fast (milliseconds). The slow part is calling the LLM API which might be hundreds of milliseconds or more. So overall, the overhead of going from Python to Oracle and back is minimal relative to LLM time. Thus using Oracle as the vector store is quite viable in a Python app. One should use proper connection pooling for Oracle for efficiency (or persistent connections).
-
Oracle Machine Learning (OML) for Python: Oracle has an OML4Py library which allows you to run Python code close to the data. In an Autonomous Database, you could run a Python script inside the database environment (kind of like how you run R or use notebooks within Oracle). OML4Py 2.1 even introduces an
oml.Vector
type to work with vector data in DataFrame-like structures blogs.oracle.com. For instance, you could use
oml.push
to push a DataFrame to the database and have vector types handled for you. While advanced, it's good to know Oracle is working to make Python-Oracle integration seamless for vector data. For exam prep, basic cx_Oracle/oracledb usage is sufficient.
Practice Questions (RAG and Integration):
-
What are the two main stages of a Retrieval-Augmented Generation (RAG) pipeline?
-
Answer: Retrieval and Generation. In the retrieval stage, the system finds relevant data (using vector similarity search) to ground the response oneoracledeveloper.com. In the generation stage, an LLM (or similar model) produces the final answer using the retrieved data as context.
-
-
In PL/SQL, which package or function can you use to generate a text embedding for a user’s question as part of a RAG workflow?
-
Answer: You can use the
VECTOR_EMBEDDING
SQL function (via a SELECT INTO in PL/SQL) if you have an ONNX model loaded dineshbandelkar.com. For example:
SELECT VECTOR_EMBEDDING(MyModel USING :question AS data) INTO query_vec FROM DUAL;
Alternatively, you could call
DBMS_VECTOR.UTL_TO_EMBEDDING
if using a third-party API, but
VECTOR_EMBEDDING
is the straightforward way with an in-DB model.
-
-
How can a Python application query Oracle to get the top-K similar results for a query vector?
-
Answer: By using the Oracle Python driver (cx_Oracle/oracledb) to execute a SQL query with
ORDER BY VECTOR_DISTANCE(... ) FETCH FIRST K ROWS ONLY
. The application would first compute the query vector (e.g., via a model or API), then pass it to Oracle (perhaps as a string, or by binding each dimension) using the
to_vector
function in the SQL. Oracle will return the top K rows; with K=5, this yields the 5 most similar items.
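For instance, with K=5 (table and bind names are illustrative):

```sql
SELECT id, name
FROM   items
ORDER  BY VECTOR_DISTANCE(embedding, to_vector(:qv), COSINE)
FETCH FIRST 5 ROWS ONLY;
```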
-
6. Leveraging Related AI Capabilities (10%)
Oracle’s ecosystem offers additional features that complement AI Vector Search. These include optimized hardware (Exadata) for performance, data integration tools like GoldenGate for real-time updates, natural language query integration via Select AI, and data loading tools. This section gives an overview of these related capabilities and how they enhance vector search applications.
Exadata AI Storage and Architecture
Oracle Exadata is an engineered database machine that provides top-tier performance for Oracle Database workloads. For AI Vector Search, Exadata’s features can significantly accelerate similarity queries, especially for large datasets:
-
Exadata architecture: Exadata consists of scale-out database servers and scale-out intelligent storage servers, connected by a super-fast RDMA network oracle.com. It uses smart algorithms so that queries are processed at the storage tier whenever possible (this is known as Smart Scan). For example, storage nodes can filter rows and columns, apply predicates, and even do some JSON and text processing, sending back only the necessary data to the database servers. This massively parallel, intelligent storage architecture yields higher throughput and lower latency for data-intensive operations than a generic system oracle.com.
(In summary, Exadata is a cloud-enabled, scale-out architecture with high-performance database servers, intelligent storage with flash and persistent memory, and RDMA networking. It implements database-aware intelligence in storage, compute, and network to boost performance for all workloads oracle.com.)
-
Persistent memory & RDMA: Recent Exadata generations (X8M, X9M, X10M, etc.) include Persistent Memory (PMEM) modules in storage and use RoCE (RDMA over Converged Ethernet) for networking techstrongitsm.com oracle.com. This means certain read operations can bypass the OS and go directly over the network to memory on the storage servers, with extremely low latency. For vector search, if the vectors or index structures reside in PMEM on storage, retrieving them is very fast, and RDMA can quickly ship the data to the compute nodes.
-
Exadata for vector search: Oracle has introduced specific optimizations in Exadata system software for AI workloads:
-
It can offload vector searches to storage. For example, in Exadata X11M, Oracle reports that persistent vector index (IVF) searches are transparently offloaded to the intelligent storage, yielding up to 55% faster performance techstrongitsm.com. This means the distance calculations and cluster scanning for IVF can happen on storage CPUs, in parallel across all storage servers, rather than on the DB server alone.
-
In-memory HNSW index queries are also up to 43% faster on Exadata X11M techstrongitsm.com, likely due to faster memory and interconnect, and perhaps some parallelism or memory optimizations.
-
Exadata’s storage can perform more data filtering: one report mentions 4.7X more data filtering in storage servers and 32X faster queries for binary vector searches on Exadata with new optimizations techstrongitsm.com. This suggests that even for Hamming distance (binary vectors), Exadata storage offload gives big wins.
Bottom line: Exadata accelerates vector search both through raw hardware improvements and by pushing parts of the search down to the storage nodes. The result can be significantly faster response times for similarity queries, especially for IVF (disk-based) indexes where I/O is involved: a nearest-neighbor query using an IVF index on Exadata may read much less from disk thanks to the smart offload.
-
-
Use cases: Exadata AI Storage is ideal when you have very large embedding datasets (billions of vectors) and you need fast, consistent performance. It shines in scenarios like:
-
Real-time recommendations or personalization in large e-commerce with millions of products (vector search offloaded to Exadata storage can handle the scale).
-
Enterprise search across huge document archives.
-
Any AI application where the database is under heavy load – Exadata ensures that adding vector search doesn’t bog down other workloads, due to its scale-out nature.
It’s also beneficial if you plan to consolidate multiple workloads (transactional + AI analytics) on one platform, as Exadata is designed for mixed workloads and to avoid interference (e.g., it might run OLTP on DB nodes and do analytic vector scans on storage in parallel).
-
-
How to leverage: From a user perspective, leveraging Exadata is automatic: if you run your Oracle database on an Exadata system (on-premises, Cloud@Customer, or in Oracle Cloud), the optimizations happen under the hood, and you don’t need to change your SQL. The offload is generally applied whenever vector indexes exist; just ensure you are on a version that supports it (23c with matching Exadata system software).
-
Summary: Exadata is the “heavy iron” that can accelerate AI vector search by utilizing specialized hardware and offloading algorithms, resulting in up to 55% faster vector search and improved throughput techstrongitsm.com. It essentially turbo-charges the database for AI.
Oracle Exadata Key Points for Exam:
-
Exadata has intelligent storage offload – it can perform vector similarity computations on storage servers, reducing load on DB servers and improving speed futurumgroup.com techstrongitsm.com.
-
Exadata X8M and later generations (X9M, X10M) introduced persistent memory and RDMA over Converged Ethernet, which lower the latency of retrieving vectors.
-
The term **“Exadata Smart Scan”** refers to query processing performed in the storage tier; for AI Vector Search this extends to offloading vector distance computations and filtering to the storage servers.
Oracle GoldenGate for AI and Vector Data
Oracle GoldenGate is a real-time data replication platform. In the context of AI vector search, GoldenGate can continuously replicate and synchronize data (including tables with VECTOR columns) across databases or regions. This is useful for distributed AI processing – for example, feeding a dedicated read-optimized database for AI queries, or keeping an on-premises database and a cloud database in sync for hybrid workloads.
-
Real-time replication: GoldenGate captures changes (inserts/updates/deletes) from the source database and applies them to the target database in real time. GoldenGate 23c (sometimes called 23ai for its new features) fully supports the vector data type, so embeddings and indexes can be replicated just like other data.
-
RAG use-case: In a RAG pipeline, you might use GoldenGate to ensure that your knowledge base is always up-to-date. For instance, if your OLTP database is where documents are ingested and stored, you can use GoldenGate to stream those documents and their embeddings into a separate analytics database that handles the AI queries. This way, the LLM gets fresh data without impacting the transactional system. Oracle specifically notes that GoldenGate 23ai can “power retrieval-augmented generation (RAG) for generative AI applications” by providing real-time data sync.
-
Scalability: You could also use GoldenGate to shard vector data across multiple databases or to maintain a read replica in another region (for latency or DR). Since GoldenGate is heterogeneous, you could even feed non-Oracle systems if needed (though typically you’d keep it Oracle-to-Oracle to use vector search on both ends).
-
Implementation: Using GoldenGate for vectors doesn’t require special steps: simply define replication for the tables containing vectors. GoldenGate will handle the SecureFile LOB data that represents vectors. On the target, you’d rebuild or sync the vector indexes as needed (in practice GoldenGate replicates DML, which automatically maintains an IVF index; for HNSW indexes, you might schedule rebuilds during off hours).
-
Benefit: The key benefit is real-time, consistent data for AI. No staleness. If a document is updated, the embedding update flows through GoldenGate and your search results reflect it within seconds. This is critical for applications where data changes frequently.
Select AI for Natural Language Queries
Oracle Select AI is a feature (primarily in Autonomous Database) that integrates LLMs directly into SQL operations. It allows users to query the database using natural language prompts instead of SQL, and the database, via an AI engine, will convert that to SQL or execute it and return results. It can also leverage vector search under the hood for relevant data and even do RAG automatically.
-
Natural language to SQL: With Select AI, a user could issue `SELECT AI 'Show me the 5 most similar products to product ID 12345';` and behind the scenes the database interprets the request via a prompt and an LLM, translates it into a SQL query with `VECTOR_DISTANCE` etc., executes it, and can even explain it back. This greatly enhances user productivity and the accessibility of data.
-
Conversations and Chat: Select AI supports multi-turn conversations as well (so you can refine questions). It essentially brings a ChatGPT-like interface to your database, with the assurance that the answers come from your actual data.
-
Select AI with RAG: Importantly, Select AI has a mode to do RAG itself: “Select AI with Retrieval Augmented Generation (RAG) augments your natural language prompt by retrieving content from your specified vector store using semantic similarity search”. This means if you have a vector index (vector store) in the database, Select AI can use it to pull relevant data to feed the LLM, reducing hallucination and making answers based on your data. For example, a prompt “What does our policy say about remote work?” could trigger a vector search on the policies table for “remote work” embeddings, and that content is given to the LLM to formulate a precise answer.
-
AI Providers: Under the hood, Select AI can work with various AI providers (OpenAI, Cohere, etc.) configured in an AI profile, which DBAs manage via the `DBMS_CLOUD_AI` package. This abstraction means the database can seamlessly call out to an LLM as if it were an internal function.
-
Use in exam context: Recognize that Select AI is Oracle’s solution to integrate generative AI with database querying. It lets developers and analysts use natural language directly on data, and it leverages Oracle’s vector search and LLM integration to do so. It is particularly available in Oracle Autonomous Database (in the cloud).
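As a hedged sketch of how a DBA might wire this up on Autonomous Database (the provider, credential name, and object list below are illustrative assumptions, and the exact attributes vary by provider):

```sql
-- Hypothetical profile setup; the OPENAI_CRED credential must already exist.
BEGIN
  DBMS_CLOUD_AI.CREATE_PROFILE(
    profile_name => 'MY_AI_PROFILE',
    attributes   => '{"provider":        "openai",
                      "credential_name": "OPENAI_CRED",
                      "object_list":     [{"owner": "SALES", "name": "PRODUCTS"}]}'
  );
  -- Make the profile active for this session
  DBMS_CLOUD_AI.SET_PROFILE('MY_AI_PROFILE');
END;
/

-- Natural-language queries can then run against the profiled objects
SELECT AI showsql show me the 5 most similar products to product 12345;
```

The `showsql` action returns the generated SQL for inspection; other actions (such as `runsql` or `narrate`) execute the query or phrase the answer in natural language.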
Loading and Moving Vector Data (SQL*Loader and Data Pump)
Integrating vector search into enterprise workflows often requires moving large amounts of data in or out of the database. Oracle’s traditional tools have been updated to handle the new data types:
-
SQL*Loader: This high-speed bulk loading utility can now load data into VECTOR columns. As mentioned, it supports vectors in text form (e.g., a list of numbers in brackets) and in binary form (using a `.fvec` or similar binary representation). By using direct path load, you can insert millions of embeddings efficiently. For example, you might have an external file of 100k product descriptions and their 512-dim embeddings – a SQL*Loader control file can map the embedding field to the table’s VECTOR column and load all in one go. This is much faster than row-by-row inserts.
-
Oracle Data Pump: Data Pump (expdp/impdp) fully supports the VECTOR type for export and import. You can export a table containing vectors, and when importing, it will recreate those vectors on the target exactly. This is essential for backup, clone, or migration of a vector search application. For instance, if you build a vector index in a development environment and want to move the whole schema to production, a Data Pump export/import will bring over the tables with vector columns (and you can either rebuild indexes or export the index definitions as well). Data Pump makes moving the AI-enhanced schema no different than any other Oracle schema.
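As a hedged illustration of that round trip (the schema, directory object, and dump file names are assumptions), the export and import could be as simple as:

```
# Export the schema containing VECTOR columns from the source database
expdp sales DIRECTORY=data_pump_dir DUMPFILE=sales_ai.dmp SCHEMAS=sales LOGFILE=exp_sales.log

# Import it into the target database; vector columns arrive intact
impdp sales DIRECTORY=data_pump_dir DUMPFILE=sales_ai.dmp SCHEMAS=sales LOGFILE=imp_sales.log
```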
-
External Tables: Another way to bring data in is using External Tables with the ORACLE_DATAPUMP or ORACLE_LOADER access drivers. While not explicitly in the exam, it’s worth noting you could query an external data source that contains vector data (perhaps in a CSV or Parquet file) and use `TO_VECTOR` to cast and insert into your persistent table.
-
ETL Tools: If using Oracle GoldenGate, as discussed, that’s more for ongoing sync. For one-time large loads, SQL*Loader and Data Pump are go-to tools.
-
Tip: Always verify after loading that the vectors are as expected (e.g., run a sample query or check the dimension counts). If using text-format loads, watch out for number formatting (use a period for the decimal point, no thousands separators, etc.). If using binary, ensure the endianness is correct.
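To make the SQL*Loader path above concrete, here is a hedged sketch of a control file (the data file name, delimiter, table, and column names are all assumptions) for loading text-form vectors:

```
-- Hypothetical control file: loads rows of the form
--   101|Road bike|[0.12,0.98,...]
-- into a table whose embedding column is of type VECTOR.
LOAD DATA
INFILE 'products.csv'
APPEND
INTO TABLE products
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
  product_id,
  name       CHAR(200),
  embedding  CHAR(32000)   -- text vector "[...]" converted on load
)
```

It would be invoked with something like `sqlldr scott control=products.ctl direct=true`, where `direct=true` enables the direct path load mentioned earlier.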
Practice Questions (Related Capabilities):
-
Your AI application needs to handle very large vector searches quickly. How can Oracle Exadata improve performance for such workloads?
-
Answer: By running the Oracle database on Exadata, you gain smart storage offload and high-speed hardware that can greatly accelerate vector searches. Exadata’s storage servers can independently execute parts of the vector similarity search (especially for IVF indexes) and send back only results, yielding up to ~55% faster persistent vector index searches and ~43% faster in-memory index searches on the latest Exadata X11M. The RDMA network and persistent memory in Exadata also reduce latency for these operations.
-
-
How can you keep two Oracle databases in sync with vector data so that both have the latest embeddings for queries?
-
Answer: Use Oracle GoldenGate 23c/23ai for real-time data replication. GoldenGate will stream the changes (inserts/updates/deletes of vectors) from the source to the target continuously, ensuring both databases have current data. This can power distributed RAG setups or active-active systems where multiple databases need the same up-to-date vector data.
-
-
What is Oracle’s Select AI feature and how does it relate to vector search?
-
Answer: Select AI allows you to use natural language prompts directly in an Oracle SQL context. It translates your request into SQL or PL/SQL, possibly using LLMs. When combined with vector search (Select AI with RAG), it can automatically perform a vector similarity search on your data to gather relevant context and then provide a natural language answer. In short, it’s an Autonomous Database feature that brings conversational AI querying (with vector-based retrieval under the hood) to Oracle.
-
-
Which tool would you use to quickly load a million pre-computed vectors from a CSV file into Oracle?
-
Answer: SQL*Loader. It’s designed for bulk loading and supports the VECTOR column type (by parsing a list of numbers or reading binary files). This will be much faster than inserting via individual SQL statements. After loading, you could then create a vector index on the data.
-
-
True or False: Oracle Data Pump cannot export tables with VECTOR types, so you must omit those columns during export.
-
Answer: False. Oracle Data Pump fully supports the VECTOR data type. You can export and import tables with vector columns just like any other table. The vectors will be preserved on import. No need to omit them – they are treated as LOB data under the covers, which Data Pump handles normally.
-
Sources: The information in this guide is based on Oracle’s official documentation and recent updates for Oracle Database 23c and related tools, including the Oracle AI Vector Search User’s Guide, Oracle blogs, and Oracle Cloud documentation for features like Select AI and Exadata enhancements, among others. Each topic is cross-referenced with authoritative Oracle sources to ensure accuracy and relevancy for the 1Z0-184-25 Oracle AI Vector Search Professional exam. Good luck with your certification preparation!