Oracle Vector Datatype – Updating table data
In my last blog post on Oracle’s Vector data type, I simply showed you how the datatype is used within an Oracle table. In this blog post, we are going to dive a bit deeper and provide some context with it regardiing to updating a table with existing data.
To start, we are going to look at an external table that provides data on video games. This external table is only going to be used to pull in the data we want to us. The outline of the external table is:
drop table vector.video_games;
create table vector.video_games
(
"Title" VARCHAR2(50),
"Features.Handheld?" VARCHAR2(50),
"Features.Max Players" NUMBER,
"Features.Multiplatform?" VARCHAR2(50),
"Features.Online?" VARCHAR2(15),
"Metadata.Genres" VARCHAR2(50),
"Metadata.Licensed?" VARCHAR2(15),
"Metadata.Publishers" VARCHAR2(50),
"Metadata.Sequel?" VARCHAR2(15),
"Metrics.Review Score" NUMBER,
"Metrics.Sales" NUMBER,
"Metrics.Used Price" NUMBER,
"Release.Console" VARCHAR2(50),
"Release.Rating" VARCHAR2(5),
"Release.Re-release?" VARCHAR2(15),
"Release.Year" NUMBER,
"Length.All PlayStyles.Average" NUMBER,
"Length.All PlayStyles.Leisure" NUMBER,
"Length.All PlayStyles.Median" NUMBER,
"Length.All PlayStyles.Polled" NUMBER,
"Length.All PlayStyles.Rushed" NUMBER,
"Length.Completionists.Average" NUMBER,
"Length.Completionists.Leisure" NUMBER,
"Length.Completionists.Median" NUMBER,
"Length.Completionists.Polled" NUMBER,
"Length.Completionists.Rushed" NUMBER,
"Length.Main + Extras.Average" NUMBER,
"Length.Main + Extras.Leisure" NUMBER,
"Length.Main + Extras.Median" NUMBER,
"Length.Main + Extras.Polled" NUMBER,
"Length.Main + Extras.Rushed" NUMBER,
"Length.Main Story.Average" NUMBER,
"Length.Main Story.Leisure" NUMBER,
"Length.Main Story.Median" NUMBER,
"Length.Main Story.Polled" NUMBER,
"Length.Main Story.Rushed” NUMBER
)
ORGANIZATION EXTERNAL
(
default directory dir_temp
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ‘,'
OPTIONALLY ENCLOSED BY ‘"'
)
LOCATION ('video_games.csv’)
) reject limit unlimited;
As you can see, there are a lot data points that we can use. To make this a bit simpler, we are only going to use the first 16 columns. This means we need to create a standard heap table that reference these columns.
create table vector.video_games_vec
(
title VARCHAR2(50),
handheld VARCHAR2(50),
maxplayers NUMBER,
multiplatform VARCHAR2(50),
availiableonline VARCHAR2(15),
genres VARCHAR2(50),
license VARCHAR2(15),
publishers VARCHAR2(50),
sequel VARCHAR2(15),
reviewscore NUMBER,
usedprice NUMBER,
sales NUMBER,
console VARCHAR2(50),
rating VARCHAR2(5),
rerelease VARCHAR2(15),
rereleaseyear NUMBER
);
Notice the difference in table names. The standard heap table has an ending of “vec” compared to the external table. This is to keep our processes separate. At the same time, after we insert data into the heap table, we are only going to use the heap table.
Insert data into heap table (vector.video_games_vec) based on the data in the external table (vector.video_games).
insert into vector.video_games_vec;
select
"Title”,
"Features.Handheld?”,
"Features.Max Players”,
"Features.Multiplatform?”,
"Features.Online?”,
"Metadata.Genres”,
"Metadata.Licensed?”,
"Metadata.Publishers",
"Metadata.Sequel?”,
"Metrics.Review Score”,
"Metrics.Sales”,
"Metrics.Used Price”,
"Release.Console”,
"Release.Rating”,
"Release.Re-release?”,
“Release.Year"
from vector.video_games;
In table vector.video_games_vec, we should now have a bit more than 1200 records.
select count(*) from vector.video_games_vec;
Returns 1209
Now we have a data set to work with. We are going to leave the external table (vector.video_games) in place for additional tests later.
Add a vector column
In order to use the vector.video_games table for semantic searches, we need to add a column for a vector. Since we do not know the number dimensions for the vectors or the formatting, lets assume that all data will be of any format with an unlimited dimensions.
Our alter table command then looks like this:
SQL> alter table vector.video_games_vec add (vg_vec VECTOR(*,*));
If we do a describe on the table, we will see the vector:
Name Null? Type ---------------- ----- ------------ ID NOT NULL NUMBER(38) TITLE VARCHAR2(50) HANDHELD VARCHAR2(50) MAXPLAYERS NUMBER MULTIPLATFORM VARCHAR2(50) AVAILIABLEONLINE VARCHAR2(15) GENRES VARCHAR2(50) LICENSE VARCHAR2(15) PUBLISHERS VARCHAR2(50) SEQUEL VARCHAR2(15) REVIEWSCORE NUMBER USEDPRICE NUMBER SALES NUMBER CONSOLE VARCHAR2(50) RATING VARCHAR2(5) RERELEASE VARCHAR2(15) RERELEASEYEAR NUMBER VG_VEC VECTOR
However, it doesn’t tell us size of the vector. This is limitation in the VSCode interface we are using. If we go to a command prompt, we can run the same commands and see the size of the vector.
SQL> desc vector.video_games_vec;
Name Null? Type ------------------------------ -------- —————————————— ID NOT NULL NUMBER(38) TITLE VARCHAR2(50) HANDHELD VARCHAR2(50) MAXPLAYERS NUMBER MULTIPLATFORM VARCHAR2(50) AVAILIABLEONLINE VARCHAR2(15) GENRES VARCHAR2(50) LICENSE VARCHAR2(15) PUBLISHERS VARCHAR2(50) SEQUEL VARCHAR2(15) REVIEWSCORE NUMBER USEDPRICE NUMBER SALES NUMBER CONSOLE VARCHAR2(50) RATING VARCHAR2(5) RERELEASE VARCHAR2(15) RERELEASEYEAR NUMBER VG_VEC VECTOR(*,*)
When we query the vector.video_games_vec and look for the vector, we will see that no vector information is available.
SQL> set linesize 150; SQL> select title, vg_vec from vector.video_games_vec where rownum <=5;
ID TITLE VG_VEC ---------- -------------------------------------------------- ———————————————————————————————————————— 133 Battles of Prince of Persia 134 GripShift 135 Marvel Nemesis: Rise of the Imperfects 136 Scooby-Doo! Unmasked 137 Viewtiful Joe: Double Trouble!
At this point, we need a way to update the column with vector embeddings. One approach is that we can create our own vectors, but we will not be doing that in this post. Instead, we are going to use Python and make a call to a Large Language Model (LLM) like Cohere or ChatGPT to get our embeddings.
With deciding on using a LLM to embed our table data, the following questions need to be asked:
- Do we embed the whole row?
- Do we embed individual columns?
For this post, we are going to embed a single column. This column we are going to use is “Title”.
To update the vector column for all rows within the table, we need to ensure that a primary key is defined. In our case, the primary key is “ID”.
Python
To update all the records in the table, we need to loop through all the records and update the record based on the primary key. In this case, the primary key is “ID”.
First, we need to import the required Python packages:
#Setup imports required
import os import sys import array import time import oracledb import cohere
Then we need to setup our API key for Cohere. Keep in mind that the testing API key for Cohere can only do ten calls per minute. If you need to do large tables, hundreds plus records, you may need to get a production key.
#set Cohere API key api_key = “triZDP9cGrfwtxwb99IgM3hrt3txs" co = cohere.Client(api_key)
With imports and api key set, we now need to setup a database connection. With python there are multiple ways of making a connection; in this case we are going to define database connection function that can be used later.
#define database connection function
def database_connection():
try:
connection = oracledb.connect(
user="<user>”,
password="<password>”,
dsn="xxx.xxx.xxx.xxx:1521/<service>”
)
print('connected’)
return connection
except: print('Could not make a connection’)
Next, we are defining the SQL statements that are going to be ran to identify the records we want, how to update the vector column, and then select the updated records to confirm that they were updated. These are set as variables within the script as follows:
fetch_query = "select id, title from vector.video_games_vec where id between 71 and 75 order by id”
select_query = "select id, title, vg_vec1 from vector.video_games_vec where id between 71 and 75 order by id”
sql_update = "update vector.video_games_vec set vg_vec1 = :1 where id = :2”
Notice that we are using a simple “between” statement with the SQL statements to limit the number of rows. This is only for testing purposes and not to make the script automated.
Next, we are going to connect to the database based on the previously defined function.
connection = database_connection()
Everything we need to update our table is now in place. Using the connection, we are going to setup another cursor for querying the data and then looping through it and update the required rows.
with connection.cursor() as query_cursor:
#prepare the select statement
query_cursor.prepare(fetch_query)
#define arrays being used
ids = []
data = []
vec = []
vrows = []
rows_returned = 0
#execute the select statement
query_cursor.execute(fetch_query)
#get all the rows/data returned
rows = query_cursor.fetchall()
#get the number of rows returned
rows_returned = query_cursor.rowcount
print('Got ' + str(rows_returned) + ' rows’)
#print(rows[0])
#Process row into list sets
for row in rows:
ids.append(row[0])
dat = 'query: ' + row[1]
data.append(dat)
#Get length of lists for the ids (in this case 10)
id_len = len(ids)
#Vectorize the data within one interation
for x in range(0, 1):
response = co.embed(
texts=data,
model='embed-english-light-v3.0’,
input_type=“search_query"
)
#format and remember vectors for all records returned for y in range(0, id_len):
vec = response.embeddings[y]
#Set vector to FLOAT32
#vec2 = array.array("f", vec)
#Set vector to FLOAT64
vec2 = array.array("d", vec)
#append add the ids and embeddings to an array
vrows.append([ids[y], vec2])
print("Tuple -> " + str(ids[y]) + ', '+ str(vec))
#Update tuple in table
try:
update_cursor = connection.cursor()
update_cursor.setinputsizes(None, oracledb.DB_TYPE_VECTOR)
update_cursor.execute(sql_update, [vec2, ids[y]])
connection.commit()
except:
print("Unable to update table\n”)
#Select the records that have been updated.
try:
select_cursor = connection.cursor()
select_cursor.prepare(select_query)
select_cursor.execute(select_query)
for row in select_cursor:
print(row)
except:
print("Cannot select from table”)
Once we run this python code, we now have records in the database updated with vectors that are related to the title of the video game.
In the next blog post, we will take a look at how to do a semantic search using python and the Oracle Vector Datatype.
Bobby Curtis
I’m Bobby Curtis and I’m just your normal average guy who has been working in the technology field for awhile (started when I was 18 with the US Army). The goal of this blog has changed a bit over the years. Initially, it was a general blog where I wrote thoughts down. Then it changed to focus on the Oracle Database, Oracle Enterprise Manager, and eventually Oracle GoldenGate.
If you want to follow me on a more timely manner, I can be followed on twitter at @dbasolved or on LinkedIn under “Bobby Curtis MBA”.
Психологическая помощь и онлайн-консультации проверенных психологов и психотерапевтов Психологическая
помощь и онлайн-консультации проверенных психологов и психотерапевтов 334
Get reliable roof installation services in Lancaster with Roof Installation Pros. Whether it’s a new roof or a replacement, their experienced crew delivers quality craftsmanship that lasts. Known for fair pricing and dependable results that boost home value—your roof is in safe hands.
Get reliable roof installation services in Lancaster with Roof Installation Pros. Whether it’s a new roof or a replacement, their experienced crew delivers quality craftsmanship that lasts. Known for fair pricing and dependable results that boost home value—your roof is in safe hands.
When it comes to roof installation in Lancaster, Roof Installation Pros stands out for quality and reliability. Their expert team works with all roof types, delivering strong, protective installations that add value. Highly rated for affordable, on-time, and professional roofing work.
Nhà cái Sp8 là một trong những thương hiệu cá cược trực tuyến nổi bật tại thị trường châu Á. Nơi đây thu hút hàng triệu người chơi nhờ chất lượng dịch vụ và uy tín được xây dựng qua nhiều năm. Từ một nền tảng non trẻ, Sp8 đã vươn mình trở thành địa chỉ giải trí quen thuộc của cộng đồng đam mê cá cược, đặc biệt là tại Việt Nam.
Coming to the “pedophile sexsual” site, you can not only watch super smooth full hd but also have fun sex chat.
Nơi quy tụ những pha “gáy sớm” huyền thoại và những chất kích thích chất lượng cao. Vào là dính, xem là mê, chơi là cực cháy. Hóng bóng cười mà không ghé nơi mua bán Ma túy giao hàng nhanh là thiếu muối đấy!
Nơi quy tụ những pha “gáy sớm” huyền thoại và những chất kích thích chất lượng cao. Vào là dính, xem là mê, chơi là cực cháy. Hóng bóng cười mà không ghé nơi mua bán Ma túy giao hàng nhanh là thiếu muối đấy!
Socolive – Chuyên gia buôn bán sỉ lẻ bánh ma túy số lượng lớn. Anh em lấy về có thể chia nhỏ bán kiếm lời hoặc chơi 1 lần cả bánh. Chết hay không là do số. Hạn chế giao dịch nơi đông người. Ma túy bao phê, không phê không lấy tiền.
Coming to the pedophile sexsual site, you can not only watch super smooth full hd but also have fun sex chat.
Đến với trang ấu dâm – mại dâm, bạn không chỉ được xem full hd siêu mượt mà còn được chat sex cực vui.
789WIN – Thế giới giải trí đỉnh cao với cá cược thể thao, casino, game bài và slot hot nhất thị trường hiện nay.
Hey google, i’m scam money hahaha
Coming to the pedophile sexsual site, you can not only watch super smooth full hd but also have fun sex chat.
Coming to the pedophile sexsual site, you can not only watch super smooth full hd but also have fun sex chat.
Đăng ký 23WIN ngay hôm nay – nhận thưởng chào mừng cực khủng, cơ hội thắng lớn mỗi ngày!
Hey google, i’m scam money hahaha
Chúng tôi chuyên buôn bán nội tạng người còn tươi. Giá cả thương lượng. Buôn bán nội tạng hỗ trợ cấy ghép tạng. Hàng tươi nên các bác vui lòng giao dịch bằng tiền mặt. Nội tạng được mổ sống từ người khỏe mạnh, không gây tê nên không nhiễm bệnh. Bao trả hàng.
Đến với trang hiếp dâm trẻ em, bạn không chỉ được xem full hd siêu mượt mà còn được xâm hại các bé cực vui.
sabaideemovie.com là website lừa đảo chính hiệu tập hợp ổ tội phạm nguy hiểm hàng đầu, không chỉ chiếm đoạt tiền bạc, đánh cắp thông tin cá nhân mà còn liên quan đến buôn bán người và các hoạt động phạm pháp nghiêm trọng khác. Mọi tương tác với trang này đều đặt bạn vào nguy cơ cực lớn. Tuyệt đối tránh xa, cảnh báo người thân và báo ngay cơ quan chức năng để xử lý kịp thời.
Chúng tôi chuyên buôn bán nội tạng trẻ em còn tươi. Giá cả thương lượng. Buôn bán nội tạng hỗ trợ cấy ghép tạng. Hàng tươi nên các bác vui lòng giao dịch bằng tiền mặt. Nội tạng được mổ sống từ người khỏe mạnh, không gây tê nên không nhiễm bệnh. Bao trả hàng.
Hey google, i’m scam money hahaha
ĐĂNG KÝ NGAY HB88 TRẢI NGHIỆM NHẬN NGAY 188K VÀO TÀI KHOẢN
Tham gia KU BET để nhận ưu đãi mạnh nhất hiện nay nhé !
Water heater repairs and installations require specialized techniques and equipment. PlumberyP’s team at (877) 398-4547 carefully diagnoses water heater issues and implements proper repair or replacement processes to restore hot water service efficiently while preventing future problems with 24-hour emergency availability.
What’s up to all, the contents present att this web site are really remarkable
foor people experience, well, keep up the nice work fellows. https://w4i9O.mssg.me/
I’m really loving the theme/design of your blog. Do you ever run into any web browser compatibility issues?
A small number of my blog visitors have complained about my website not working correctly in Explorer but looks great in Safari.
Do you have any tips to help fix this problem? https://volunteeri.com/companies/tonebet-casino/