PostgreSQL integration with AI (Vector Search in PostgreSQL)
Based on Ignite 20225 – LAB515 Build Advanced Ai Agents with PostgreSQL.
Source Code: https://github.com/MariuszFerdyn/ignite25-LAB515-build-advanced-ai-agents-with-postgresql.
Transcript:
Source code: https://github.com/MariuszFerdyn/ignite25-LAB515-build-advanced-ai-agents-with-postgresql
— ============================================================================
— PART 1: Initialize Database and Verify Data
— ============================================================================— Load initial dataset from SQL script
\i ./Scripts/initialize_dataset.sql;— Enable auto-formatting for better readability in psql
\x auto— Verify that data was loaded correctly – show first 5 case names
SELECT name FROM cases LIMIT 5;— ============================================================================
— PART 2: Install and Configure Azure AI Extension
— ============================================================================— Check which Azure extensions are available in the server
SHOW azure.extensions;— Install the azure_ai extension to enable Azure OpenAI integration
— This extension provides functions to interact with Azure OpenAI Service
CREATE EXTENSION IF NOT EXISTS azure_ai;— Switch to PowerShell context to load environment variables
cd C:\Lab\Scripts\
.\get_env.ps1— Configure Azure OpenAI endpoint URL
— Replace {AZURE_OPENAI_ENDPOINT} with your actual endpoint from Azure Portal
SELECT azure_ai.set_setting(‘azure_openai.endpoint’, ‘{AZURE_OPENAI_ENDPOINT}’);— Configure Azure OpenAI API key for authentication
— Replace {AZURE_OPENAI_KEY} with your actual API key from Azure Portal
SELECT azure_ai.set_setting(‘azure_openai.subscription_key’, ‘{AZURE_OPENAI_KEY}’);— Verify the endpoint configuration was saved correctly
SELECT azure_ai.get_setting(‘azure_openai.endpoint’);— Verify the API key configuration was saved correctly
SELECT azure_ai.get_setting(‘azure_openai.subscription_key’);— ============================================================================
— PART 3: Test Embeddings Generation
— ============================================================================— Test embeddings creation with sample text
— text-embedding-3-small is the Azure OpenAI embedding model
— Show only first 100 characters of the resulting vector for preview
SELECT LEFT(azure_openai.create_embeddings(‘text-embedding-3-small’, ‘Sample text for PostgreSQL Lab’)::text, 100) AS vector_preview;— ============================================================================
— PART 4: Traditional Keyword Search (for comparison)
— ============================================================================— Perform traditional keyword-based search using ILIKE pattern matching
— This will miss semantically similar content that uses different words
SELECT id, name, opinion
FROM cases
WHERE opinion ILIKE ‘%Water leaking into the apartment from the floor above’;— ============================================================================
— PART 5: Prepare Table for Vector Search
— ============================================================================— Install pgvector extension to support vector data type and operations
CREATE EXTENSION IF NOT EXISTS vector;— Add a new column to store embeddings (1536 dimensions for text-embedding-3-small)
ALTER TABLE cases ADD COLUMN opinions_vector vector(1536);— Generate embeddings for all existing cases
— Concatenate name and first 8000 chars of opinion for better context
— max_attempts and retry_delay_ms handle API rate limiting gracefully
UPDATE cases
SET opinions_vector = azure_openai.create_embeddings(‘text-embedding-3-small’, name || LEFT(opinion, 8000), max_attempts => 5, retry_delay_ms => 500)::vector
WHERE opinions_vector IS NULL;— ============================================================================
— PART 6: Create Vector Index for Efficient Similarity Search
— ============================================================================— Install pg_diskann extension for high-performance vector indexing
CREATE EXTENSION IF NOT EXISTS pg_diskann;— Create DiskANN index for fast cosine similarity searches
— This dramatically improves performance for large datasets
CREATE INDEX cases_cosine_diskann ON cases USING diskann(opinions_vector vector_cosine_ops);— Verify embeddings were created – show one example vector
SELECT opinions_vector FROM cases LIMIT 1;— ============================================================================
— PART 7: Semantic Search Using Vector Similarity
— ============================================================================— Perform semantic search using cosine similarity (<=> operator)
— Find the 10 most semantically similar cases to the query
— This finds similar content even if exact words don’t match
SELECT
id, name
FROM
cases
ORDER BY opinions_vector <=> azure_openai.create_embeddings(‘text-embedding-3-small’, ‘Water leaking into the apartment from the floor above.’)::vector
LIMIT 10;— Get the most similar case with full opinion text
— Shows the best semantic match to the query
SELECT
id, opinion
FROM cases
ORDER BY opinions_vector <=> azure_openai.create_embeddings(‘text-embedding-3-small’, ‘Water leaking into the apartment from the floor above.’)::vector
LIMIT 1;— ============================================================================
— PART 8: Compare Semantic Search vs Keyword Search
— ============================================================================— Re-run the traditional keyword search for comparison
— Notice how semantic search (above) may find more relevant results
— that don’t contain the exact phrase
SELECT id, name, opinion
FROM cases
WHERE opinion ILIKE ‘%Water leaking into the apartment from the floor above’;
Video: