Setup Guide: pgvector with Docker

This content originally appeared on DEV Community and was authored by yukaty

Have you ever wondered how Netflix suggests movies you might like, or how Spotify creates personalized playlists? These features often use something called "vector similarity search", a powerful way to find related content. In this guide, we'll set up a PostgreSQL database with the pgvector extension using Docker so you can build similar features.

Contents

  • What is Vector Search?
  • Prerequisites
  • Step-by-Step Setup
  • Using Your Vector Database
  • Troubleshooting
  • Next Steps

What is Vector Search?

When AI analyzes content (text, images, or products), it creates a special list of numbers (called a "vector" or "embedding") that represents that item's characteristics. Similar items will have similar numbers. pgvector helps us store and search these numbers efficiently.
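
To make "similar numbers" concrete, here is a minimal Python sketch using made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions). The item names and values are invented purely for illustration; Euclidean distance and cosine similarity are two standard ways of comparing vectors.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.dist(a, b)

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for illustration only.
action_movie = [0.9, 0.1, 0.0]
thriller = [0.8, 0.2, 0.1]
cooking_show = [0.0, 0.1, 0.9]

print(l2_distance(action_movie, thriller))       # small distance
print(l2_distance(action_movie, cooking_show))   # large distance
print(cosine_similarity(action_movie, thriller))
```

The two movies end up close together while the cooking show lands far away, which is exactly the property a similarity search relies on.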

If you're not familiar with Machine Learning, don't worry! You can easily obtain these embeddings from popular AI APIs like OpenAI's API, even without deep AI knowledge. These embeddings are the building blocks for creating recommendation engines and similarity search features.

Let's get started! 🚀

Prerequisites

Make sure you have Docker Desktop installed on your computer.

Step-by-Step Setup

1. Create docker-compose.yml

Create a docker-compose.yml file in your project root to define the PostgreSQL container.

services:
  db:
    image: pgvector/pgvector:pg17 # PostgreSQL with pgvector support
    container_name: pgvector-db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: example_db
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./postgres/schema.sql:/docker-entrypoint-initdb.d/schema.sql

volumes:
  pgdata: # Stores data outside the container to ensure persistence
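
If you later connect from application code, the environment values above map onto a connection URL. The sketch below is one way to assemble it in Python; the helper name build_dsn is hypothetical, and the host and port assume you connect from the machine running Docker through the published port.

```python
from urllib.parse import quote

def build_dsn(user, password, host, port, database):
    """Build a postgresql:// URL, percent-encoding the user and password."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )

# Values copied from docker-compose.yml; localhost:5432 is an assumption
# that holds when connecting from the Docker host itself.
dsn = build_dsn("postgres", "password", "localhost", 5432, "example_db")
print(dsn)
```

Percent-encoding matters once the password contains characters such as @ or /, which would otherwise break the URL.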

2. Define Database Schema (schema.sql)

Create a postgres directory in the project root, then create a schema.sql file inside it (postgres/schema.sql, the path mounted in docker-compose.yml) to define your initial schema. This example schema enables the pgvector extension and creates a table for storing items with vector embeddings.

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create example table
CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    metadata JSONB,
    embedding vector(1536) -- 1536-dimensional embedding; match your model's output size
);
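
Because the column is declared as vector(1536), every stored embedding must have exactly that many values. The following Python sketch checks the length and produces the bracketed text form that pgvector accepts as input; the helper name to_vector_literal and the constant are assumptions made for illustration, not part of pgvector.

```python
EMBEDDING_DIM = 1536  # must match vector(1536) in schema.sql

def to_vector_literal(embedding, dim=EMBEDDING_DIM):
    """Return pgvector's text form, e.g. '[0.1,0.2,0.3]', after a length check."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

# A 3-dimensional toy vector, with dim overridden just for the demo.
print(to_vector_literal([0.1, 0.2, 0.3], dim=3))  # [0.1,0.2,0.3]
```

Failing early in application code tends to give a clearer error than letting the database reject a mismatched vector.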

3. Start Docker Compose

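Run Docker Compose to pull the image and start the PostgreSQL container with pgvector. The command runs in the foreground, so keep this terminal open and use a second one for the next step.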

docker compose up --build

4. Verify the Database and Extensions

Once the container is running, connect to PostgreSQL to verify the setup.

docker exec -it pgvector-db psql -U postgres -d example_db

In the PostgreSQL shell, run:

-- Check installed extensions
\dx

-- Check if your table exists
\dt

Using Your Vector Database

Here's a simple example of how to find similar items:

-- Find items similar to a specific vector
SELECT id, name, metadata
FROM items
ORDER BY embedding <-> '[0.1, 0.2, ...]'::vector
LIMIT 5;

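Replace [0.1, 0.2, ...] with your actual vector from an AI service like OpenAI. The vector must have the same number of dimensions as the embedding column (1536 in this schema), or the query will fail.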
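
To see what the query computes, here is a rough plain-Python equivalent. In pgvector, <-> is the Euclidean (L2) distance operator, so ORDER BY ... LIMIT 5 returns the five rows closest to the query vector. The rows and the function name nearest_items below are hypothetical, and the vectors are shortened to two dimensions for readability.

```python
import math

def nearest_items(items, query, limit=5):
    """Return item names ordered by L2 distance to query, closest first."""
    ranked = sorted(items, key=lambda item: math.dist(item["embedding"], query))
    return [item["name"] for item in ranked[:limit]]

# Hypothetical rows; real embeddings would have 1536 dimensions.
items = [
    {"name": "space opera", "embedding": [0.9, 0.1]},
    {"name": "heist thriller", "embedding": [0.7, 0.3]},
    {"name": "baking contest", "embedding": [0.1, 0.9]},
]

print(nearest_items(items, query=[1.0, 0.0], limit=2))
# ['space opera', 'heist thriller']
```

Like this sketch, the SQL query compares the query vector against every row unless an index exists; pgvector offers HNSW and IVFFlat indexes for larger tables.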

Troubleshooting

Error: Port 5432 already in use

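Change the host port (the left-hand number of the mapping) in docker-compose.yml, for example "5433:5432", or use another free port. The container still listens on 5432 internally, so only clients connecting from your machine need the new port.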
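
If you are unsure which ports are free, a quick check like the following Python sketch can help. It simply tries to open a TCP connection on localhost; the function name port_in_use is made up for this example, and the check only detects services that are actively accepting connections.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0

for candidate in (5432, 5433):
    status = "in use" if port_in_use(candidate) else "free"
    print(f"port {candidate}: {status}")
```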

Can't connect to database

Check if the container is up.

  docker ps

Database not initializing properly

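Scripts in /docker-entrypoint-initdb.d run only when the data directory is empty, so changes to schema.sql are ignored once the pgdata volume exists. Remove the volume and restart. Note that this deletes all data stored in the database.

  docker compose down -v    # Remove existing volume
  docker compose up --build # Start fresh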

No idea what's wrong

Check the container logs.

  docker compose logs db

Next Steps

Now that your vector database is set up, you can:

  • Generate embeddings using AI services like OpenAI
  • Store your data with its embeddings
  • Build search features that find similar items
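
As a starting point for the "store your data" step, the sketch below prepares a parameterized INSERT for the items table. It assumes a driver that uses %s placeholders (psycopg does), and it only builds the statement and parameters without connecting to anything; build_insert_params is a hypothetical helper.

```python
import json

# Placeholders use the %s style; the casts tell PostgreSQL how to
# interpret the text parameters.
INSERT_SQL = (
    "INSERT INTO items (name, metadata, embedding) "
    "VALUES (%s, %s::jsonb, %s::vector)"
)

def build_insert_params(name, metadata, embedding):
    """Return the parameter tuple matching INSERT_SQL's three placeholders."""
    vector_text = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    return (name, json.dumps(metadata), vector_text)

# Hypothetical item; a real embedding needs 1536 values to fit the column.
params = build_insert_params("example item", {"category": "demo"}, [0.1, 0.2, 0.3])
print(INSERT_SQL)
print(params)
```

With a driver in hand, you would pass INSERT_SQL and params to its execute call. Keeping values out of the SQL string avoids quoting problems and SQL injection.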

Resources

Spot any mistakes or have a better way? Please leave a comment below! 🙌


yukaty | Sciencx (2024-11-05T21:03:40+00:00) Setup Guide: pgvector with Docker. Retrieved from https://www.scien.cx/2024/11/05/setup-guide-pgvector-with-docker/