This repository provides a Docker image for PostgreSQL, pre-configured with powerful extensions tailored for data science, analytics, and AI/ML workloads.
The image is built in two stages:
- Building PostgreSQL extensions (e.g., `pg_parquet` from CrunchyData).
- Final PostgreSQL image with compiled extensions and additional PostgreSQL packages.
PostgreSQL is a powerful data processing engine, and with the right extensions, it can serve as a feature store for AI/ML models, an analytics database, and a scalable data warehouse.
This image includes:
- Vector search for AI applications (`pgvector`)
- Time-series analysis for real-time data (`TimescaleDB`)
- Efficient JSON querying for structured/unstructured data (`jsquery`)
- Partitioning for large datasets (`pg_partman`)
- Query optimization and hinting (`pg_hint_plan`)
- Advanced auditing for compliance/security (`pgaudit`)
- Job scheduling inside PostgreSQL (`pg_cron`)
- Graph database support for network analysis (`AGE`)
- Event queueing system for real-time data streaming (`pgq3`)
- Advanced regex support with Perl-compatible regex (`pgpcre`)
- Built-in Parquet file format support (`pg_parquet`)
⚠️ Note: Building `pg_parquet` is resource-intensive and may take a long time.
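To see which extensions a running container actually offers, and under which names, one option is to query PostgreSQL's `pg_available_extensions` catalog view. The sketch below assumes a container that is already running; the function names are hypothetical helpers and not part of this image.

```shell
# Hypothetical helper: print the SQL that lists every installable extension.
available_extensions_sql() {
  printf '%s\n' "SELECT name, default_version FROM pg_available_extensions ORDER BY name;"
}

# Hypothetical helper: run that query inside a running container.
# Arguments: $1 = container name, $2 = database user, $3 = database name.
# psql flags: -A unaligned output, -t rows only, -c run a single command.
list_available_extensions() {
  docker exec "$1" psql -U "$2" -d "$3" -At -c "$(available_extensions_sql)"
}
```

For a container named `postgres` with user `admin` and database `mydb`, the call would be `list_available_extensions postgres admin mydb`. Each output line has the form `name|default_version`.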
- PostgreSQL
- Optimized for AI, data science, and analytics
- Pre-installed PostgreSQL extensions:
  - `pg_parquet` (Parquet file format support)
  - `postgresql-17-age` (Graph database support)
  - `postgresql-17-cron` (Job scheduling within PostgreSQL)
  - `postgresql-17-jsquery` (Advanced JSON querying)
  - `postgresql-17-partman` (Automated table partitioning)
  - `postgresql-17-pg-hint-plan` (Query optimizer hints)
  - `postgresql-17-pgaudit` and `postgresql-17-pgauditlogtofile` (Audit logging)
  - `postgresql-17-pgpcre` (Perl-compatible regular expressions)
  - `postgresql-17-pgq3` (Queueing system for PostgreSQL)
  - `postgresql-17-pgvector` (Vector search for AI/ML workloads)
  - `postgresql-17-prefix` (Prefix search optimization)
  - `postgresql-17-timescaledb` (Time-series database functionality)
- Supports Rust-based PostgreSQL extensions via `pgrx`.
- Built-in health check (`pg_isready`).
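Because the image ships with a health check, a script can wait for the container to report `healthy` before connecting. The following is a minimal sketch, not part of the image: `wait_for_healthy` is a hypothetical helper, and the default of 30 attempts at 2-second intervals is an arbitrary assumption.

```shell
# Hypothetical helper: poll Docker's health status until the container is healthy.
# Arguments: $1 = container name, $2 = max attempts (default 30),
#            $3 = seconds to sleep between attempts (default 2).
wait_for_healthy() {
  name="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  status=""
  while [ "$i" -lt "$attempts" ]; do
    # Docker reports "starting", "healthy", or "unhealthy" here.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "container $name not healthy after $attempts attempts (last status: ${status:-unknown})" >&2
  return 1
}
```

A typical call would be `wait_for_healthy postgres && echo ready`, where `postgres` is the container name passed to `docker run --name`.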
To pull the prebuilt image:

    docker pull rbehzadan/postgres

To manually build the image:

    docker build -t my-postgres .

`pg_parquet` requires significant CPU and memory resources, as it is compiled from source.
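Since the build compiles `pg_parquet` from source, it can help to see the full build output and to know how long the build took. The sketch below wraps the build command; `build_image` is a hypothetical helper, and `--progress=plain` assumes a Docker version with BuildKit.

```shell
# Hypothetical helper: build the image with plain progress output and report the duration.
# Argument: $1 = image tag (default "my-postgres"). Run it from the repository root.
build_image() {
  tag="${1:-my-postgres}"
  start="$(date +%s)"
  # --progress=plain prints each build step's output instead of a condensed view.
  docker build --progress=plain -t "$tag" . || return 1
  end="$(date +%s)"
  echo "built $tag in $((end - start))s"
}
```

Calling `build_image my-postgres` prints the elapsed seconds once the build succeeds and returns a non-zero status if the build fails.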
Start a PostgreSQL container with this image:

    docker run -d --name postgres \
      -e POSTGRES_USER=admin \
      -e POSTGRES_PASSWORD=secret \
      -e POSTGRES_DB=mydb \
      -p 5432:5432 \
      rbehzadan/postgres

Connect to the database using `psql`:

    docker exec -it postgres psql -U admin -d mydb

Once connected, enable any installed extension:
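
    CREATE EXTENSION pg_parquet;
    CREATE EXTENSION timescaledb;
    CREATE EXTENSION vector;  -- pgvector registers its extension under the name "vector"
    CREATE EXTENSION age;
    CREATE EXTENSION pgq;     -- the pgq3 package registers its extension under the name "pgq"
    CREATE EXTENSION pgpcre;

Verify the container's health status: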

    docker inspect --format='{{json .State.Health}}' postgres

If you find a bug, want to add an extension, or want to improve the build process, feel free to open an issue or submit a pull request.

This project is licensed under the MIT License.
