diff --git a/pages/ai-ecosystem.mdx b/pages/ai-ecosystem.mdx
index 7b8617395..34b5952fe 100644
--- a/pages/ai-ecosystem.mdx
+++ b/pages/ai-ecosystem.mdx
@@ -30,6 +30,8 @@ This section of Memgraph’s documentation is your guide to using Memgraph for A
- [GraphChat in Memgraph Lab](/memgraph-lab/features/graphchat): Explore how
  natural language querying (GraphChat) ties into the GraphRAG ecosystem,
  making complex graphs accessible to everyone.
+- [Agents in Memgraph](/ai-ecosystem/agents): Discover how you can leverage AI
+  agents to automate graph modeling and migration tasks.
\ No newline at end of file
diff --git a/pages/ai-ecosystem/agents.mdx b/pages/ai-ecosystem/agents.mdx
new file mode 100644
index 000000000..24e48998f
--- /dev/null
+++ b/pages/ai-ecosystem/agents.mdx
@@ -0,0 +1,334 @@
---
title: Agents
description: Memgraph agents are built to help you build graph applications faster by leveraging the power of AI.
---
import { Callout, Steps } from 'nextra/components'


# Memgraph Agents

Memgraph Agents are specialized tools designed to streamline and enhance the
development of graph applications. These agents leverage Large Language Models
(LLMs) to provide intelligent solutions for various graph-related tasks. Given
the maturity of the underlying technology, some agents may be experimental and
are continuously evolving to better serve your needs.

# SQL2Graph Agent

## Overview

The **SQL2Graph Agent** is an intelligent database migration agent that
transforms relational databases (MySQL, PostgreSQL) into graph databases using
AI-powered analysis. It leverages Large Language Models (LLMs) to understand the
semantics of your relational schema and generate an optimized property graph
model for Memgraph. The agent enables interactive modeling and refinement of the
graph schema, as well as validation after the migration.

**Key Capabilities:**

1. **Automatic Database Migration**: Performs end-to-end migration from SQL to
   graph with minimal user input.
2. **Interactive Graph Modeling**: Enables users to review and refine the
   generated graph model incrementally, before executing the migration.
3. **Validation**: Provides pre- and post-migration validation to communicate
   the quality and correctness of the migration.


The agent supports two main **modeling strategies**:
1. **Deterministic Strategy**: Rule-based mapping of tables to nodes and foreign
   keys to relationships.
2. **LLM Strategy**: AI-powered analysis using LLMs to generate a semantically
   rich graph model.

The agent can also be run in two **modes**:
1. **Automatic Mode**: Fully automated migration without user interaction.
2. **Incremental Mode**: Step-by-step review and refinement of the graph model
   before migration.

These are controlled via CLI flags and environment variables.

## Supported Databases

- **Source Databases**: PostgreSQL, MySQL
- **Target Database**: Memgraph

## How To Use The Agent

From this point onward, it is assumed that you have Memgraph installed and
running. If you haven't done so, please refer to the [Memgraph installation
guide](https://memgraph.com/docs/memgraph/installation).

Just make sure to start Memgraph with the `--schema-info-enabled=true` flag to
enable schema information tracking:

```bash
docker run -p 7687:7687 memgraph/memgraph --schema-info-enabled=true
```

It is also assumed that you have a running instance of either PostgreSQL or
MySQL with a sample database to migrate.

### Installation

To use the agent, you first need to clone the repository and install the
dependencies:

```bash
# Clone the repository
git clone https://github.com/memgraph/ai-toolkit

# Navigate to the sql2graph directory
cd ai-toolkit/agents/sql2graph

# Install dependencies using uv
uv pip install -e .
```

### Configuration

The configuration enables you to control the agent flow via environment
variables.
The key information needed is the source database connection details, the
target Memgraph connection details, the LLM API keys, and the agent
configuration.

Create a `.env` file and fill in the following variables:

```bash
# Source Database
SOURCE_DB_TYPE=postgresql # or mysql

# PostgreSQL Configuration
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DATABASE=mydb
POSTGRES_USER=username
POSTGRES_PASSWORD=password
POSTGRES_SCHEMA=public

# MySQL Configuration (if using MySQL)
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_DATABASE=mydb
MYSQL_USER=username
MYSQL_PASSWORD=password

# Target Memgraph Database
MEMGRAPH_URL=bolt://localhost:7687
MEMGRAPH_USERNAME=
MEMGRAPH_PASSWORD=
MEMGRAPH_DATABASE=memgraph

# LLM API Keys (for AI-powered features)
# Only provide the key for your chosen provider
OPENAI_API_KEY=sk-...                  # For GPT models
# ANTHROPIC_API_KEY=sk-ant-...         # For Claude models
# GOOGLE_API_KEY=AI...                 # For Gemini models

# Optional: Specify LLM model (defaults shown)
# LLM_MODEL=gpt-4o-mini                # OpenAI default
# LLM_MODEL=claude-3-5-sonnet-20241022 # Anthropic default
# LLM_MODEL=gemini-2.0-flash-exp       # Google default

# Migration Defaults (can be overridden via CLI flags)
SQL2MG_MODE=automatic         # Options: automatic, incremental
SQL2MG_STRATEGY=deterministic # Options: deterministic, llm
SQL2MG_META_POLICY=auto       # Options: auto, reset, skip
SQL2MG_LOG_LEVEL=INFO
```

There is an `.env.example` file in the `agents/sql2graph` directory that you can
use as a template.

**Important**: Ensure Memgraph is started with `--schema-info-enabled=true` for
full functionality.

#### Quick Start - Automatic Migration

Run with default settings (automatic mode, deterministic strategy):

```bash
uv run main.py
```

The agent will:
1. Validate your environment and database connections
2. Analyze the source database schema
3. Generate a complete graph model
4. Execute the migration
5.
Validate the results

In this mode, no user interaction is required, and the entire process is
automated. This means that `SQL2MG_MODE` is set to `automatic` and
`SQL2MG_STRATEGY` is set to `deterministic`. `SQL2MG_MODE` refers to the
**modeling mode** and represents how much user interaction is involved, while
`SQL2MG_STRATEGY` refers to how the graph model is generated.


#### Refinement with Incremental Mode

For more control, run in incremental mode to review and refine the model
step-by-step:

```bash
uv run main.py --mode incremental
```

The agent will:
1. Analyze the source database schema
2. Generate an initial graph model
3. Present each table's proposed transformation for review
4. Allow you to accept, skip, or modify each table's mapping
5. After reviewing all tables, optionally enter a refinement loop for final
   adjustments
6. Execute the migration
7. Validate the results

This is a predictable and repeatable flow, so you can iteratively improve the
graph model before migration. Each table is processed one at a time, and you
have full control over the transformations: the agent shows you all the
proposed nodes, relationships, and properties for that table, and you can
choose to accept them as-is, skip the table entirely, or modify the mapping
details.


#### Automatic Migration with LLM

Use LLM-powered modeling for AI-driven design:

```bash
uv run main.py --strategy llm
```

The agent auto-detects which LLM provider to use based on available API keys. In
this strategy, the agent will:
1. Analyze your SQL schema semantically using an LLM
2. Generate an initial graph model with AI-optimized design
3. Execute the migration
4. Validate the results

Keep in mind that in this mode, the entire migration is still automatic and
LLM-driven.

#### Incremental Migration with Review

Control each step of the transformation:

```bash
uv run main.py --mode incremental --strategy llm
```

In incremental mode:
1.
The AI generates a complete graph model for all tables
2. You review each table's mapping one at a time
3. Accept or modify individual table transformations
4. After processing all tables, optionally enter a refinement loop
5. Interactively adjust the entire model before final migration

In this mode, the LLM is used to generate the initial model, but you have full
control to review and refine each table's mapping before migration. After each
modification, the LLM will try to regenerate the model based on your feedback
and validation errors, improving it iteratively.

## CLI Reference

### Command-Line Options

| Flag | Environment Variable | Description | Default |
|------|---------------------|-------------|---------|
| `--mode` | `SQL2MG_MODE` | `automatic` or `incremental` | interactive prompt |
| `--strategy` | `SQL2MG_STRATEGY` | `deterministic` or `llm` | interactive prompt |
| `--provider` | _(none)_ | `openai`, `anthropic`, or `gemini` | auto-detect from API keys |
| `--model` | `LLM_MODEL` | Specific model name | provider default |
| `--meta-graph` | `SQL2MG_META_POLICY` | `auto`, `skip`, or `reset` | `auto` |
| `--log-level` | `SQL2MG_LOG_LEVEL` | `DEBUG`, `INFO`, `WARNING`, `ERROR` | `INFO` |

### Usage Examples

```bash
# Use specific Gemini model
uv run main.py --strategy llm --provider gemini --model gemini-2.0-flash-exp

# Skip meta-graph comparison (treat as fresh migration)
uv run main.py --meta-graph skip

# Enable debug logging
uv run main.py --log-level DEBUG

# Fully configured non-interactive run
uv run main.py \
  --mode automatic \
  --strategy deterministic \
  --meta-graph reset \
  --log-level INFO
```


## LLM Provider Support

| Provider | Models |
|----------|--------|
| **OpenAI** | GPT-4o, GPT-4o-mini |
| **Anthropic** | Claude 3.5 Sonnet |
| **Google** | Gemini 2.0 Flash |

### Provider Selection

The agent automatically selects a provider based on available API keys:
1.
Checks for `OPENAI_API_KEY`
2. Falls back to `ANTHROPIC_API_KEY`
3. Falls back to `GOOGLE_API_KEY`

Override with the `--provider` flag:

```bash
# Force Anthropic even if OpenAI key exists
uv run main.py --strategy llm --provider anthropic
```

### Model Selection

Each provider has sensible defaults:
- **OpenAI**: `gpt-4o-mini`
- **Anthropic**: `claude-3-5-sonnet-20241022`
- **Google**: `gemini-2.0-flash-exp`

Override with `--model` or the `LLM_MODEL` environment variable:

```bash
# Use more powerful OpenAI model
uv run main.py --strategy llm --model gpt-4o

# Or via environment variable
export LLM_MODEL=claude-3-opus-20240229
uv run main.py --strategy llm --provider anthropic
```


## Architecture Overview

If you are interested in the implementation details, here is a high-level
overview of the project structure:

```
sql2graph/
├── main.py                    # CLI entry point
├── core/
│   ├── migration_agent.py     # Main orchestration
│   └── hygm/                  # Graph modeling engine
│       ├── hygm.py            # HyGM core
│       ├── models/            # Data models
│       ├── strategies/        # Modeling strategies
│       └── validation/        # Validation system
├── database/
│   ├── analyzer.py            # Schema analysis
│   ├── factory.py             # Database adapter factory
│   └── adapters/              # DB-specific adapters
├── query_generation/
│   ├── cypher_generator.py    # Cypher query builder
│   └── schema_utilities.py    # Schema helpers
└── utils/
    ├── environment.py         # Env validation
    └── config.py              # Configuration
```

diff --git a/pages/data-migration.mdx b/pages/data-migration.mdx
index c3478a1b4..addda931a 100644
--- a/pages/data-migration.mdx
+++ b/pages/data-migration.mdx
@@ -29,6 +29,12 @@ In order to learn all the pre-requisites for importing data into Memgraph, check
+
+If you have a SQL data model and want to migrate to Memgraph, you can try out
+our [Agent](/ai-ecosystem/agents), which leverages LLMs to automate the process
+of modeling and migration.
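The end-to-end flow described on the agent page can be sketched in a few commands. This is a hedged example, not part of the migration guide itself: it assumes Docker and `uv` are installed, your source database is running, the `ai-toolkit` repository is cloned with a filled-in `.env`, and that the agent lives under `ai-toolkit/agents/sql2graph`:

```shell
# Start Memgraph with schema tracking enabled (required by the agent)
docker run -d -p 7687:7687 memgraph/memgraph --schema-info-enabled=true

# From the cloned repository, run the agent with step-by-step review
cd ai-toolkit/agents/sql2graph
uv run main.py --mode incremental --strategy llm
```

Incremental mode lets you review each table's proposed mapping before anything is written to Memgraph; drop the flags for a fully automatic run.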
+

## File types

### CSV files
diff --git a/pages/data-migration/migrate-from-rdbms.mdx b/pages/data-migration/migrate-from-rdbms.mdx
index 960e64abf..d7993b93f 100644
--- a/pages/data-migration/migrate-from-rdbms.mdx
+++ b/pages/data-migration/migrate-from-rdbms.mdx
@@ -32,6 +32,12 @@ highly connected and require frequent retrieval with a flexible data model. If
you're seeking a quick and reliable database that allows effortless
modifications of data model and properties, a graph database is the way to go.
+
+If you have a SQL data model and want to migrate to Memgraph, you can try out
+our [Agent](/ai-ecosystem/agents), which leverages LLMs to automate the process
+of modeling and migration. It supports PostgreSQL and MySQL as source
+databases.
+

## Prerequisites

To follow along, you will need:
diff --git a/pages/data-modeling.mdx b/pages/data-modeling.mdx
index 0f11ee65a..bc76b9f50 100644
--- a/pages/data-modeling.mdx
+++ b/pages/data-modeling.mdx
@@ -5,6 +5,7 @@ description: Learn to model graphs effectively with Memgraph. A documentation de
import { Cards } from 'nextra/components'
import {CommunityLinks} from '/components/social-card/CommunityLinks'
+import { Callout } from 'nextra/components'

# Introduction to graph data modeling
@@ -60,6 +61,12 @@ data in Memgraph.
as overcomplicating models, duplicating data, and neglecting indexing, and
explains how to avoid them.
+
+  If you have a SQL data model and want to migrate to Memgraph, you can try out
+  our [Agent](/ai-ecosystem/agents), which leverages LLMs to automate the
+  process of modeling and migration.
+

## Need help with data modeling?

Schedule a 30 min session with one of our engineers to discuss how Memgraph fits