Embeddings improvements #686

mattkjames7 · 2025-10-27T14:38:40Z

Add dimension as return value
Remove the compute_ prefix from the procedure names
Add embeddings.model_info(configuration) -> Map
(Breaking) Renamed compute() -> node_sentence()
(Breaking) Simplified the node_sentence() arguments: it now takes only an optional list of nodes and an optional configuration map, e.g.:

WITH {device: "cuda:0"} AS configuration
CALL embeddings.node_sentence(NULL, configuration)
YIELD success, embeddings, dimension
RETURN success, embeddings, dimension;

Added a dimension output to node_sentence() that reports the embedding length for the current model.
Added an option to return the list of embeddings from node_sentence() alongside the success value (set via the return_embeddings parameter in the configuration map).
Added text() function for computing embeddings directly on lists of strings, e.g.:

WITH {device: "cuda:0"} AS configuration
CALL embeddings.text(["Extra", "Cheese", "Please"], configuration)
YIELD success, embeddings, dimension
RETURN success, embeddings, dimension;

Added model_info() procedure to return information about the model being used as a Map, e.g.:

WITH {model_name: "all-MiniLM-L6-v2"} AS configuration
CALL embeddings.model_info(configuration)
YIELD info
RETURN info;

Fixed default device selection. Previously, the device defaulted to 0 (i.e. cuda:0), so users of CPU-only images had to specify explicitly that embeddings should be computed on the CPU. Now, if no device is specified, the module checks whether CUDA is available: if so, it uses the first available device (cuda:0); otherwise it falls back to the CPU.
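The fallback described above can be sketched roughly as follows. This is an illustrative sketch only, not the module's actual code: the names `resolve_device` and `cuda_is_available` are hypothetical, and it assumes PyTorch is what reports CUDA availability.

```python
from typing import Optional


def cuda_is_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device.

    Hypothetical helper: the real module may detect CUDA differently.
    """
    try:
        import torch  # treated as optional in this sketch
    except ImportError:
        return False
    return torch.cuda.is_available()


def resolve_device(requested: Optional[str] = None,
                   cuda_available: Optional[bool] = None) -> str:
    """Pick the device on which embeddings would be computed.

    An explicitly requested device always wins. With no request, use the
    first GPU ("cuda:0") when CUDA is available, otherwise fall back to CPU.
    `cuda_available` can be passed in to bypass detection, e.g. in tests.
    """
    if requested is not None:
        return requested
    if cuda_available is None:
        cuda_available = cuda_is_available()
    return "cuda:0" if cuda_available else "cpu"
```

Under these assumptions, a configuration map without a `device` key would resolve to `cuda:0` on a GPU image and to `cpu` on a CPU-only image, while `{device: "cuda:0"}` would be passed through unchanged.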

mattkjames7 · 2025-10-27T16:38:46Z

sonarqubecloud · 2025-10-28T12:46:08Z

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

mattkjames7 added 3 commits October 27, 2025 11:22

Added configuration map, set default device to None -> automatically pick CPU/first GPU depending upon availability

39d1eee

added ability to return a list of embeddings

9bfc530

added failing embed function

98d2a39

mattkjames7 self-assigned this Oct 27, 2025

mattkjames7 added the Docs needed and feature labels Oct 27, 2025

use as a procedure for now

73c59ae

mattkjames7 mentioned this pull request Oct 27, 2025

Embedding improvements memgraph/documentation#1450

Merged

11 tasks

mattkjames7 added 2 commits October 27, 2025 16:24

update tests

5be92e0

fix some formatting issues

f92db46

fix mistakes in updated tests

e4d7227

mattkjames7 added the breaking label Oct 27, 2025

mattkjames7 requested a review from gitbuda October 27, 2025 16:54

mattkjames7 marked this pull request as ready for review October 27, 2025 16:55

mattkjames7 added 2 commits October 27, 2025 16:55

new line

0b69093

updated function names

1744acf

gitbuda approved these changes Oct 27, 2025

mattkjames7 added 2 commits October 28, 2025 10:50

remove prefix

becfb87

added dimension output

4e91052

mattkjames7 added this to the mage-v3.7.0 milestone Oct 28, 2025

make configuration optional

308b327

mattkjames7 added this pull request to the merge queue Oct 29, 2025

Merged via the queue into main with commit 1a47072 Oct 29, 2025
29 checks passed

mattkjames7 deleted the embeddings-improvements branch October 29, 2025 11:45

mattkjames7 modified the milestones: mage-v3.7.0, mage-v3.6.2 Nov 3, 2025

3 participants