
.Net: Bug: GoogleAI IEmbeddingGenerator Ignores task_type in EmbeddingGenerationOptions #13250

@ChefBierfles31

Describe the bug

When using the IEmbeddingGenerator implementation for Google AI (AddGoogleAIEmbeddingGenerator), providing a task_type within the EmbeddingGenerationOptions.AdditionalProperties does not add the taskType field to the outgoing HTTP request body sent to the Google API.

This prevents the use of task-specific embeddings (e.g., RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY), which is a critical feature for optimizing search and RAG applications. The generator silently discards the option, leading to default embeddings being generated for all tasks.


To Reproduce

The issue can be reproduced by intercepting the outgoing HttpClient request.

  1. Set up a logging handler:

    public class LoggingDelegatingHandler : DelegatingHandler
    {
        protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            if (request.Content != null)
            {
                var requestBody = await request.Content.ReadAsStringAsync(cancellationToken);
                Console.WriteLine($"--> REQUEST BODY:\n{requestBody}");
            }
            return await base.SendAsync(request, cancellationToken);
        }
    }

  2. Register the services:

    services.AddTransient<LoggingDelegatingHandler>();
    services.AddHttpClient("GoogleAIClient").AddHttpMessageHandler<LoggingDelegatingHandler>();
    
    // Register the generator, ensuring it uses the instrumented HttpClient
    services.AddGoogleAIEmbeddingGenerator(
        modelId: "embedding-001",
        apiKey: "YOUR_API_KEY",
        httpClient: services.BuildServiceProvider().GetRequiredService<IHttpClientFactory>().CreateClient("GoogleAIClient")
    );
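
  3. Execute the call with task_type:

    // Build the provider from the registrations above
    var serviceProvider = services.BuildServiceProvider();
    var embeddingGenerator = serviceProvider.GetRequiredService<IEmbeddingGenerator<string, Embedding<float>>>();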
    var text = "This is a document for retrieval.";
    
    var options = new EmbeddingGenerationOptions()
    {
        AdditionalProperties = new() { { "task_type", "RETRIEVAL_DOCUMENT" } }
    };
    
    await embeddingGenerator.GenerateVectorAsync(text, options);

  4. Observe the logged request body. The output shows a request body without the taskType field:

    --> REQUEST BODY:
    {"requests":[{"model":"models/gemini-embedding-001","content":{"parts":[{"text":"This is a document for retrieval."}]},"outputDimensionality":1998}]}

Expected behavior

The logged HTTP request body should include the taskType field, as specified in the Google API documentation.

Expected request body:

--> REQUEST BODY:
{"requests":[{"model":"models/gemini-embedding-001","content":{"parts":[{"text":"This is a document for retrieval."}]},"outputDimensionality":1998,"taskType":"RETRIEVAL_DOCUMENT"}]}
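
To confirm a fix from the logging handler or a unit test, the captured body can be checked programmatically. The following is a minimal sketch using System.Text.Json; RequestBodyChecks and AllRequestsHaveTaskType are hypothetical names introduced here for illustration, not part of the connector.

```csharp
using System.Text.Json;

// Hypothetical verification helper; not part of Microsoft.SemanticKernel.Connectors.Google.
public static class RequestBodyChecks
{
    // Returns true only if "requests" is a non-empty array and every entry
    // carries a string "taskType" equal to the expected value.
    public static bool AllRequestsHaveTaskType(string requestBody, string expectedTaskType)
    {
        using var doc = JsonDocument.Parse(requestBody);
        var root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("requests", out var requests) ||
            requests.ValueKind != JsonValueKind.Array ||
            requests.GetArrayLength() == 0)
        {
            return false;
        }

        foreach (var request in requests.EnumerateArray())
        {
            if (request.ValueKind != JsonValueKind.Object ||
                !request.TryGetProperty("taskType", out var taskType) ||
                taskType.ValueKind != JsonValueKind.String ||
                taskType.GetString() != expectedTaskType)
            {
                return false;
            }
        }

        return true;
    }
}
```

Called with the observed body from step 4 and "RETRIEVAL_DOCUMENT", this check fails today; it should pass once taskType is forwarded.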

Platform

  • Language: C#
  • Source: Microsoft.SemanticKernel.Connectors.Google, version 1.66.0-alpha (and likely earlier versions)
  • AI model: Google gemini-embedding-001
  • IDE: Visual Studio 2022
  • OS: Windows

Additional context

The root cause appears to be that the EmbeddingGenerationOptions are not being passed down through the internal call stack.

  1. GoogleAIEmbeddingGenerator.GenerateAsync calls an internal generator.
  2. This internal generator is an adapter for the obsolete GoogleAITextEmbeddingGenerationService.
  3. The GenerateEmbeddingsAsync method on GoogleAITextEmbeddingGenerationService does not have a parameter for EmbeddingGenerationOptions, so the options are discarded at this point.
  4. Consequently, the deeper GoogleAIEmbeddingRequest.FromData method, which builds the request, is never supplied with the taskType value.
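
For illustration, the missing step could look roughly like the sketch below: read task_type from the caller-supplied properties and copy it into each request entry as taskType. This is not the connector's actual code. EmbeddingRequestSketch and BuildRequestBody are hypothetical names, a plain read-only dictionary stands in for EmbeddingGenerationOptions.AdditionalProperties, and the JSON shape mirrors the logged request body above.

```csharp
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical helper; not part of Microsoft.SemanticKernel.Connectors.Google.
public static class EmbeddingRequestSketch
{
    // Builds a batch embedding request body. When the supplied properties
    // contain "task_type", its value is written to "taskType" on every request.
    public static string BuildRequestBody(
        string modelId,
        IEnumerable<string> texts,
        IReadOnlyDictionary<string, object?>? additionalProperties)
    {
        // Assumption: the option key is "task_type", as used in the repro.
        string? taskType = null;
        if (additionalProperties is not null &&
            additionalProperties.TryGetValue("task_type", out var value))
        {
            taskType = value?.ToString();
        }

        var requests = new JsonArray();
        foreach (var text in texts)
        {
            var request = new JsonObject
            {
                ["model"] = $"models/{modelId}",
                ["content"] = new JsonObject
                {
                    ["parts"] = new JsonArray(new JsonObject { ["text"] = text })
                }
            };

            // Only emit the field when the caller actually asked for a task type.
            if (!string.IsNullOrEmpty(taskType))
            {
                request["taskType"] = taskType;
            }

            requests.Add(request);
        }

        return new JsonObject { ["requests"] = requests }.ToJsonString();
    }
}
```

The point of the sketch is that the options have to travel all the way to the code that builds the request. A real fix would need GoogleAIEmbeddingGenerator to stop routing through an adapter that has no options parameter, or to pass the task type alongside the data.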

This functionality is crucial for building effective search systems, as the performance difference between default embeddings and specialized RETRIEVAL_DOCUMENT/RETRIEVAL_QUERY embeddings is significant.

Labels: .NET, bug
