Skip to content

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented Sep 3, 2025

Issues closed

Fixes #11525

Description of changes

Adds

  • ibis.sql_value()
  • ibis.IntoInterpolation and ibis.IntoTemplate for type annotations

The implementation is actually pretty straightforward.

A few open things I'm not sure about

  • Should we default to duckdb dialect when creating the op? We need to have SOME dialect chosen (None won't work), because we actually need to parse the sql string with sqlglot in order to introspect the ibis datatype of the op. We could potentially use ibis.get_backend() at op creation time? But that seemed too tricky? There might also be some way to infer the dialect using sqlglot? eg if we hand sqlglot "CAST(NULL AS REAL)" to sqlglot, we should be able to infer this is sqlite dialect?
  • currently, if you pass t"CAST({x} AS REAL)" (eg using sqlite syntax to sql_value() with dialect="duckdb", sqlglot is generous and still parses it. But perhaps we should be more strict and error?
  • I use property's a lot in the op.TemplateSQL class. But perhaps these should be made private.
  • The name of ibis.sql_value(). I went with this, instead of plain ibis.sql(), because I wanted it to be more explicit that this doesn't accept SELECT statements. But open to other names.
  • Open to other names for the op instead of TemplateSQL. But this is private to within ibis, so not a huge deal.
  • When using Table.sql(), the operation actually requires that the expression is bound to a backend, because we actually go fetch the schema of the resulting table by using TYPEOF(). This is very accurate, but does require a backend. In this PR, I don't require a backend, instead just doing some maybe-overly-tricky injecting CAST(NULL AS <dtype>) instead of the real value. eg ibis.sql_value(t("{num} + 1)) passes "CAST(NULL AS DOUBLE) + 1" to sqlglot to infer the datatype. Perhaps we could do some hybrid approach: if there is already a backend, then use that, otherwise fall back to sqlglot? Do we want these two APIs to use the same approach for consistency?

This doesn't actually support the original motivating example from the linked issue:

import ibis
timestamp = ibis.timestamp("2024-08-01 21:44:00")  # noqa: F841
in_ak_time = ibis.sql_value(t("{timestamp} AT TIME ZONE 'America/Anchorage'"))
# TypeError: Could not infer the dtype of the template expression with sql:
# CAST(NULL AS TIMESTAMP) AT TIME ZONE 'America/Anchorage'

because sqlglot doesn't have complete coverage for introspecting the datatypes of this weird syntax.

We could either

  • accept this as a limitation of upstream sqlglot, and get that implemented there
  • allow an optional type kwarg to the op.TemplateSQL (and the public ibis.sql_value() API). If that is given, then just use that as the dtype, and don't try to use sqlglot to introspect the dtype from the sql.

@github-actions github-actions bot added tests Issues or PRs related to tests sql Backends that generate SQL labels Sep 3, 2025
@NickCrews NickCrews requested a review from cpcloud September 3, 2025 19:05
@NickCrews NickCrews added feature Features or general enhancements compiler Issues or PRs related to compilation labels Sep 3, 2025
@NickCrews
Copy link
Contributor Author

All of the failing tests look to be unrelated, some issue with pins, possibly due to GCS access issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

compiler Issues or PRs related to compilation feature Features or general enhancements sql Backends that generate SQL tests Issues or PRs related to tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Value.sql(sql: str | string.Template, *, name: str | None = "{{value}}")

1 participant