ym-pett
Contributor

@ym-pett ym-pett commented Aug 12, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

started trying to adapt nw so that it can read daft dataframes, but I might be barking up the wrong tree: I installed daft in the repo so that it would be available to nw. I have a feeling we want to avoid that, and I'm working as if I were adding an actual _daft module rather than prep for a plugin.

note to self: if installing daft was correct, need to add this to an install file, think it's pyproject.toml

@dangotbanned
Member

@ym-pett did you know about Marco's PR?

Apologies if you've discussed this privately 😅

@MarcoGorelli
Member

MarcoGorelli commented Aug 12, 2025

thanks for starting this!

😄 @dangotbanned yeah we'd discussed this

My hope is that we can use Daft as the way to develop the plugin system, and it can serve as a reference implementation

Here's what we're aiming for:

If a user has narwhals-daft installed, then they should be able to run

import narwhals as nw
import daft

df_native = daft.from_pydict({"a": [1, 2, 3], "b": [4, 5, 6]})

df = nw.from_native(df_native)
result = df.select("a", nw.col("b") * nw.col("a"))
print(result.collect())

This needs to be done in a way that won't be specific to Daft, so that anyone can register their own plugin without Narwhals having any knowledge about it. In this PR it's currently all Daft-specific

The packaging docs around entry-points might be useful here:

I'll also cc @camriddell into the conversation, as IIRC he'd also thought about pluggable backends


For prior art on plugins and entry-points, I think https://github.com/PyCQA/flake8 might also be good to look at
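As a rough illustration of the entry-point mechanism mentioned above, here is a minimal sketch of how a host library might list the plugins registered under a group. The group name `narwhals.plugins` and the helper `discover_plugins` are assumptions made for this sketch, not something confirmed in this thread; `importlib.metadata.entry_points` is the standard-library call.

```python
import sys
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return the entry points that installed packages registered under `group`."""
    if sys.version_info >= (3, 10):
        # Python 3.10+ supports filtering by group directly.
        return list(entry_points(group=group))
    # On Python 3.9, entry_points() returns a dict mapping group name -> entry points.
    return list(entry_points().get(group, ()))


# "narwhals.plugins" is a hypothetical group name used only for this sketch.
for plugin in discover_plugins("narwhals.plugins"):
    print(plugin.name, "->", plugin.value)
```

Because discovery only reads package metadata, the host library needs no knowledge of which plugins exist; each plugin package announces itself when it is installed.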

@ym-pett
Contributor Author

ym-pett commented Aug 13, 2025

thanks both, I'll revert the current changes - I feel like I had to go down the wrong route first to see what this actually consists of! :)

I can now go through the materials armed with more background! 🦾

@ym-pett ym-pett closed this Aug 13, 2025
@ym-pett ym-pett force-pushed the create_fromnative_daft branch from 3c8b34b to 11fe33f Compare August 13, 2025 10:32
@ym-pett
Contributor Author

ym-pett commented Aug 13, 2025

oops, didn't mean to close this! will reopen when I push new content!

@ym-pett ym-pett reopened this Aug 13, 2025
@ym-pett
Contributor Author

ym-pett commented Aug 13, 2025

based on flake8 example - will try to get something more sensible in next

@ym-pett ym-pett force-pushed the create_fromnative_daft branch from 1ac87eb to cfd156f Compare August 13, 2025 16:28
@ym-pett ym-pett force-pushed the create_fromnative_daft branch from b004d4e to e3a2f3b Compare August 14, 2025 13:34
Comment on lines 541 to 553
for plugin in discovered_plugins:
    obj = plugin.load()
    frame = obj.dataframe.DaftLazyFrame

    # from obj.dataframe import DaftLazyFrame
    try:
        df_compliant = frame(native_object, version=Version.MAIN)
        return df_compliant.to_narwhals()
    except:
        # try the next plugin
        continue
Member

ooh, nice! does this work?

to make it not daft-specific, perhaps we could aim to have something like

try:
    df_compliant = obj.from_native(native_object, version=Version.MAIN)

?

This would mean making a top-level function in narwhals-daft too, and then we document that plugin authors are expected to implement this

Contributor Author

I have a feeling I'm doing something weird with how things are imported. Maybe it's that the top-level __init__ file needs altering in daft-narwhals.

with your suggestion, t.py no longer works and I get the error

TypeError: Expected pandas-like dataframe, Polars dataframe, or Polars lazyframe, got: <class 'daft.dataframe.dataframe.DataFrame'>

that dataframe.dataframe looks weird to me... would you expect that structure?

the type of plugin is <class 'importlib.metadata.EntryPoint'>, so I figured I had to load the module via that, for obj I then get the type <class 'module'>

the only way I could get access to the DaftLazyFrame (haven't tried simple LazyFrame yet) was by assigning it to a variable name, I couldn't do something like

from obj.dataframe import DaftLazyFrame (the error then is ModuleNotFoundError: No module named 'obj')

I think at the moment this all leaves us too bound to daft, and I bet I'm breaking a million coding conventions, eek!

I suspect I need to do a better job at exposing the modules within daft-nw but I haven't found how yet

Member

perhaps in narwhals_daft/__init__.py you could make a from_native function, and then here use obj.from_native?

Contributor Author

nice, will try that!
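The suggestion in this thread (each plugin exposes a top-level `from_native`, and the host calls `obj.from_native` on whatever the entry point loads) can be sketched roughly as follows. Everything here is a stand-in: `FakeEntryPoint`, `DictPlugin`, `ListPlugin` and `from_native_via_plugins` are hypothetical names, and the plain string `"main"` replaces `Version.MAIN` so the example runs without Narwhals installed.

```python
class FakeEntryPoint:
    """Stand-in for importlib.metadata.EntryPoint; only `load()` is mimicked."""

    def __init__(self, module):
        self._module = module

    def load(self):
        return self._module


class DictPlugin:
    """Hypothetical plugin 'module' that only understands dicts."""

    @staticmethod
    def from_native(native_object, version):
        if not isinstance(native_object, dict):
            raise TypeError(f"expected dict, got {type(native_object)}")
        return ("dict-frame", native_object, version)


class ListPlugin:
    """Hypothetical plugin 'module' that only understands lists."""

    @staticmethod
    def from_native(native_object, version):
        if not isinstance(native_object, list):
            raise TypeError(f"expected list, got {type(native_object)}")
        return ("list-frame", native_object, version)


def from_native_via_plugins(native_object, discovered_plugins, version="main"):
    """Try each plugin's top-level `from_native` until one accepts the object."""
    for plugin in discovered_plugins:
        obj = plugin.load()
        try:
            return obj.from_native(native_object, version=version)
        except TypeError:
            # This plugin does not recognise the object; try the next one.
            continue
    raise TypeError(f"no plugin could handle {type(native_object)}")


plugins = [FakeEntryPoint(DictPlugin), FakeEntryPoint(ListPlugin)]
print(from_native_via_plugins([1, 2, 3], plugins))
```

The host code never names a specific backend; it relies only on the documented expectation that a plugin provides `from_native`. Catching a specific exception type rather than using a bare `except` is a choice made in this sketch so that unrelated errors are not silently swallowed.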

Member

Should plugin detection come before we check our own written backends? That way if someone really wanted to override our pandas backend they would be empowered to?

Member

🤔 yeah maybe
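To make the ordering question concrete, here is a small sketch, assuming a first-match-wins dispatcher. The names `pick_backend`, `builtin` and `plugin` are hypothetical and do not reflect how Narwhals actually dispatches.

```python
def pick_backend(native_object, handlers):
    """Return the name of the first handler whose predicate accepts the object."""
    for name, accepts in handlers:
        if accepts(native_object):
            return name
    raise TypeError(f"unsupported object: {type(native_object)}")


# Two hypothetical handlers that both claim dicts.
builtin = ("builtin-dict", lambda obj: isinstance(obj, dict))
plugin = ("plugin-dict", lambda obj: isinstance(obj, dict))

plugins_first = pick_backend({}, [plugin, builtin])
builtins_first = pick_backend({}, [builtin, plugin])
print(plugins_first, builtins_first)
```

With first-match-wins dispatch, a plugin can only override a built-in backend if plugins are consulted first; the trade-off is that an installed plugin could then change behaviour for objects the built-in backends already handle.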

@ym-pett ym-pett force-pushed the create_fromnative_daft branch from d72da25 to ca01346 Compare August 15, 2025 08:49
@ym-pett ym-pett force-pushed the create_fromnative_daft branch from 319b3d6 to 90ad973 Compare August 16, 2025 11:44
@ym-pett ym-pett force-pushed the create_fromnative_daft branch from 5e3be1f to 76232df Compare August 16, 2025 14:15
@ym-pett ym-pett force-pushed the create_fromnative_daft branch from 495d727 to ebc1a8f Compare August 16, 2025 14:26
@ym-pett
Contributor Author

ym-pett commented Oct 11, 2025

ah it's so nice to see only altair and shiny errors left! 🌞

my issue was that locally I wasn't seeing the cause of the plugin tests failing (NotImplementedError: '_with_version' is not implemented for: <Implementation.UNKNOWN: 'unknown'>) because I hadn't uv pip installed the test_plugin locally, whereas it got installed automatically in the CI 🤦

A big thank you to @camriddell, we worked through this together and it was such a fun session too!

Member

@MarcoGorelli MarcoGorelli left a comment

thanks all!

just some minor comments, but then from my end I'd be happy with this change

you'll just need to add some docs to the docs/extending.md page, and then (unless there's objections) I think we can ship it?

@@ -0,0 +1,77 @@
from __future__ import annotations

from typing import TYPE_CHECKING, Any, TypeAlias
Member

I don't think they've implemented this one astral-sh/ruff#2302

I've opened #3203 to take care of this

import narwhals as nw


@pytest.mark.skipif(sys.version_info < (3, 10), reason="3.10+ required for entrypoints")
Member

is this still true? I think the changes suggested by Dan should make it runnable on Python 3.9 too?

Contributor Author

you're right, taking out the skip.. line & tests still run fine

@MarcoGorelli
Member

if you fetch and merge upstream/main and then run pre-commit run -a, it should flag the issue Dan pointed out in #2978 (comment). If you search the rest of the codebase, it should show you how to fix it

@ym-pett
Contributor Author

ym-pett commented Oct 15, 2025

may I ask for advice: if I take out the # type: ignore[type-var] comment as it complains about it being unused, it then complains that the methods of NativeLazyFrame are not implemented in DictFrame.

Is the only solution here to construct a DictFrame class & mark all the methods as not_implemented, or am I complicating things unnecessarily here as I'm maybe doing something wrong with the imports in the first place?

@MarcoGorelli
Member

I think it's just that you're using TypeAlias before having imported it, try

if TYPE_CHECKING:
    from typing_extensions import Self, TypeAlias

    from narwhals import LazyFrame  # noqa: F401

DictFrame: TypeAlias = dict[str, list[Any]]
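For context on why this works: `typing.TypeAlias` was only added in Python 3.10, so importing it from `typing` at runtime fails on 3.9. A self-contained sketch of the pattern, assuming the module starts with `from __future__ import annotations` as the file under review does (the `column_names` helper is hypothetical and only there to show the alias in use):

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only type checkers follow this branch, so nothing is imported at runtime.
    from typing_extensions import TypeAlias

# With postponed evaluation, the annotation `TypeAlias` is stored as a string
# and never looked up at runtime; only the right-hand side is evaluated.
DictFrame: TypeAlias = dict[str, list[Any]]


def column_names(frame: DictFrame) -> list[str]:
    """Hypothetical helper: return the column names of a dict-of-lists frame."""
    return list(frame)


print(column_names({"a": [1, 2], "b": [3, 4]}))
```

Note that `Any` still has to be imported at runtime, because `dict[str, list[Any]]` on the right-hand side is an ordinary expression.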

`is_native` must receive a native object and return a boolean indicating whether the native object is
a dataframe of the plugin library.
Contributor Author

for the __narwhals_namespace__ section, I'm not sure if the from_native description needs to be explicitly given, since it's in the protocol. I'm guessing experienced contributors might know this function already and it's a bit like saying 'wet water'?

Member

I think it's fine to include it, a bit of repetition in docs is OK IMO

i.e. one that complies with the CompliantNamespace protocol. This protocol specifies a `from_native`
function, whose input parameter is the Narwhals version and which returns a compliant Narwhals LazyFrame
which wraps the native dataframe.
Contributor Author

maybe 'bare-bones' will be legacy soon, so maybe we should say "in progress"?

  We love open source, but we're not "open source absolutists". If you're unable to open
- source you library, then this is how you can make your library compatible with Narwhals.
+ source your library, then this is how you can make your library compatible with Narwhals.

Contributor Author

@ym-pett ym-pett Oct 16, 2025

I'm not sure about the 'also' on the line below. I read this as telling the reader some stuff X needs to be defined, and then we're reminding them to along with that stuff X also define the list that follows. In which case I'm left to look for a description for stuff X.

Maybe I'm misreading?

If we're wanting to say 'along with whatever you want your extension to do, you'll have to implement these', maybe something like the following could be an alternative to the line:

"Along with your use-case, make sure that you also define:"

Member

I think you can remove "also", i probably wrote it by accident

Member

@MarcoGorelli MarcoGorelli left a comment

really nice, thanks! just some very minor comments left, but i think the docs additions are at the right level for the intended audience

Comment on lines 50 to 51
The first line needs to be the same for all plugins, whereas the second is to be adapted to the
library name.
Member

this may be confusing, as technically [project.entry-points would be the section and its first line would be narwhals-<library name>

perhaps just write that the section name needs to be the same for all plugins, and they can replace their own library name inside it, for example narwhals-grizzlies = 'narwhals_grizzlies'

Contributor Author

yup, thanks!
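To show how the `narwhals-grizzlies = 'narwhals_grizzlies'` line from the suggestion above maps onto what the host sees, here is a small sketch that builds the equivalent `importlib.metadata.EntryPoint` by hand. The group name `narwhals.plugins` is an assumption for illustration; the thread does not spell out the full section name.

```python
from importlib.metadata import EntryPoint

# Roughly what a plugin's pyproject.toml would declare (group name assumed):
#
#   [project.entry-points.'narwhals.plugins']
#   narwhals-grizzlies = 'narwhals_grizzlies'
#
# The section name is shared by all plugins; only the line inside it changes.
entry = EntryPoint(
    name="narwhals-grizzlies",   # left-hand side of the line
    value="narwhals_grizzlies",  # right-hand side: the importable module
    group="narwhals.plugins",    # hypothetical group shared by every plugin
)

# `entry.load()` would import this module; here we only inspect the metadata.
print(entry.name, entry.module)
```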


## Creating a Plugin

Another option is to write a plugin. Narwhals itself has the necessary utilities to detect and handle
Member

"If it's not possible to add extra functions like __narwhals_namespace__ and others to a dataframe object itself, then another option is to write a plugin"

Contributor Author

thanks, done!

ym-pett and others added 4 commits October 18, 2025 14:59
Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
`is_native` explanation

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
@ym-pett
Contributor Author

ym-pett commented Oct 18, 2025

thanks for the helpful comments Marco :)

Member

@MarcoGorelli MarcoGorelli left a comment

Thanks all!

I'd be happy to go forwards with what we've got here. There shouldn't be any impact on existing Narwhals users, and it opens new doors (which we're still marking as experimental anyway)


Labels

enhancement New feature or request extensibility
