feat: introduce (experimental) plugin system #2978
Conversation
@ym-pett did you know about Marco's PR? Apologies if you've discussed this privately 😅
thanks for starting this! 😄
@dangotbanned yeah we'd discussed this
My hope is that we can use Daft as the way to develop the plugin system, and it can serve as a reference implementation
Here's what we're aiming for: if a user has narwhals-daft installed, then they should be able to run
import narwhals as nw
import daft
df_native = daft.from_pydict({"a": [1, 2, 3], "b": [4, 5, 6]})
df = nw.from_native(df_native)
result = df.select("a", nw.col("b") * nw.col("a"))
print(result.collect())
This needs to be done in a way that won't be specific to Daft, so that anyone can register their own plugin without Narwhals having any knowledge about it. In this PR it's currently all Daft-specific
The packaging docs around entry-points might be useful here:
I'll also cc @camriddell into the conversation, as IIRC he'd also thought about pluggable backends
For prior art on plugins and entry-points, I think https://github.com/PyCQA/flake8 might also be good to look at
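For readers new to entry points, the discovery side could be sketched roughly as follows. The group name `narwhals.plugins` and the helper `discover_plugins` are assumptions made up for illustration; nothing in this thread fixes either of them.

```python
import sys
from importlib.metadata import entry_points


def discover_plugins(group="narwhals.plugins"):
    """Return the entry points registered under `group` (an assumed group name)."""
    if sys.version_info >= (3, 10):
        # Python 3.10+ can filter entry points by keyword.
        return list(entry_points(group=group))
    # Python 3.9: entry_points() returns a dict keyed by group name.
    return list(entry_points().get(group, []))


for ep in discover_plugins():
    # ep.load() would import the module (or object) the entry point names.
    print(ep.name, "->", ep.value)
```

Because the lookup goes through installed package metadata, the consuming library needs no prior knowledge of which plugins exist.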
thanks both, I'll revert the current changes - I feel like I had to go down the wrong route first to see what this actually consists of! :) I can now go through the materials armed with more background! 🦾
oops, didn't mean to close this! will reopen when I push new content!
based on flake8 example - will try to get something more sensible in next
narwhals/translate.py
Outdated
for plugin in discovered_plugins:
    obj = plugin.load()
    frame = obj.dataframe.DaftLazyFrame

    # from obj.dataframe import DaftLazyFrame
    try:
        df_compliant = frame(native_object, version=Version.MAIN)
        return df_compliant.to_narwhals()
    except:
        # try the next plugin
        continue
ooh, nice! does this work?
to make it not daft-specific, perhaps we could aim to have something like
try:
    df_compliant = obj.from_native(native_object, version=Version.MAIN)
?
This would mean making a top-level function in narwhals-daft too, and then we document that plugin authors are expected to implement this
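A plugin-agnostic version of that loop could look something like the sketch below. Everything here is a stand-in: the plugin objects are `SimpleNamespace` fakes rather than modules returned by `plugin.load()`, the version is a plain string instead of `Version.MAIN`, and catching only `TypeError` (rather than a bare `except`) is an illustrative choice, not what the PR does.

```python
from types import SimpleNamespace


def from_native_via_plugins(native_object, plugins, version="main"):
    """Try each loaded plugin in turn and return the first successful wrap."""
    for plugin in plugins:
        try:
            return plugin.from_native(native_object, version=version)
        except TypeError:
            # This plugin does not recognise the object; try the next one.
            continue
    msg = f"No plugin could handle object of type {type(native_object)}"
    raise TypeError(msg)


# Hypothetical stand-ins for two plugin modules.
def _list_only(obj, version):
    if not isinstance(obj, list):
        raise TypeError("not a list")
    return ("list-frame", obj, version)


def _dict_only(obj, version):
    if not isinstance(obj, dict):
        raise TypeError("not a dict")
    return ("dict-frame", obj, version)


plugins = [
    SimpleNamespace(from_native=_list_only),
    SimpleNamespace(from_native=_dict_only),
]
result = from_native_via_plugins({"a": [1, 2, 3]}, plugins)
print(result)
```

The only thing the loop relies on is that each plugin exposes a top-level `from_native`, which is the contract plugin authors would need documented.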
I have a feeling I'm doing something weird with how things are imported. Maybe it's that the top-level __init__ file needs altering in daft-narwhals.
with your suggestion, t.py no longer works and I get the error
TypeError: Expected pandas-like dataframe, Polars dataframe, or Polars lazyframe, got: <class 'daft.dataframe.dataframe.DataFrame'>
that dataframe.dataframe looks weird to me... would you expect that structure?
the type of plugin is <class 'importlib.metadata.EntryPoint'>, so I figured I had to load the module via that, for obj I then get the type <class 'module'>
the only way I could get access to the DaftLazyFrame (haven't tried simple LazyFrame yet) was by assigning it to a variable name, I couldn't do something like from obj.dataframe import DaftLazyFrame (the error then is ModuleNotFoundError: No module named 'obj')
I think at the moment this all leaves us too bound to daft, and I bet I'm breaking a million coding conventions, eek!
I suspect I need to do a better job at exposing the modules within daft-nw but I haven't found how yet
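The `ModuleNotFoundError` above comes from how the import statement works: it looks up a module literally named `obj` and ignores the local variable of that name. A small sketch with a stdlib package standing in for the plugin (the entry point name and group are made up for illustration):

```python
from importlib.metadata import EntryPoint

# Hypothetical entry point pointing at a stdlib package instead of a real plugin.
plugin = EntryPoint(name="demo", value="json", group="hypothetical.group")
obj = plugin.load()  # a module object, here the `json` package

# The import statement resolves the *name* "obj", not the variable `obj`.
try:
    exec("from obj.decoder import JSONDecoder")
except ModuleNotFoundError as exc:
    error_message = str(exc)

# Attribute access on the loaded module object works instead.
decoder_cls = obj.decoder.JSONDecoder
same_cls = getattr(obj, "decoder").JSONDecoder
```

So reaching into the loaded module via attributes (or calling a function it exposes at top level) is the expected route, rather than a sign of broken imports.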
perhaps in narwhals_daft/__init__.py you could make a from_native function, and then here use obj.from_native?
nice, will try that!
Should plugin detection come before we check our own written backends? That way if someone really wanted to override our pandas backend they would be empowered to?
🤔 yeah maybe
ah it's so nice to see only altair and shiny errors left! 🌞 my issue was that locally I wasn't seeing the cause of the plugin tests failing ( A big thank you to @camriddell, we worked through this together and it was such a fun session too! |
thanks all!
just some minor comments, but then from my end I'd be happy with this change
you'll just need to add some docs to the docs/extending.md page, and then (unless there's objections) I think we can ship it?
@@ -0,0 +1,77 @@
from __future__ import annotations

from typing import TYPE_CHECKING, Any, TypeAlias
I don't think they've implemented this one astral-sh/ruff#2302
I've opened #3203 to take care of this
tests/plugins_test.py
Outdated
import narwhals as nw


@pytest.mark.skipif(sys.version_info < (3, 10), reason="3.10+ required for entrypoints")
is this still true? I think the changes suggested by Dan should make it runnable on Python 3.9 too?
you're right, taking out the skip.. line & tests still run fine
if you fetch and merge upstream/main and then run |
may I ask for advice: if I take out the Is the only solution here to construct a DictFrame class & mark all the methods as not_implemented, or am I complicating things unnecessarily here as I'm maybe doing something wrong with the imports in the first place? |
I think it's just that you're using if TYPE_CHECKING:
from typing_extensions import Self, TypeAlias
from narwhals import LazyFrame # noqa: F401
DictFrame: TypeAlias = dict[str, list[Any]] |
`is_native` must receive a native object and return a boolean indicating whether the native object is
a dataframe of the plugin library.
for the __narwhals_namespace__ section, I'm not sure if the from_native description needs to be explicitly given, since it's in the protocol. I'm guessing experienced contributors might know this function already and it's a bit like saying 'wet water'?
I think it's fine to include it, a bit of repetition in docs is OK IMO
i.e. one that complies with the CompliantNamespace protocol. This protocol specifies a `from_native`
function, whose input parameter is the Narwhals version and which returns a compliant Narwhals LazyFrame
which wraps the native dataframe.
maybe 'bare-bones' will be legacy soon, so maybe we should say "in progress"?
We love open source, but we're not "open source absolutists". If you're unable to open
-source you library, then this is how you can make your library compatible with Narwhals.
+source your library, then this is how you can make your library compatible with Narwhals.
I'm not sure about the 'also' on the line below. I read this as telling the reader some stuff X needs to be defined, and then we're reminding them to along with that stuff X also define the list that follows. In which case I'm left to look for a description for stuff X.
Maybe I'm misreading?
If we're wanting to say 'along with whatever you want your extension to do, you'll have to implement these', maybe something like the following could be an alternative to the line:
"Along with your use-case, make sure that you also define:"
I think you can remove "also", i probably wrote it by accident
really nice, thanks! just some very minor comments left, but i think the docs additions are at the right level for the intended audience
docs/extending.md
Outdated
The first line needs to be the same for all plugins, whereas the second is to be adapted to the
library name.
this may be confusing, as technically [project.entry-points would be the section and its first line would be narwhals-<library name>
perhaps just write that the section name needs to be the same for all plugins, and they can replace their own library name inside it, for example narwhals-grizzlies = 'narwhals_grizzlies'
yup, thanks!
docs/extending.md
Outdated

## Creating a Plugin

Another option is to write a plugin. Narwhals itself has the necessary utilities to detect and handle
"If it's not possible to add extra functions like __narwhals_namespace__ and others to a dataframe object itself, then another option is to write a plugin"
thanks, done!
Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
`is_native` explanation Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
thanks for the helpful comments Marco :)
Thanks all!
I'd be happy to go forwards with what we've got here. There shouldn't be any impact on existing Narwhals users, and it opens new doors (which we're still marking as experimental anyway)
What type of PR is this? (check all applicable)
Related
plugins
uno reverse card ym-pett/narwhals#2
CompliantNamespace.is_native #3130
Checklist
If you have comments or can explain your changes, please do so below
started trying to adapt nw so that it can read daft dataframes, but I might be barking up the wrong tree: I installed daft in the repo so that it would be available to nw. I have a feeling we want to avoid that, and I'm working as if I were adding an actual _daft module rather than prep for a plugin.
note to self: if installing daft was correct, need to add this to an install file, think it's pyproject.toml