[CI] Migrate to pyproject.toml and use poetry for deterministic builds #201

john-bodley · 2022-07-18T17:11:12Z

Somewhat of a yak shaving exercise as I was initially going to look into #199 though realized that the trino-python-client package has no deterministic dependency management. See the Slack discussion here for more details.

There's a slew of frameworks for handling Python dependencies, including Pipenv, pip-tools, etc.; however, Poetry seems well supported and integrates with pyproject.toml, a replacement for the legacy setup.py. Per #201 (comment), what matters is having a dependency manager rather than any particular dependency manager. If necessary, it should be fairly trivial to migrate later to hatch or similar.

This PR:

Deprecates the setup.py in favor of pyproject.toml.
Adds Poetry for ensuring deterministic dependency management which generates a completely exhaustive frozen set of dependencies in the poetry.lock file.
Updates the GitHub workflows to invoke tox, ensuring a consistent local and remote CI workflow.

to: @bitsondatadev @hashhar

cla-bot · 2022-07-18T17:11:14Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
This is most likely caused by a git client misconfiguration; please make sure to:

Check whether your git client is configured with an email to sign commits: git config --list | grep email
If not, set it up using: git config --global user.email email@example.com
Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails

hashhar · 2022-07-22T06:09:30Z

.github/workflows/ci.yml

-      - name: "Install pre-commit"
-        run: pip install pre-commit
-
-      - name: "Run pre-commit checks"
-        run: pre-commit run --all-files


We generally keep linting and static analysis jobs separate from jobs which run tests. Makes it easier to see what failed.

It looks like now pre-commit runs as part of the test jobs themselves.

@hashhar the reason is to try to adhere to the DRY principle and ensure that local and remote CI are consistent. In this instance rather than explicitly calling out each test/step in the GitHub actions this is now all defined by tox. Any updates to tox.ini will be reflected in the remote CI.

While that is a reasonable benefit, it has its downsides as well: namely, in case of failures you need to go to the CI run logs to see what exactly failed, and unless you know what to search/grep for it's painful.

There are pros and cons with everything. In general, adhering to the KISS principle (simplifying logic and ensuring consistency) likely outweighs the cons. It's worth noting the errors are present in the logs, albeit maybe requiring one to toggle an additional chevron.

hashhar · 2022-07-22T06:11:35Z

.github/workflows/ci.yml

+      - name: Run tox
        run: |
-          pytest -s tests/
+          poetry run tox --parallel


Why is --parallel needed?

@hashhar rather than multiple GitHub jobs this workflow is configured to have only one job and thus from a wall-clock perspective having the relevant tox environments run in parallel is more desirable.

Having a single job makes it harder to inspect which env is failing without going to the actions page and checking the logs. Is there any way we can preserve the current matrix behaviour?

See #201 (comment). I'm really not certain whether the hypothesized concerns are problematic in reality. There's very little friction in introspecting the GitHub workflow logs to determine why CI is failing.

hashhar · 2022-07-22T06:12:06Z

README.md

+
    engine = create_engine("trino://<username>@<host>:<port>/<catalog>/<schema>?access_token=<jwt_token>")
-  
+


Extract no-op changes to a separate commit.

@hashhar this is likely a default from my editor. I'll revert it. Adding the prettier pre-commit hook helps to ensure files are formatted in a consistent manner which long term would prevent these types of no-op changes.

hashhar · 2022-07-22T06:13:17Z

README.md


 ```
-$ pytest tests/unit
+$ poetry run tox -e <environment> -- tests/unit


Is there a way to have a reasonable default so that, when developing locally, contributors don't need to think about which env to use? CI will do the exhaustive tests.

@hashhar local development likely wants to run multiple environments, i.e., py39, pre-commit, etc. Personally having used tox for a number of years I have no issue with explicitly invoking the Python specific environment when running the unit/integration tests.

We should reduce friction for contributors (another benefit being that everyone running tox will see same env unless they specify explicitly). Generally people shouldn't need to care what env they are testing against unless the issue is env specific (then they can provide their own choice of env).

I'm not really sure explicitly specifying an environment is friction. It's worth noting that local CI likely requires multiple environments, i.e., a common workflow could be to first run tox -e pre-commit to ensure linting, typing, etc. is correct before running the test via tox -e py39 say.

One could argue that having these environments (a subset of which can be run via the -e argument) actually reduces friction, as it hides the actual commands from the user, i.e., rather than needing to know that they must run the pytest, pre-commit, etc. commands (along with the relevant flags/arguments), users simply invoke the respective environment through the tox CLI, which standardizes testing workflows.

Perhaps adding some examples like poetry run tox -e py39 -- tests/unit could be a reasonable consensus? Contributors usually follow README first, or we can have a sample local testing workflow described like run pre-commit then running tests etc.
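One possible way to get the "reasonable default" discussed above is a small wrapper that derives the tox environment from the interpreter running it. The sketch below is a hypothetical helper, not part of this PR: the names default_tox_env and tox_command are made up for illustration, and it assumes tox's conventional pyXY environment names (such as the py39 environment mentioned above). It only builds the command and leaves execution to the caller.

```python
import sys
from typing import List, Optional, Sequence


def default_tox_env(version: Optional[Sequence[int]] = None) -> str:
    """Return a tox env name such as 'py39' for a (major, minor) version.

    Hypothetical helper: defaults to the interpreter running this script.
    """
    if version is None:
        version = sys.version_info
    major, minor = version[0], version[1]
    return f"py{major}{minor}"


def tox_command(posargs: Sequence[str] = (), env: Optional[str] = None) -> List[str]:
    """Build the `poetry run tox -e <env> -- <posargs>` command as a list."""
    command = ["poetry", "run", "tox", "-e", env or default_tox_env()]
    if posargs:
        # Everything after `--` is passed through to the test runner by tox.
        command += ["--", *posargs]
    return command


if __name__ == "__main__":
    # Explicit env, matching the example suggested in the review.
    print(" ".join(tox_command(["tests/unit"], env="py39")))
    # No env given: falls back to the current interpreter's version.
    print(" ".join(tox_command(["tests/unit"])))
```

With env="py39" the first call prints poetry run tox -e py39 -- tests/unit; contributors with an environment-specific issue can still pass their own env.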

hashhar · 2022-07-22T06:14:23Z

README.md

-  rm -rf dist/ &&
-  ./setup.py sdist bdist_wheel &&
-  twine upload dist/* &&
+  poetry publish --build &&


Is there any possibility of clearing old state/artifacts for safety?

@hashhar I'm unsure. There's python-poetry/poetry#1329, however it has been open for some time.

Let's keep the rm -rf dist/ in the steps - I verified that Poetry indeed doesn't clean up old artifacts.
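As a rough illustration of the agreed release step (clear dist/ first, then build and publish), here is a minimal Python sketch. The function names and the injectable runner parameter are assumptions made for this example; the only command taken from the thread is poetry publish --build, and the README itself uses shell rather than Python.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def clean_dist(dist_dir: Path) -> None:
    """Remove stale build artifacts; a missing directory is not an error."""
    shutil.rmtree(dist_dir, ignore_errors=True)


def publish(dist_dir: Path = Path("dist"), runner: Callable = subprocess.run) -> None:
    """Clear old artifacts, then build and publish in one Poetry invocation.

    `runner` is injectable (hypothetical design choice) so the step can be
    exercised without actually uploading anything.
    """
    clean_dist(dist_dir)
    runner(["poetry", "publish", "--build"], check=True)
```

Cleaning before the build means only freshly built artifacts are present in dist/ when Poetry uploads.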

hovaesco · 2022-07-26T09:24:31Z

.pre-commit-config.yaml

          - "types-requests"
+
+  - repo: "https://github.com/python-poetry/poetry"
+    rev: "1.2.0b3"


Let's switch to stable release before merging

@hovaesco the reason for including a pre-release is that it exposes dependency groups. It's not clear from python-poetry/poetry#5586 when the 1.2.0 release is scheduled.

Are there any major concerns with using a pre-release in the interim especially if it seems to fulfill our needs?

Pre-releases can always change and impact our current approach. Testing with pre-release makes lots of sense but I would opt to use official release in the project.

hovaesco · 2022-07-26T09:28:55Z

README.md

-$ python3 -m venv .venv
-$ . .venv/bin/activate
-$ pip install .
+$ curl -sSL https://install.python-poetry.org | python3 - --preview


Let's wait for the official release before merging.

See previous comment.

hovaesco · 2022-07-26T09:32:17Z

pyproject.toml

@@ -0,0 +1,66 @@
+[build-system]
+requires = ["poetry-core>=1.1.0b3"]


Should it be 1.2.0b3 as for now?

hovaesco · 2022-07-26T09:33:47Z

pyproject.toml

+
+[tool.poetry]
+name = "trino-python-client"
+version = "0.314.0"


We need some changes in versioning guides in README.md.

@hovaesco could you provide more context about what is required. I merely lifted this from here which doesn't seem to be referenced elsewhere.

Please take a look here

hovaesco · 2022-07-26T09:42:50Z

pyproject.toml

+pytz = "*"
+requests = "*"
+requests_kerberos = { version = "*", optional = true }
+sqlalchemy = { version = "~1.4", optional = true }


It was ~1.3 in setup.py previously, is there any specific reason for that change?

@hovaesco this is precisely the reason for this change, i.e., to ensure deterministic builds. Specifically when this was set to ~1.3 running,

poetry run tox -e py39 -- tests/unit

would result in the following error:

Traceback:
../../../.pyenv/versions/3.9.4/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/unit/sqlalchemy/test_dialect.py:5: in <module>
    from sqlalchemy.engine import make_url
E   ImportError: cannot import name 'make_url' from 'sqlalchemy.engine'

In SQLAlchemy v1.4 the make_url function was exposed in the sqlalchemy.engine module, whereas previously in v1.3 one would need to import it via,

from sqlalchemy.engine.url import make_url

i.e., v1.3 is not viable given how the current code is written and thus the lower bound was wrong. This logic was added in da1441f, which defined the sqlalchemy~=1.3 requirement (note the inclusion of the =, which differs from the Poetry dependency), which, per PEP 440, implies,

For a given release identifier V.N , the compatible release clause is approximately equivalent to the pair of comparison clauses:

>= V.N, == V.*

thus the sqlalchemy~=1.3 requirement is equivalent to sqlalchemy>=1.3, == 1.*. If you look at a GitHub workflow CI run (example) you can see,

Collecting sqlalchemy~=1.3
  Downloading SQLAlchemy-1.4.39-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)

that the CI is installing the latest version (v1.4.39) which adheres to the requirement specification.

Note in Poetry the tilde requirements (which don't include the =) differ slightly, so ~1.3 is equivalent to >= 1.3.0, < 1.4.0.
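The difference between the two operators can be sketched in a few lines of Python. This is an illustrative approximation only: it handles plain numeric release versions (no pre-releases, epochs or local versions), the function names are invented for this example, and real tooling should rely on the resolvers in pip and Poetry rather than on this code.

```python
from typing import Tuple


def parse(version: str, width: int = 3) -> Tuple[int, ...]:
    """Parse 'X.Y[.Z]' into a tuple of ints, zero-padded to `width` parts."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def pep440_compatible(candidate: str, base: str) -> bool:
    """PEP 440 `~=base`: >= base, and equal on all but base's last component."""
    prefix_len = len(base.split(".")) - 1
    if prefix_len < 1:
        raise ValueError("~= needs at least two release components")
    c, b = parse(candidate), parse(base)
    return c >= b and c[:prefix_len] == b[:prefix_len]


def poetry_tilde(candidate: str, base: str) -> bool:
    """Poetry `~base`: >= base, pinned to major.minor (or to major for 'X')."""
    prefix_len = min(len(base.split(".")), 2)
    c, b = parse(candidate), parse(base)
    return c >= b and c[:prefix_len] == b[:prefix_len]


if __name__ == "__main__":
    # setup.py style: sqlalchemy~=1.3 admits the 1.4.39 release CI installed.
    print(pep440_compatible("1.4.39", "1.3"))
    # Poetry style: ~1.3 stays below 1.4.0, so 1.4.39 is rejected.
    print(poetry_tilde("1.4.39", "1.3"))
    # Poetry style: ~1.4, as declared in this PR, admits 1.4.39.
    print(poetry_tilde("1.4.39", "1.4"))
```

Under these simplified rules, 1.4.39 satisfies ~=1.3 but not Poetry's ~1.3, which is consistent with the import error only surfacing once the constraint was expressed in pyproject.toml.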

I see, thanks for the detailed explanation.

john-bodley · 2022-07-27T17:10:40Z

Thanks @hashhar and @hovaesco for the review. I’ve hopefully answered/addressed all your comments/questions.

I opted to downgrade the version of Poetry (from a pre-release to the latest stable version)—albeit it being less feature rich—to unblock the PR.

hovaesco · 2022-08-01T12:04:16Z

.github/workflows/ci.yml

-      - name: "Install pre-commit"
-        run: pip install pre-commit
-
-      - name: "Run pre-commit checks"
-        run: pre-commit run --all-files
-


If those steps are removed, then I assume the checks job is redundant. The outstanding steps of that job should be moved to the build job.

However, it would be nice to still execute pre-commit checks in checks job. Then build could be run only if checks job is successful. It will reduce runners usage.

@hovaesco is it necessary to split out the checks vs the build? I agree that one could argue that pre-commit performs pre-flight checks, however i) it does mean the DRY principle isn't really adhered to, and ii) you could argue (per the name) that these should actually be first run locally (alongside all the other local CI checks) at the time of the commit. This is actually the most efficient use of resources and results in faster iteration.

You cannot control behaviour of people who submit PRs. It's still useful to let people submit PRs in an imperfect shape and guide them to the finish line so that their next contribution is easier.

Also keeping the checks split has two benefits:

Less wall-clock time for each check.

Easy to see at a glance what failed with zero additional clicks (that's also why a matrix was used instead of tox for the version based tests)

I agree with the arguments given by @hashhar that CI should still have those jobs split out; it will also reduce runner usage.

hovaesco

One general comment, could you please squash commits (if possible) and rename commit messages?

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from f7bec24 to 7489e74 Compare July 18, 2022 17:39

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from b7ddd49 to 2855fda Compare July 18, 2022 18:14

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 2855fda to 82deefb Compare July 18, 2022 18:16

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 82deefb to 2a376c6 Compare July 18, 2022 20:00

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 2a376c6 to 030fe75 Compare July 19, 2022 06:43

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from c3830b4 to 39f857c Compare July 21, 2022 00:38

john-bodley requested review from ebyhr, hashhar and mdesmet July 22, 2022 05:29

hashhar reviewed Jul 22, 2022

hovaesco reviewed Jul 26, 2022

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from ac8dd71 to 83938b0 Compare July 26, 2022 16:29

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 83938b0 to 4932a79 Compare July 26, 2022 17:10

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 4932a79 to d2437ef Compare July 26, 2022 17:32

john-bodley requested review from hashhar and hovaesco July 27, 2022 02:42

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from d2437ef to 8cca8e3 Compare July 27, 2022 16:28

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 8cca8e3 to f3e84eb Compare July 27, 2022 16:36

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from f3e84eb to 70cb0b1 Compare July 27, 2022 16:59

hovaesco reviewed Aug 1, 2022

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 70cb0b1 to 3d2464c Compare August 31, 2022 15:59

cla-bot bot added the cla-signed label Aug 31, 2022

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 3d2464c to f97d3da Compare August 31, 2022 16:03

cla-bot bot removed the cla-signed label Aug 31, 2022

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from f97d3da to 1226d39 Compare August 31, 2022 16:06

cla-bot bot added the cla-signed label Aug 31, 2022

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 1226d39 to 4545d7c Compare August 31, 2022 16:23

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch 3 times, most recently from f4d114e to 11858f7 Compare October 19, 2022 17:11

[CI] Migrate to pyproject.toml and poetry for deterministic builds

ecc5a3b

john-bodley force-pushed the john-bodley--pyproject-plus-poetry branch from 11858f7 to ecc5a3b Compare October 19, 2022 17:27

ebyhr removed their request for review January 31, 2023 04:16

