- Make sure that you have `conda` installed. Recommend this article if on Mac; just do through step 2.
- Create and activate a new conda environment, e.g., `transformers-api` with Python 3.12.
conda create --name transformers-api python=3.12
conda activate transformers-api
- Run `which pip` and `which python` to verify that your `python` and `pip` binaries are coming from your `conda` virtual environment. Note that the order in which you install conda vs. pip matters for setting virtual env priorities.
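The `which` check above can also be done from inside Python. This is a minimal sketch, assuming conda has exported `CONDA_PREFIX` for the active environment; `interpreter_in_env` is a hypothetical helper name, not something the repo provides.

```python
import os
import sys
from pathlib import Path


def interpreter_in_env(executable: str, env_prefix: str) -> bool:
    """Return True if `executable` is located under the `env_prefix` directory."""
    # abspath normalizes the path without following symlinks.
    exe = Path(os.path.abspath(executable))
    prefix = Path(os.path.abspath(env_prefix))
    return prefix in exe.parents


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix is None:
        print("CONDA_PREFIX is not set; no conda environment appears to be active.")
    else:
        ok = interpreter_in_env(sys.executable, prefix)
        print(f"{sys.executable} inside {prefix}: {ok}")
```

If the script reports `False`, the `python` on your `PATH` is probably not the one from the conda environment.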
Getting Started Locally (Start here if not using conda; just make sure you have the right versions of python and pip installed)
- Install `poetry` version 1.8.5: `pip install poetry==1.8.5` (or use `pipx`, per the link here, if you prefer isolated envs and you don't have `conda` managing your env).
- Create and enter the virtual environment: `poetry shell`. Note: if you use conda, this step may not be necessary.
- Install the dependencies: `poetry install`.
- In `c3po-model-server/app/core/env_var`, create a `secrets.env` file and ensure it is listed in `.gitignore`. Add the following for local dev:
MM_TOKEN="<your_preprod_mattermost_token>"
- Launch postgres, pgadmin, and minio via docker-compose: `docker-compose up --build`.
- Visit `localhost:9001`. Log in with user: `miniouser` and password: `minioadmin`. This is the minio console.
- Visit `localhost:5050`. Log in with email: `user@test.com` and password: `admin`. This is the pgadmin console. See the notes below for important details.
- Run the app db init script: `./scripts/init.sh`.
- Keeping your docker containers running, start the app in a new terminal (activate your conda env first) with `ENVIRONMENT=local uvicorn app.main:versioned_app --reload`.
- Open `localhost:8000/v1/docs` and start interacting with swagger!
- Run tests and get coverage with `ENVIRONMENT=local pytest --cov`, and get html reports for VS Code Live Server (or any server) with `ENVIRONMENT=local pytest --cov --cov-report=html:coverage_re`.
- You can shut down and your db / minio data will persist via docker volumes.
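Before launching the app, it can help to confirm that `secrets.env` actually defines `MM_TOKEN`. The sketch below is only a pre-flight check and is not how the app itself loads the file; `parse_env_text` and `missing_keys` are hypothetical helper names, and the plain `KEY=value` parsing is an assumption about the file's format.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(text: str, required=("MM_TOKEN",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env_text(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Path from the steps above, relative to the repo root (assumed cwd).
    env_file = Path("app/core/env_var/secrets.env")
    if env_file.exists():
        print("missing keys:", missing_keys(env_file.read_text()))
    else:
        print(f"{env_file} not found")
```

An empty list means every required key has a non-empty value.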
Note: instructions included in tutorial linked here
- Add the package, e.g., `poetry add transformers` or `poetry add transformers --group <group_name>`, where `<group_name>` is the dependency group name, e.g., `test` or `dev`.
- Update the lockfile with `poetry lock`, or `poetry lock --no-update` if you don't want poetry to try to update other deps within your existing versioning constraints.
- Install the packages with `poetry install`; exclude certain groups if desired by adding `--without <group_name>`.
`poetry update`, or for a specific package, `poetry update transformers`
- You will see that `POSTGRES_SERVER=localhost` in the above steps; however, make sure that you log in with hostname `db` in pgAdmin (under the "connection" tab in server properties). This is because the pgAdmin container is launched in the same docker network as the postgres container, so it uses the service name, whereas launching this app from the command line uses port forwarding to localhost. The user, password, and db name will all be `postgres`, port `5432`.
- We specify `ENVIRONMENT=local` explicitly because the default has to remain the test stage's variables, which the test stage relies on.
- For basic CRUD, you can follow this format:
from .base import CRUDBase
from app.models.item import Item
from app.schemas.item import ItemCreate, ItemUpdate
item = CRUDBase[Item, ItemCreate, ItemUpdate](Item)
- The `env_vars` for `minio` in P1 say secure False, but that is only because the intra-namespace comms between pods get automatically mTLS encrypted via istio, so they keep `http://minio.minio:9000` as the URL inside the namespace.
- `aiohttp` is a subdep of `langchain`; however, do not use it for handling web connections, as there are disputed CVEs in that context (disputed as in not official, but it is possible that the risk exists). See issues here: aio-libs/aiohttp#6772 and https://github.com/aio-libs/aiohttp/issues/7208
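To illustrate why the CRUD format in the notes above takes three type parameters plus the model class, here is a self-contained sketch. It is a hypothetical in-memory stand-in, not the repo's `CRUDBase` (which presumably talks to the database); the `title` field, the dataclasses, and the method names are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Generic, Optional, Type, TypeVar

ModelT = TypeVar("ModelT")
CreateT = TypeVar("CreateT")
UpdateT = TypeVar("UpdateT")


class CRUDBase(Generic[ModelT, CreateT, UpdateT]):
    """Hypothetical in-memory stand-in for a database-backed CRUD helper."""

    def __init__(self, model: Type[ModelT]) -> None:
        self.model = model
        self._rows: dict[int, ModelT] = {}
        self._next_id = 1

    def create(self, obj_in: CreateT) -> ModelT:
        row = self.model(id=self._next_id, **asdict(obj_in))
        self._rows[self._next_id] = row
        self._next_id += 1
        return row

    def get(self, id: int) -> Optional[ModelT]:
        return self._rows.get(id)

    def update(self, id: int, obj_in: UpdateT) -> ModelT:
        row = self._rows[id]
        for field, value in asdict(obj_in).items():
            if value is not None:  # None means "leave this field unchanged"
                setattr(row, field, value)
        return row

    def remove(self, id: int) -> Optional[ModelT]:
        return self._rows.pop(id, None)


@dataclass
class Item:  # stands in for app.models.item.Item
    id: int
    title: str


@dataclass
class ItemCreate:  # stands in for app.schemas.item.ItemCreate
    title: str


@dataclass
class ItemUpdate:  # stands in for app.schemas.item.ItemUpdate
    title: Optional[str] = None


# Same shape as the one-liner in the notes: type parameters, then the model class.
item = CRUDBase[Item, ItemCreate, ItemUpdate](Item)
created = item.create(ItemCreate(title="first"))
item.update(created.id, ItemUpdate(title="renamed"))
```

The type parameters only inform type checkers; the model class passed to the constructor is what the helper uses at runtime to build rows.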
CVEs can usually be addressed by updating to a patched release, determining that the finding is a false positive, or switching to a different package. Inside of P1, if there is a fix and the CVE is low-threat, you can request a whitelist and wait for the official version. If that does not work, you can request that git be installed in the pipeline's pip install runner and use pip install with a specific commit that contains the patch. For example, before 4.30.0 was released, this transformers CVE could be patched via
pip install git+https://github.com/huggingface/transformers.git@80ca92470938bbcc348e2d9cf4734c7c25cb1c43#egg=transformers
and adding
transformers @ git+https://github.com/huggingface/transformers.git@80ca92470938bbcc348e2d9cf4734c7c25cb1c43
to the requirements.txt in place of the previous transformers installation.
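The requirements.txt swap described above can be sketched in a few lines. This is an illustration under stated assumptions: `git_requirement` and `replace_pin` are hypothetical helper names, and the `torch` and `transformers` versions in the usage example are made up.

```python
import re

_SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")


def git_requirement(package: str, repo_url: str, commit: str) -> str:
    """Build a direct-reference requirement pinned to one git commit."""
    if not _SHA_RE.match(commit):
        raise ValueError(f"not a hex commit hash: {commit!r}")
    return f"{package} @ git+{repo_url}@{commit}"


def replace_pin(requirements: str, package: str, new_line: str) -> str:
    """Swap the line for `package` with `new_line`, leaving other lines alone."""
    out = []
    for line in requirements.splitlines():
        # The package name is everything before the first specifier character.
        name = re.split(r"[=<>~! @;\[]", line.strip(), maxsplit=1)[0]
        out.append(new_line if name.lower() == package.lower() else line)
    return "\n".join(out)


line = git_requirement(
    "transformers",
    "https://github.com/huggingface/transformers.git",
    "80ca92470938bbcc348e2d9cf4734c7c25cb1c43",
)
# Illustrative input; these version numbers are not taken from the repo.
patched = replace_pin("torch==2.0.1\ntransformers==4.29.2", "transformers", line)
```

Validating the hash up front guards against pinning to a branch name by accident, which would defeat the purpose of pinning to a specific patched commit.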
- Tutorial followed-ish for this repo
- Install conda and tensorflow on Mac M1
- `pipenv` with `conda`
- Basics of `pipenv` for application dependency management
- Conda and pipenv cheat sheet
- How to use pre-commit framework for git hooks
- P1 uses pip for environment setup; locally, both poetry and pip are acceptable
- However, ppg-common broke the pre-commit hook that keeps the poetry and pip requirements in sync
- Process for environment updates:
  - Update poetry: `poetry add package==version`
  - Sync with pip: `./hooks/output-requirements-txt.sh`
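Since the hook that kept the two in sync is broken, a quick manual check after running the sync script can catch drift. The sketch below assumes requirements.txt contains `name==version` lines (optionally followed by environment markers); `pinned_versions` and `is_synced` are hypothetical helpers, not part of the repo, and the packages in the usage example are made up.

```python
import re

# name, optional [extras], then an exact pin; stops at whitespace, ';' or '\'.
_PIN_RE = re.compile(r"^([A-Za-z0-9._-]+)(?:\[[^\]]*\])?==([^\s;\\]+)")


def _normalize(name: str) -> str:
    """Normalize a package name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_versions(requirements: str) -> dict[str, str]:
    """Map normalized package name to pinned version for each name==version line."""
    pins: dict[str, str] = {}
    for line in requirements.splitlines():
        match = _PIN_RE.match(line.strip())
        if match:
            pins[_normalize(match.group(1))] = match.group(2)
    return pins


def is_synced(requirements: str, package: str, version: str) -> bool:
    """Return True if requirements pins `package` to exactly `version`."""
    return pinned_versions(requirements).get(_normalize(package)) == version


# Illustrative content; not taken from the repo's requirements.txt.
reqs = 'fastapi==0.100.0 ; python_version >= "3.10"\n    --hash=sha256:abc\nSQLAlchemy_Utils==0.41.1\n'
print(is_synced(reqs, "sqlalchemy-utils", "0.41.1"))
```

Call `is_synced` with the same package and version you passed to `poetry add`; a `False` result suggests the sync script has not been run or did not pick up the change.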
Logs for the deployed application can be viewed on ArgoCD
- click the application
- view in tree, network, or list modes (see icons next to "Log out" at the top right)
- click the ellipsis to the right of the desired pod
- click Logs
In general, tensorflow and pytorch use the underlying unittest framework that comes stock with Python. However, FastAPI has a ton of great features through pytest that make testing HTTP much, much easier. Good news is that, for the most part, pytest as the runner will also handle unittest, so we can use the TF or pytorch frameworks with unittest and FastAPI with pytest. Some articles on this:
- FastAPI testing
- Tensorflow testing
- Pytest handling unittest
- Mocking in pytest--especially import location
- Better mocking in pytest walkthrough
- Test coverage using `coverage`
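As a small illustration of the point above that pytest, as the runner, also handles unittest, the two test styles can sit side by side in one file. The `add` function and the test names are hypothetical.

```python
import unittest


def add(a: int, b: int) -> int:
    """Toy function under test."""
    return a + b


class AddUnittestStyle(unittest.TestCase):
    """unittest-style test: a TestCase subclass using assert* methods."""

    def test_add(self):
        self.assertEqual(add(2, 3), 5)


def test_add_pytest_style():
    """pytest-style test: a plain function with a bare assert."""
    assert add(2, 3) == 5
```

Saved as, say, `test_add.py`, running `pytest` should collect both tests, while `python -m unittest` would run only the `TestCase` class.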
- Storing Credentials...or just type `git config --global credential.helper store`
- Create a GPG Key or GPG Commit Signing or GitHub Docs
- Deleting Volumes
- Setting up pgAdmin in Docker
- Setting up postgreSQL for FastAPI in docker
- Full FastAPI / postgres / docker tutorial