[EWT-1250] Sqlalchemy/Superset expectations from python-DBAPI #17
wherobots/db/connection.py
    schema = reader.schema
    columns = schema.names
    column_types = [field.type for field in schema]
    rows = reader.read_all().to_pandas().values.tolist()
Do you need read_all().to_pandas(), or can you call read_pandas() directly? (To be fair, according to the docs it does the same thing.)
Good catch! Using read_pandas() now for better readability.
        
          
wherobots/db/driver.py
    },
    headers=headers,
    if ws_url:
        session_uri = ws_url
A couple of things here:
- This can be done earlier: if we feel confident in using the ws_url, we can do that right after checking the token/API keys.
- This needs slightly different logging, since we're not requesting a new runtime (see line 70).
- This should return early so that we don't have to indent everything afterwards, which makes it more readable.
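The early-return shape suggested above could look roughly like the sketch below. It is only an illustration under assumptions: the helper name resolve_session_uri, the request_runtime callable, and the log messages are hypothetical stand-ins, not the driver's actual code.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def resolve_session_uri(
    ws_url: Optional[str],
    request_runtime: Callable[[], str],
) -> str:
    """Return the WebSocket URI to connect to.

    Hypothetical helper: `request_runtime` stands in for whatever call
    asks the service for a new runtime and returns its session URI.
    """
    if ws_url:
        # No runtime is requested on this path, so the log message
        # differs from the "requesting a new runtime" one.
        logger.info("Using provided session URL %s", ws_url)
        return ws_url  # early return keeps the code below unindented

    logger.info("Requesting a new runtime")
    return request_runtime()
```

With this shape, the runtime-request path stays at the top indentation level and is never entered when a ws_url is supplied.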
    results_format: Union[ResultsFormat, None] = None,
    data_compression: Union[DataCompression, None] = None,
    geometry_representation: Union[GeometryRepresentation, None] = None,
    ws_url: str = None,
Why is this required, instead of calling connect_direct() directly?
The reason this parameter wasn't included in connect() is that it creates ambiguity between the rest of the parameters (like runtime/region) and the runtime you'd actually connect to when providing a ws_url, which may not match those choices.
    query.handler(json.loads(result_bytes.decode("utf-8")))
    data = json.loads(result_bytes.decode("utf-8"))
    columns = data["columns"]
    column_types = data.get("column_types")
Is column_types optional? If so, then it's good that you're using data.get() here, but then in Cursor.__get_results you expect column_types to be non-None. You either need to ensure column_types is always provided, or change __get_results to be more defensive.
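One way to make the handling defensive is sketched below. It assumes, purely for illustration, that a missing column_types should fall back to one None per column and that a length mismatch is an error; the helper name parse_result is hypothetical and not part of the driver.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def parse_result(result_bytes: bytes) -> Tuple[List[str], List[Optional[str]]]:
    """Decode a JSON result payload into (columns, column_types).

    Assumption: "column_types" may be absent, in which case every
    column gets a None type so downstream code never sees a bare None.
    """
    data: Dict[str, Any] = json.loads(result_bytes.decode("utf-8"))
    columns = data["columns"]
    column_types = data.get("column_types")

    if column_types is None:
        # Payload did not carry type information: one placeholder per column.
        column_types = [None] * len(columns)
    elif len(column_types) != len(columns):
        raise ValueError(
            f"got {len(column_types)} column types for {len(columns)} columns"
        )
    return columns, column_types
```

The alternative is to treat column_types as mandatory and index it with data["column_types"], so a missing key fails loudly at the parsing step instead of later in the cursor.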
    True,  # null_ok; Assuming all columns can accept NULL values
    )
    for col_name in result.columns
    for i, col_name in enumerate(columns)
Use https://docs.python.org/3/library/functions.html#zip to avoid jumping through hoops with an index (it's much nicer to read, and also more efficient):
self.__description = [
  (
    col_name,
    _TYPE_MAP.get(col_type, 'STRING'),
    ...
  )
  for (col_name, col_type) in zip(columns, column_types)
]
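For reference, a self-contained version of the zip-based approach might look like the sketch below. The contents of _TYPE_MAP and the helper name build_description are assumptions made for illustration; the seven-item tuple layout (name, type_code, display_size, internal_size, precision, scale, null_ok) follows PEP 249.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical mapping from result type names to DB-API type codes;
# the real mapping in the driver may differ.
_TYPE_MAP = {"int64": "NUMBER", "double": "NUMBER", "string": "STRING"}


def build_description(
    columns: Sequence[str],
    column_types: Sequence[Optional[str]],
) -> List[Tuple]:
    """Build a PEP 249 cursor description, one 7-item tuple per column."""
    return [
        (
            col_name,
            _TYPE_MAP.get(col_type, "STRING"),  # unknown or None falls back to STRING
            None,  # display_size
            None,  # internal_size
            None,  # precision
            None,  # scale
            True,  # null_ok; assuming all columns can accept NULL values
        )
        for col_name, col_type in zip(columns, column_types)
    ]
```

Note that zip stops at the shorter input, so a column_types list that is shorter than columns silently drops columns; validating the lengths beforehand avoids that.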
This PR introduces the following requirements that have arisen from the Wherobots x Superset integration:
- Row object
- rollback() and commit() to be implemented. Other OLAP database drivers such as pyhive simply "pass" in the unimplemented rollback() and commit() methods. For context: Superset's background processes often bypass the SQLAlchemy dialect and interact directly with the DBAPI. This is why overriding the rollback and commit methods in the Dialect doesn't suffice.
- ws_url, to connection. This helps maintain a static connection pool configuration in Superset.
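The no-op commit() and rollback() described above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the Connection class here is a stand-in with no other behavior. PEP 249 states that modules without transaction support should implement commit() with void functionality, and it marks rollback() as optional.

```python
class Connection:
    """Minimal sketch of a DB-API connection for a backend without transactions."""

    def commit(self) -> None:
        # Nothing to commit: statements take effect as they run.
        # Defined so callers that invoke it unconditionally do not fail.
        pass

    def rollback(self) -> None:
        # Nothing to roll back either. Defined for the same reason,
        # since callers may reach the DB-API connection directly
        # rather than going through the SQLAlchemy dialect.
        pass
```

Defining the methods on the connection itself covers callers that never go through the dialect's overrides.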