
@eyalleshem eyalleshem commented Oct 27, 2025

This continues the preparation for a borrowed Tokenizer (#2036) and adds more internal functions that borrow strings during tokenization.
This commit also handles places where the Tokenizer could modify the original strings. In such cases, the strategy is to create functions that return Cow<'a, str> and create an owned version of the string when modification is needed.
This commit is rebased on PR #2073 and will remain in draft until #2073 is reviewed.
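
As a rough illustration of the `Cow<'a, str>` strategy described above, here is a minimal sketch. The function name `unescape_single_quoted` and the choice of the `''` escape are assumptions made for the example, not code from this PR.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not from the PR): `body` is the text between the
/// quotes of a single-quoted SQL literal, where `''` stands for one `'`.
fn unescape_single_quoted(body: &str) -> Cow<'_, str> {
    if !body.contains("''") {
        // No escape sequence: hand back a slice of the original input.
        return Cow::Borrowed(body);
    }
    // An escape is present: the text must change, so allocate a new String.
    Cow::Owned(body.replace("''", "'"))
}
```

Calling it with `hello` yields a borrowed slice with no allocation, while `it''s` yields an owned `it's`.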

eyalsatori and others added 2 commits October 23, 2025 20:31
Key points for this commit:
- The Peekable iterator adapter alone isn't sufficient for using string
  slices, as we need the byte indexes (start/end) to create string slices,
  so this commit adds the current byte position to the State struct
  (Note: in the long term we could potentially remove Peekable and use only
  the current position as an iterator)
- Created internal functions that create slices from the original query
  instead of allocating strings, then converted these functions to return
  String to maintain compatibility (the idea is to make a small, reviewable
  commit without changing the Token struct or the parser)
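
The byte-position idea in the key points above could look roughly like the sketch below. This `State` is a simplified, hypothetical stand-in; the field and method names (`source`, `byte_pos`, `advance`, `take_while_borrowed`) are assumptions and may not match the real tokenizer.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Simplified, hypothetical stand-in for the tokenizer's State struct.
struct State<'a> {
    source: &'a str,
    peekable: Peekable<Chars<'a>>,
    /// Byte offset (not char count) of the next character to be consumed.
    byte_pos: usize,
}

impl<'a> State<'a> {
    fn new(source: &'a str) -> Self {
        State { source, peekable: source.chars().peekable(), byte_pos: 0 }
    }

    /// Consume one character and keep the byte offset in sync.
    fn advance(&mut self) -> Option<char> {
        let ch = self.peekable.next()?;
        self.byte_pos += ch.len_utf8();
        Some(ch)
    }

    /// Consume characters while `pred` holds and return them as a slice of
    /// the original input, without allocating.
    fn take_while_borrowed(&mut self, mut pred: impl FnMut(char) -> bool) -> &'a str {
        let source: &'a str = self.source;
        let start = self.byte_pos;
        while let Some(&ch) = self.peekable.peek() {
            if !pred(ch) {
                break;
            }
            self.advance();
        }
        &source[start..self.byte_pos]
    }
}
```

Tracking `len_utf8()` rather than a character count keeps the start and end offsets on valid UTF-8 boundaries, which is what slicing a `&str` requires.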
  Add internal _borrowed() functions that return Cow<'a, str> to prepare for
  zero-copy tokenization. When the source string needs no transformation
  (no escaping), return Cow::Borrowed. When transformation is required,
  return Cow::Owned.

  The Token enum still uses String, so borrowed values are converted via
  to_owned() for now. This maintains API compatibility while preparing the
  codebase for a future refactor where Token can hold borrowed strings.

  Optimized: comments, quoted strings, dollar-quoted strings, quoted identifiers.
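
A minimal sketch of the compatibility layer described in this commit message, assuming a one-variant `Token` enum and hypothetical function names (`single_line_comment_borrowed`, `single_line_comment`). The commit message mentions `to_owned()`; the sketch uses `Cow::into_owned()`, which is an assumption about how the conversion could be written.

```rust
use std::borrow::Cow;

/// Simplified, hypothetical Token that still owns its text.
#[derive(Debug, PartialEq)]
enum Token {
    Comment(String),
}

/// Internal variant: `after_prefix` is the input right after `--`.
/// A single-line comment needs no transformation, so it is always borrowed.
fn single_line_comment_borrowed(after_prefix: &str) -> Cow<'_, str> {
    // The comment runs up to and including the newline, or to end of input.
    let end = after_prefix.find('\n').map_or(after_prefix.len(), |i| i + 1);
    Cow::Borrowed(&after_prefix[..end])
}

/// Compatibility wrapper: converts to an owned String because this Token
/// cannot hold a borrowed slice yet.
fn single_line_comment(after_prefix: &str) -> Token {
    Token::Comment(single_line_comment_borrowed(after_prefix).into_owned())
}
```

With this split, only the wrapper would need to change once `Token` can carry borrowed strings.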
@eyalleshem eyalleshem force-pushed the reduce-string-copies-cow branch from fa0db77 to d92e39a Compare October 27, 2025 10:15

alamb commented Oct 29, 2025

FYI @iffyio and @yoavcloud
