Skip to content

Conversation

@sarahlwelton
Copy link
Contributor

Remove Latin.yml (Google Version).

Create new Latin.yml (Couchbase Version).

Add style guide page to explain.

Remove Latin abbreviation flagging from Terminology.yml to replace with Latin.yml.

…Couchbase Version). Add style guide page to explain. Remove Latin abbreviation flagging from Terminology.yml to replace with Latin.yml.
action:
name: replace
swap:
'(?i)\b(?:e\s*\.?\s*g\s*\.?)': for example
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will that also catch e.g. "egg" (because there's no \b after the 'g'?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these regexes are quite hard to read.
Could we perhaps:

  • skip the \s* before the dots, they're not really correct anyway
  • skip the non-capturing ?:
  • make the options explicit
  • don't need to worry about the final ., as that'll trigger \b anyway
  • consider more explicit alternatives, like \b(eg|e[.] ?g)\b where you at least see eg together at one point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being a little lazy and just used Mr. Gippity for this. Try the cleaned up versions - "egg" will no longer be flagged.

@sarahlwelton sarahlwelton requested a review from osfameron October 1, 2025 15:35
swap:
'(?i)\be*\.?g\.?\b': for example
'(?i)\bi\.?e\.?\b': specifically
'(?i)\betc\.?|etc\.?|e\.t\.c\b': and so on
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you have alternatives | then you'll need to group them in parens, otherwise you'll get the \b only associated with the first and last alternative

action:
name: replace
swap:
'(?i)\be*\.?g\.?\b': for example
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e* isn't what we want, it will match the empty string '' and 'eeee' etc.

I think this is correct now 😓

* parens needed to group alternatives
* `scope: raw` needed to process "i. e" (otherwise dot-space combo starts a new sentence scope)
* if we didn't care about the dot-space usage, we might be able to simplify by removing `nonword: true`?
* I didn't bother checking for multiple spaces, because the kind of person who writes "i.   e ."
isn't going to be controlled by mere Vale rules.

This is the kind of nitty gritty stuff that the test cases would be useful to debug!
I'll get my PR updated and merged in after you merge this one 👍
@osfameron osfameron self-requested a review October 2, 2025 12:10
Copy link
Collaborator

@osfameron osfameron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Don't worry about the failing test - that's because I had to add the workflow to main in order to test it in branch.
Once you're merged in, I'll update my PR to fix all tests, and merge it finally.

@sarahlwelton sarahlwelton merged commit d12fb6d into main Oct 2, 2025
1 of 2 checks passed
@sarahlwelton sarahlwelton deleted the DOC-13594-latin-updates branch October 2, 2025 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants