Korean Ordinal TN support #286

Merged
tbartley94 merged 14 commits into NVIDIA:ko_tn_staging_v1 from bbae0312:koordinals
Jul 14, 2025

@bbae0312

What does this PR do ?

This PR adds support for Korean ordinal number text normalization.

Included:

  • Ordinal tagger FST (ordinal.py)
  • Verbalizer logic (verbalize.py)
  • Unit tests:
    • Pytest: test_ordinal.py, test_cases_ordinal.txt
    • Sparrowhawk: test_sparrowhawk_normalization.sh
  • Updated tokenizer and classifier FSTs

Before your PR is "Ready for review"

Pre checks:

  • [ X] Have you signed your commits? Use git commit -s to sign.
  • [ X] Do all unittests finish successfully before sending PR?
    1. pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • [ X] If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here.
  • [ X] Have you added __init__.py for every folder and subfolder, including data folder which has .TSV files?
  • [ X] Have you followed codeQL results and removed unused variables and imports (report is at the bottom of the PR in github review box) ?
  • [ X] Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • [ X] If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • [ X] Remove import guards (try import: ... except: ...) if not already done.
  • [ X] If you added a new language or a new feature please update the NeMo documentation (lives in different repo).
  • [ X] Have you added your language support to tools/text_processing_deployment/pynini_export.py.

PR Type:

  • [ X] New Feature
  • Bugfix
  • Documentation
  • Test

If you haven't finished some of the above items you can still open "Draft" PR.

.gitignore Outdated
.hydra/
nemo_experiments/
*.swp
*.far
Collaborator

let's not edit these files unless strictly necessary

Author

Got it — I’ve removed the *.far line.

@mgrafu mgrafu changed the base branch from main to ko_tn_staging_v1 June 11, 2025 18:20
8 여덟
9 아홉
10 열
11 열한
Collaborator

for numbers that build from existing ones, let's use rules instead (it seems that 12 == 10 + 2, for example, and this is repeated up to 39)

Collaborator

also, if there is overlap between these characters and cardinal, it is important to leverage one class to develop the other
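
The rule-based approach suggested above can be sketched roughly as follows. This is a pure-Python illustration, not pynini and not the code in this PR; the names `UNITS`, `TENS`, and `native_prenominal` are hypothetical, and the special prenominal form for a bare 20 is an assumption about standard Korean usage rather than something shown in this thread.

```python
# Pure-Python sketch (not pynini); all names here are hypothetical.
# Prenominal native Korean forms, as used before a counter such as 번째.
UNITS = {1: "한", 2: "두", 3: "세", 4: "네", 5: "다섯",
         6: "여섯", 7: "일곱", 8: "여덟", 9: "아홉"}
# Tens as they appear when a unit follows (or alone for 10 and 30).
TENS = {1: "열", 2: "스물", 3: "서른"}


def native_prenominal(n: int) -> str:
    """Compose the prenominal native Korean numeral for 1..39 from tens + units."""
    if not 1 <= n <= 39:
        raise ValueError("native composition is only sketched for 1..39")
    if n == 20:
        return "스무"  # assumption: a bare 20 takes this prenominal form
    tens, unit = divmod(n, 10)
    # A missing tens or unit part contributes nothing, e.g. 10 -> "열", 8 -> "여덟".
    return TENS.get(tens, "") + UNITS.get(unit, "")
```

With this, the 39 rows of a lookup table reduce to 9 unit entries, 3 tens entries, and one exception, which is the kind of reuse the comment asks for.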


graph_ordinal_1to39 = pynini.string_file(get_abs_path("data/ordinal/digit_1to39.tsv")) + pynini.accep("번째")

graph_cardinal = cardinal.just_cardinals + pynini.accep("번째")
Collaborator

great use of the cardinal graph! let's rename the variable, since it doesn't just represent cardinals

Author

Renamed it to graph_ordinal_from40!

graph_cardinal = cardinal.just_cardinals + pynini.accep("번째")

graph_ordinal = (
pynutil.add_weight(graph_ordinal_1to39, 0.1) | pynutil.add_weight(graph_cardinal, 1.0)
Collaborator

let's pick a single way to normalize numbers 1 through 39 and add an exception instead, keeping the weights untouched

Collaborator

if one of the two is correct, pick that one. otherwise, pick whichever is most common
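
One way to read the suggestion above is to make the two branches cover disjoint input ranges, so that no weights are needed to choose between them. The sketch below is pure Python rather than an FST, and every name in it is hypothetical. It assumes the 1-39 range uses native Korean forms with 1 handled as an explicit exception, and that 40 and above falls back to Sino-Korean cardinals; it only covers up to 99 to stay short.

```python
# Pure-Python sketch; names are hypothetical and this is not the PR's FST code.
NATIVE_UNITS = ["", "한", "두", "세", "네", "다섯", "여섯", "일곱", "여덟", "아홉"]
NATIVE_TENS = ["", "열", "스물", "서른"]
SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SUFFIX = "번째"


def native_1_to_39(n: int) -> str:
    """Native Korean branch, with explicit exceptions instead of weights."""
    if n == 1:
        return "첫"  # exception: 1번째 -> 첫번째
    if n == 20:
        return "스무"  # assumption: a bare 20 takes this prenominal form
    tens, unit = divmod(n, 10)
    return NATIVE_TENS[tens] + NATIVE_UNITS[unit]


def sino_40_to_99(n: int) -> str:
    """Sino-Korean branch, e.g. 70 -> 칠십 (range limited for brevity)."""
    tens, unit = divmod(n, 10)
    return SINO_DIGITS[tens] + "십" + SINO_DIGITS[unit]


def verbalize_ordinal(n: int) -> str:
    """Each input matches exactly one branch, so no tie-breaking is needed."""
    if 1 <= n <= 39:
        return native_1_to_39(n) + SUFFIX
    if 40 <= n <= 99:
        return sino_40_to_99(n) + SUFFIX
    raise ValueError("this sketch only covers 1..99")
```

In FST terms, the equivalent would be restricting the domain of the cardinal-based branch so it no longer accepts 1 through 39, rather than letting both branches accept those inputs and ranking them by weight.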

overwrite_cache: set to True to overwrite .far files
"""

def __init__(self, cache_dir: str = None, overwrite_cache: bool = False):
Collaborator

do you need or use this graph elsewhere?

Author

You're right — I'll go ahead and remove it

@@ -0,0 +1,20 @@
1번째~첫번째
Collaborator

can you add some test cases for your different 1 through 39 graph?

Author

I will add some test cases for 1 through 39 graph!
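
For reference, the test-case line shown in the diff above uses a `written~spoken` layout. A minimal sketch of reading such lines is given below. The helper name `parse_test_cases` is hypothetical and is not the repository's own loader, and the second sample line is an illustrative guess at what a 1-39 case might look like, not a line taken from the PR.

```python
from typing import Iterable, List, Tuple


def parse_test_cases(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Split each non-empty 'written~spoken' line into an (input, expected) pair."""
    cases = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        written, spoken = line.split("~", 1)
        cases.append((written, spoken))
    return cases


# First line is from the diff above; the second is a hypothetical example.
SAMPLE = ["1번째~첫번째", "", "11번째~열한번째"]
CASES = parse_test_cases(SAMPLE)
```

Each resulting pair can then be fed to a parameterized test that normalizes the first element and compares it with the second.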

bbae0312 and others added 4 commits June 11, 2025 14:22
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
…feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
@bbae0312 bbae0312 marked this pull request as ready for review June 11, 2025 23:04
.gitignore Outdated
.hydra/
nemo_experiments/
*.swp
*.swp No newline at end of file
Collaborator

can we make sure that this file doesn't show up in the PR at all?

Author

I will delete this file in the PR!

@bbae0312 bbae0312 force-pushed the koordinals branch 2 times, most recently from 5313873 to a49a969 June 16, 2025 18:45
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
…SV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
@github-actions

github-actions bot commented Jul 2, 2025

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 2, 2025
@tbartley94 tbartley94 self-requested a review July 3, 2025 15:51
Member

@tbartley94 tbartley94 left a comment

add additional text files, remove the tars and revert gitignore

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
@github-actions github-actions bot removed the Stale label Jul 4, 2025
@tbartley94
Member

@bbae0312 please refrain from force pushes

Member

@tbartley94 tbartley94 left a comment

LGTM. Have all tests passed?

7 칠십
8 팔십
9 구십
9 구십 No newline at end of file
Member

? Did you change encoding?

Author

I didn’t change the encoding, but not sure why it’s being detected like that. I’ll double-check just in case.

Author

and all tests passed!

Member

@tbartley94 tbartley94 left a comment

LGTM

@tbartley94 tbartley94 merged commit aeaa781 into NVIDIA:ko_tn_staging_v1 Jul 14, 2025
2 of 3 checks passed
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Feb 27, 2026
* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
tbartley94 pushed a commit that referenced this pull request Mar 4, 2026
* Add Korean TN support for cardinal numbers and postprocessing (#285)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add __init__.py to ko/data directory

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update KO_TN_CACHE to trigger Korean CI run

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean Ordinal TN support (#286)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN Decimal Support (#303)

* feat(ko/decimal): add Korean decimal TN support

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat(ko): Add fraction tagger and verbalizer with tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko): Update decimal and fraction taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Date and Time (#316)

* feat(ko/date): Add date TN taggers, verbalizers, test cases, and post-processing fixes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/date): update date tagger and sparrowhawk test

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date TN fixes & cleanup

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Add Time tagger/verbalizer + tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date — strict YYYY for delimited formats; define single-year 1–4 digit behavior

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Money and Telephone (#324)

* feat(ko/money): Korean Money TN only; add data & tests; wire tagger/verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/money): polish tagger/verbalizer & expand tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: add Telephone TN (tagger+verbalizer) + wire + tests; include money/test updates

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: refactor money/telephone taggers & verbalizers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko/money: use NEMO_NOT_QUOTE, lowercase space helper, trim mid optimizes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: update money/telephone taggers and telephone verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* ko: update telephone taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Measure and Electronic (#353)

* Add: Korean Measure & Electronic TN (taggers, verbalizers, tests, data)

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update KO electronic & measure taggers/verbalizers and test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Edited as per review feedback

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN fixes: cardinal, decimal, fraction, date

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add ko electronic extensions and improve electronic/telephone normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN issues and update test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN electronic and post-processing issues

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN spacing and electronic/cardinal handling

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix optional token separator and remove redundant whitespace normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Remove unused KO post_processing and update exporter

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add native counting support for number+counter in Korean TN

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
* Add Korean TN support for cardinal numbers and postprocessing (NVIDIA#285)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add __init__.py to ko/data directory

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update KO_TN_CACHE to trigger Korean CI run

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean Ordinal TN support (NVIDIA#286)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN Decimal Support (NVIDIA#303)

* feat(ko/decimal): add Korean decimal TN support

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat(ko): Add fraction tagger and verbalizer with tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko): Update decimal and fraction taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Date and Time (NVIDIA#316)

* feat(ko/date): Add date TN taggers, verbalizers, test cases, and post-processing fixes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/date): update date tagger and sparrowhawk test

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date TN fixes & cleanup

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Add Time tagger/verbalizer + tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date — strict YYYY for delimited formats; define single-year 1–4 digit behavior

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Money and Telephone (NVIDIA#324)

* feat(ko/money): Korean Money TN only; add data & tests; wire tagger/verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/money): polish tagger/verbalizer & expand tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: add Telephone TN (tagger+verbalizer) + wire + tests; include money/test updates

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: refactor money/telephone taggers & verbalizers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko/money: use NEMO_NOT_QUOTE, lowercase space helper, trim mid optimizes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: update money/telephone taggers and telephone verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* ko: update telephone taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Measure and Electronic (NVIDIA#353)

* Add: Korean Measure & Electronic TN (taggers, verbalizers, tests, data)

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update KO electronic & measure taggers/verbalizers and test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Edited as per review feedback

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN fixes: cardinal, decimal, fraction, date

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add ko electronic extensions and improve electronic/telephone normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN issues and update test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN electronic and post-processing issues

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN spacing and electronic/cardinal handling

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix optional token separator and remove redundant whitespace normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Remove unused KO post_processing and update exporter

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add native counting support for number+counter in Korean TN

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 4, 2026
* Add Korean TN support for cardinal numbers and postprocessing (NVIDIA#285)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add __init__.py to ko/data directory

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update KO_TN_CACHE to trigger Korean CI run

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean Ordinal TN support (NVIDIA#286)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN Decimal Support (NVIDIA#303)

* feat(ko/decimal): add Korean decimal TN support

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat(ko): Add fraction tagger and verbalizer with tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko): Update decimal and fraction taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Date and Time (NVIDIA#316)

* feat(ko/date): Add date TN taggers, verbalizers, test cases, and post-processing fixes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/date): update date tagger and sparrowhawk test

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date TN fixes & cleanup

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Add Time tagger/verbalizer + tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date — strict YYYY for delimited formats; define single-year 1–4 digit behavior

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Money and Telephone (NVIDIA#324)

* feat(ko/money): Korean Money TN only; add data & tests; wire tagger/verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/money): polish tagger/verbalizer & expand tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: add Telephone TN (tagger+verbalizer) + wire + tests; include money/test updates

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: refactor money/telephone taggers & verbalizers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko/money: use NEMO_NOT_QUOTE, lowercase space helper, trim mid optimizes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: update money/telephone taggers and telephone verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* ko: update telephone taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Measure and Electronic (NVIDIA#353)

* Add: Korean Measure & Electronic TN (taggers, verbalizers, tests, data)

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update KO electronic & measure taggers/verbalizers and test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Edited as per review feedback

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN fixes: cardinal, decimal, fraction, date

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add ko electronic extensions and improve electronic/telephone normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN issues and update test cases

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN electronic and post-processing issues

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN spacing and electronic/cardinal handling

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix optional token separator and remove redundant whitespace normalization

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Remove unused KO post_processing and update exporter

Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add native counting support for number+counter in Korean TN

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
Signed-off-by: Jinwoo Bae <34386414+bbae0312@users.noreply.github.com>
Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>
bbae0312 added a commit to bbae0312/NeMo-text-processing that referenced this pull request Mar 5, 2026
* Add Korean TN support for cardinal numbers and postprocessing (NVIDIA#285)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add __init__.py to ko/data directory

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update KO_TN_CACHE to trigger Korean CI run

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean Ordinal TN support (NVIDIA#286)

* Add Korean TN support for cardinal numbers and postprocessing

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor Korean TN cardinal and postprocessing logic based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add Korean Ordinal TN logic and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Refactor ordinal logic (1-39, 40+) and add word tagger and verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add support for 0 in ordinal tagger

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Update ordinal.py to exclude digit 1 in code and remove unnecessary TSV file

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove .far files

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/ordinal): update ordinal FST based on review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN Decimal Support (NVIDIA#303)

* feat(ko/decimal): add Korean decimal TN support

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat(ko): Add fraction tagger and verbalizer with tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko): Update decimal and fraction taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Date and Time (NVIDIA#316)

* feat(ko/date): Add date TN taggers, verbalizers, test cases, and post-processing fixes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/date): update date tagger and sparrowhawk test

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date TN fixes & cleanup

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Add Time tagger/verbalizer + tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko(TN): Date — strict YYYY for delimited formats; define single-year 1–4 digit behavior

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Money and Telephone (NVIDIA#324)

* feat(ko/money): Korean Money TN only; add data & tests; wire tagger/verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(ko/money): polish tagger/verbalizer & expand tests

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: add Telephone TN (tagger+verbalizer) + wire + tests; include money/test updates

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: refactor money/telephone taggers & verbalizers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko/money: use NEMO_NOT_QUOTE, lowercase space helper, trim mid optimizes

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ko: update money/telephone taggers and telephone verbalizer

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* ko: update telephone taggers

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN for Measure and Electronic (NVIDIA#353)

* Add: Korean Measure & Electronic TN (taggers, verbalizers, tests, data)

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update KO electronic & measure taggers/verbalizers and test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Edited as per review feedback

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Korean TN fixes: cardinal, decimal, fraction, date

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add ko electronic extensions and improve electronic/telephone normalization

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN issues and update test cases

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN electronic and post-processing issues

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix Korean TN spacing and electronic/cardinal handling

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Fix optional token separator and remove redundant whitespace normalization

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Remove unused KO post_processing and update exporter

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* Add native counting support for number+counter in Korean TN

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jinwoo Bae <bbae7050@gmail.com>

3 participants