New PEG-based Parser #555

Open
wants to merge 1,809 commits into
base: dev

tinyAdapter
Member

Summary

This PR implements a new WebGAL parser based on a parsing expression grammar (PEG). Compared with the current non-standard, string-based parsing, a grammar-based parser opens the door to optimizations, more advanced syntax, better error handling, and more. The new parser aims to be fully compatible with the old one.

Features (If Implemented Correctly)

  • 100% backward compatibility
  • More intelligent parsing behaviors
  • Error reporting and recovery for command strings with syntax errors

Backward compatibility

The new parser also generates a sentenceList with all fields available in the old parser. It may add some extra fields (e.g., recording errors and some utility command string information), but the result should work on the current engine.

New Behavior for Syntax Errors

If the parser now encounters a syntax error while parsing a command, it stops at the offending character and skips to the next line, but the part of the command that was already parsed still takes effect.

For example, if a script contains

changeFigure:stand.png -left -next;
pixiInit:; // this line has syntax error. `pixiInit` should not have ':'
setAnimation:enter-from-left -target=fig-left -next;

it will then be parsed as

changeFigure:stand.png -left -next;
pixiInit
setAnimation:enter-from-left -target=fig-left -next;

which preserves the behavior of pixiInit.
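The recovery behavior in this example can be illustrated with a small hand-written sketch. It is not the PEG grammar from this PR: the `parseScript` function, the `ParsedSentence` and `ParseError` shapes, and the `NO_CONTENT_COMMANDS` list are all hypothetical names chosen for illustration, and the line splitting is deliberately naive. It only shows the idea of recording an error position, keeping the command, and continuing with the next line.

```typescript
// All names below are hypothetical; this is not the PEG grammar from this PR.
interface ParseError {
  line: number;   // 1-based line number
  column: number; // 1-based column of the offending character
  message: string;
}

interface ParsedSentence {
  command: string;
  content: string;
  args: string[];
}

// Assumption: commands that take no content. Only pixiInit is confirmed by the example above.
const NO_CONTENT_COMMANDS: string[] = ["pixiInit"];

function parseScript(script: string): { sentenceList: ParsedSentence[]; errors: ParseError[] } {
  const sentenceList: ParsedSentence[] = [];
  const errors: ParseError[] = [];

  script.split("\n").forEach((rawLine, index) => {
    // Everything from the first ';' onward (terminator and trailing comment) is ignored.
    const semi = rawLine.indexOf(";");
    const body = (semi === -1 ? rawLine : rawLine.slice(0, semi)).trim();
    if (body === "") return;

    const colon = body.indexOf(":");
    const command = colon === -1 ? body : body.slice(0, colon);

    if (colon !== -1 && NO_CONTENT_COMMANDS.indexOf(command) !== -1) {
      // Recovery: record where parsing stopped, keep the command, move on to the next line.
      errors.push({
        line: index + 1,
        column: rawLine.indexOf(":") + 1,
        message: "'" + command + "' does not take ':'",
      });
      sentenceList.push({ command, content: "", args: [] });
      return;
    }

    // Naive split: content first, then arguments introduced by " -".
    const rest = colon === -1 ? "" : body.slice(colon + 1);
    const parts = rest.split(" -");
    sentenceList.push({ command, content: parts[0] ?? "", args: parts.slice(1) });
  });

  return { sentenceList, errors };
}
```

Run on the three-line script above, this sketch yields three sentences and a single error pointing at the ':' on line 2, so `pixiInit` still takes effect while the mistake remains reportable.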

Error Reporting and Recovery

The new parser supports error reporting and recovery. All errors in the script are recorded in an errors field, which can be shown to the user once Language Server Protocol (LSP) support is built on top of it.

The new syntax-error behavior described above is what makes error recovery possible.

Changes to Source Code

The source code of the old parser has been moved to packages/parser_legacy, and its package name has been changed to webgal-parser-legacy.

The source code of the new parser lives in packages/parser. It is distributed under the MPL-2.0 license (license file attached).

What's Next

  • Minify the generated parser. The generated PEG parser is currently ~200KB, which may become an issue for web deployment. The compiled parser.js can be minified to ~100KB, and gzip on the web server should reduce the transferred size further.
  • Migrate post-processing logic. To ensure backward compatibility, the raw content of some parsed fields is preserved so that the post-processing still works. This results in unnecessary double parsing. We may need to migrate the post-processing logic to utilize the parsed fields directly.

@MakinoharaShoko, feel free to change the merge base branch :)

MakinoharaShoko and others added 30 commits January 20, 2024 20:57
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.1 to 4.5.2.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v4.5.2/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v4.5.2/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <[email protected]>
…yarn/follow-redirects-1.15.4

build(deps): bump follow-redirects from 1.15.3 to 1.15.4
MakinoharaShoko and others added 26 commits October 15, 2024 20:31
…yarn/body-parser-1.20.3

build(deps): bump body-parser from 1.20.2 to 1.20.3
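1. Align all characters to the bottom.
2. Fix the line height so that a line is rendered at the same height whether or not it carries pinyin.
A "line" here is a logical line, which may still wrap in practice. The font size has also been adjusted slightly.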
…-say-script

Fix say script to replace spaces with \u00a0
…ored in the state table when loading a save; An issue where performances with ID-based sound effects were not completely cleared after stopping playback; Deduplication of performance lists in the state table and performance controller upon insertion
Process line break: force-multiline & tests
@MakinoharaShoko
Member

After review, the following issues exist in the new parser:

  1. Failure to utilize ADD_NEXT_ARG_LIST to add the next parameter for statements that require it automatically.
  2. Incorrect timing for resource preloading. It should occur after parsing.
  3. The SceneParser lacks sufficient type hinting. The parsing results should conform to the IScene type.
  4. The parsing results lack the necessary fields sceneName and sceneUrl.
  5. Failure to call assetsSetter, resulting in script files not being converted to the correct paths as expected.
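Items 1 and 4 of this list could be handled in a post-processing step along the following lines. This is only a sketch under stated assumptions: the `Arg`, `Sentence`, and `Scene` interfaces are simplified stand-ins rather than the real WebGAL types, `finalizeScene` is a hypothetical helper, and the list of commands that receive an implicit `next` is passed in by the caller because its actual contents are not given here. Preloading and `assetsSetter` (items 2 and 5) are not covered.

```typescript
// Simplified stand-in types; the real IScene and sentence types have more fields.
interface Arg {
  key: string;
  value: string | boolean | number;
}

interface Sentence {
  command: string;
  content: string;
  args: Arg[];
}

interface Scene {
  sceneName: string;
  sceneUrl: string;
  sentenceList: Sentence[];
}

// Hypothetical helper: wraps raw parser output into a scene object and
// appends `next` to every command named in addNextArgList.
function finalizeScene(
  sentenceList: Sentence[],
  sceneName: string,
  sceneUrl: string,
  addNextArgList: string[],
): Scene {
  const processed = sentenceList.map((sentence) => {
    const needsNext = addNextArgList.indexOf(sentence.command) !== -1;
    const hasNext = sentence.args.some((arg) => arg.key === "next");
    // Copy instead of mutating so the raw parser output stays comparable.
    const args = needsNext && !hasNext
      ? sentence.args.concat([{ key: "next", value: true }])
      : sentence.args.slice();
    return { command: sentence.command, content: sentence.content, args };
  });
  return { sceneName, sceneUrl, sentenceList: processed };
}
```

Keeping this step separate from the grammar means the same wrapper can be applied to the output of either parser, which makes their results easier to compare.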

I have reverted the default exported SceneParser class in index.ts back to the old parser. I believe the following steps can be taken to align the new parser with the previous logic:

  1. Write a longer scene that covers as many statements and syntax variations as possible, and compare the differences in the parsing results between the new and old parsers. For this use case, the goal should be to achieve complete parity between the parsing results of the new and old parsers.
  2. For all test cases, first switch the parser used to the old parser. Then check whether failing test cases are due to issues within the old parser itself, or due to inconsistencies between the expected results of the test cases and the correct parsing results of the old parser.
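The comparison in step 1 could be automated with a small diff helper such as the one below. It is a sketch, not part of this PR: `findParityDiffs` is a hypothetical name, and it assumes both parsers return plain JSON-like data. Every field present in the old parser's output must appear with the same value in the new parser's output, while extra fields added by the new parser (for example an errors field) are ignored.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns one message per difference; an empty array means the new output
// contains everything the legacy output does.
function findParityDiffs(legacy: unknown, candidate: unknown, path: string = "$"): string[] {
  const diffs: string[] = [];

  if (Array.isArray(legacy)) {
    if (!Array.isArray(candidate)) return [path + ": expected an array"];
    if (legacy.length !== candidate.length) {
      diffs.push(path + ": length " + legacy.length + " vs " + candidate.length);
    }
    const shared = Math.min(legacy.length, candidate.length);
    for (let i = 0; i < shared; i++) {
      diffs.push(...findParityDiffs(legacy[i], candidate[i], path + "[" + i + "]"));
    }
    return diffs;
  }

  if (isPlainObject(legacy)) {
    if (!isPlainObject(candidate)) return [path + ": expected an object"];
    // Only keys of the legacy output are checked; extra keys in candidate are allowed.
    for (const key of Object.keys(legacy)) {
      if (!(key in candidate)) {
        diffs.push(path + "." + key + ": missing in new parser output");
      } else {
        diffs.push(...findParityDiffs(legacy[key], candidate[key], path + "." + key));
      }
    }
    return diffs;
  }

  if (legacy !== candidate) {
    diffs.push(path + ": " + JSON.stringify(legacy) + " vs " + JSON.stringify(candidate));
  }
  return diffs;
}
```

Feeding the same long scene to both parsers and asserting that the result is an empty list gives a single parity check that can run alongside the existing test cases.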

I believe these temporary imperfections are not difficult to resolve. Our primary goal is to ensure consistency between the parsing results of the new and old parsers. Once this is achieved, we can leverage the enhanced error detection capabilities of the new parser.

@MakinoharaShoko
Member

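Because the WebGAL repository was not well maintained in the past, it has been cleaned up with bfg. As a result, you need to re-sync the dev branch and cherry-pick the commits of this PR onto a branch freshly created from dev. This PR contains a large number of commits, which makes that tedious, so I suggest the following instead:
1. Copy the parser-related directories and any files that may have been modified into a temporary folder.
2. Fully sync the dev branch with upstream.
3. Create a new branch from dev.
4. Copy the files back from the temporary folder and make a single commit.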
