The output char buffer is too small to contain the decoded characters, encoding codepage '65001' #62

Open
dbampersand opened this issue Jul 26, 2024 · 1 comment

Comments

@dbampersand
Contributor

dbampersand commented Jul 26, 2024

When loading a file whose entity lump contains characters with a code point above 127, parsing fails with the error: "The output char buffer is too small to contain the decoded characters, encoding codepage '65001'"

This is because entities aren't actually compiled to UTF-8 (codepage 65001) but to an 8-bit extended-ASCII encoding, one byte per character. For example, if you put Ě (U+011A, UTF-8 bytes 0xC4 0x9A) in an entity key, it compiles to just E, but a character that exists in extended ASCII, like ÿ (0xFF), doesn't get stripped and is written as a single byte.
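A minimal Python sketch of the failure mode, not the library's actual code: the `raw` bytes are a made-up entity fragment, and Latin-1 is only an assumption standing in for whichever 8-bit codepage the map compiler really writes. It shows a lone 0xE7 byte being rejected by a strict UTF-8 decoder while a single-byte decoder reads it as ç.

```python
# Hypothetical entity-lump fragment: one byte per character, with ç stored as 0xE7.
raw = b'"targetname" "fa\xe7ade"'


def decode_entity_text(data: bytes, encoding: str = "latin-1") -> str:
    """Decode entity text with a single-byte codepage.

    Assumption: Latin-1 stands in for the real 8-bit encoding, which the
    issue does not pin down.
    """
    return data.decode(encoding)


# In UTF-8, 0xE7 announces a 3-byte sequence. The next byte here is a plain
# ASCII letter, not a continuation byte, so strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

text = decode_entity_text(raw)

# Ě is two bytes in UTF-8, so it cannot fit in a one-byte-per-character lump.
e_caron_utf8 = "Ě".encode("utf-8")
```

Every byte value maps to some character in Latin-1, so this decoder never raises. The cost is that bytes written under a different codepage would decode to the wrong character without any error.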

See these three TF2 jump maps for examples of this: https://filebin.net/rbh7fcpz4zmdiqo6

For instance, parsing jump_4starters_b1_fix.bsp crashes on the ç character (0xE7): https://i.imgur.com/QT5Xr2R.png

@tsa96
Member

tsa96 commented Jul 26, 2024

Huh, good spot. If this encoding is being used everywhere in BSP (not just the ent lump), we should perhaps just make it a static property of BspFile and use it everywhere. Will have a poke around in the engine at some point to see if there's explicit mention of this.
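A rough sketch of the "one encoding, used everywhere" idea, written in Python purely for illustration. The `BspFile` class below is a hypothetical stand-in rather than the project's real type, `TEXT_ENCODING`, `decode_text`, and `encode_text` are invented names, and Latin-1 is again an assumed placeholder until the engine's actual codepage is confirmed.

```python
class BspFile:
    """Hypothetical stand-in showing a single shared text encoding."""

    # Assumption: Latin-1 as the 8-bit codepage. Change it here and every
    # lump reader/writer that goes through the helpers below follows.
    TEXT_ENCODING = "latin-1"

    @classmethod
    def decode_text(cls, data: bytes) -> str:
        # Drop trailing NUL terminators/padding before decoding.
        return data.rstrip(b"\x00").decode(cls.TEXT_ENCODING)

    @classmethod
    def encode_text(cls, text: str) -> bytes:
        # Characters outside the codepage become "?" instead of raising.
        return text.encode(cls.TEXT_ENCODING, errors="replace")


decoded = BspFile.decode_text(b"fa\xe7ade\x00")
encoded = BspFile.encode_text(decoded)
```

Routing every string read and write through one pair of helpers keeps the entity lump and any other text-bearing lumps consistent. Note that `errors="replace"` substitutes `?`, which differs from the compiler behaviour described in the issue, where Ě is reduced to E.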
