kernel/linux.git/fs/unicode/utf8data.c_shipped, branch v6.18.21

Revert "unicode: Don't special case ignorable code points"

2024-12-11T22:11:23+00:00

This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91.

It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior.

So now you can't look up those names, because they hash to something different.

Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug.

Reported-by: Qi Han
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586
Cc: Gabriel Krisman Bertazi
Requested-by: Jaegeuk Kim
Signed-off-by: Linus Torvalds
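
The lookup failure described in this revert can be illustrated with a small user-space sketch. This is not kernel code: it assumes U+00AD SOFT HYPHEN as the example ignorable code point, uses Python's str.casefold() in place of the kernel's utf8 casefolding, and uses SHA-256 as a stand-in for a filesystem's on-disk name hash. All function names are hypothetical.

```python
import hashlib

# Assumption: U+00AD SOFT HYPHEN stands in for the whole class of
# ignorable code points; the kernel's real tables cover many more.
SOFT_HYPHEN = "\u00ad"


def fold_dropping_ignorables(name: str) -> str:
    """Toy model of the old behavior: ignorable code points are dropped."""
    return "".join(ch for ch in name if ch != SOFT_HYPHEN).casefold()


def fold_keeping_ignorables(name: str) -> str:
    """Toy model of the reverted change: ignorables fold to themselves."""
    return name.casefold()


def name_hash(folded: str) -> str:
    """Stand-in for an on-disk directory-entry hash (not the real one)."""
    return hashlib.sha256(folded.encode("utf-8")).hexdigest()


# A name created while the old folding was in effect.
name = "Read" + SOFT_HYPHEN + "Me"
old_hash = name_hash(fold_dropping_ignorables(name))  # what is stored on disk
new_hash = name_hash(fold_keeping_ignorables(name))   # what a later lookup computes

if old_hash != new_hash:
    print("lookup would miss: the stored hash and the computed hash differ")
```

In this toy model the same byte string produces two different folded forms, so a hash stored under one folding rule cannot be found by a lookup that uses the other. The same reasoning applies in reverse to files created while the newer logic was in effect.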

Merge tag 'unicode-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode

2024-11-23T04:50:55+00:00

Pull unicode updates from Gabriel Krisman Bertazi:

 - constify a read-only struct (Thomas Weißschuh)

 - fix the error path of unicode_load, avoiding a possible kernel oops if it fails to find the unicode module (André Almeida)

 - documentation fix, updating a filename in the README (Gan Jie)

 - add the link of my tree to MAINTAINERS (André Almeida)

* tag 'unicode-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  MAINTAINERS: Add Unicode tree
  unicode: change the reference of database file
  unicode: Fix utf8_load() error path
  unicode: constify utf8 data table

unicode: Don't special case ignorable code points

2024-10-09T17:34:01+00:00

We don't need to handle them separately. Instead, just let them decompose/casefold to themselves.

Signed-off-by: Gabriel Krisman Bertazi

unicode: constify utf8 data table

2024-08-13T19:21:50+00:00

All users already handle the table as const data. Move the table itself into .rodata to guard against accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh
Reviewed-by: Christoph Hellwig
Link: https://lore.kernel.org/r/20240809-unicode-const-v1-1-69968a258092@weissschuh.net
Signed-off-by: Gabriel Krisman Bertazi

unicode: add MODULE_DESCRIPTION() macros

2024-06-20T23:30:02+00:00

Currently 'make W=1' reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/unicode/utf8data.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/unicode/utf8-selftest.o

Add a MODULE_DESCRIPTION() to utf8-selftest.c and utf8data.c_shipped, and update mkutf8data.c to add a MODULE_DESCRIPTION() to any future generated utf8data file.

Signed-off-by: Jeff Johnson
Link: https://lore.kernel.org/r/20240524-md-unicode-v1-1-e2727ce8574d@quicinc.com
Signed-off-by: Gabriel Krisman Bertazi

unicode: Add utf8-data module

2021-10-12T14:41:39+00:00

utf8data.h contains a large database table which is an auto-generated decodification trie for the unicode normalization functions. Allow building it into a separate module.

Based on a patch from Shreeya Patel.

Signed-off-by: Christoph Hellwig
Signed-off-by: Gabriel Krisman Bertazi