isTraditional and similar functions misjudge Traditional Chinese characters #46

Open
SoftBlackSheep opened this issue Jan 5, 2024 · 0 comments

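When the dictionary files are loaded, every entry is split into individual characters and all of them are added to tSet. However, not every character in a Traditional phrase in the dictionary is itself a Traditional character, so many Simplified characters end up in the set, and checks against it return false positives.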

public Set<String> tChars() {
    // DLC (double-checked locking): make sure initialization happens only once
    if(CollectionUtil.isNotEmpty(tSet)) {
        return tSet;
    }

    if(CollectionUtil.isEmpty(tSet)) {
        synchronized (tSet) {
            // DLC
            if(CollectionUtil.isEmpty(tSet)) {
                // Traditional => Simplified, phrases
                Map<String, List<String>> tsPhrase = this.tsPhrase();
                this.addCharToSet(tSet, tsPhrase.keySet());

                // Traditional => Simplified, single characters
                Map<String, List<String>> tsChar = this.tsChar();
                this.addCharToSet(tSet, tsChar.keySet());

                // Simplified => Traditional, phrases
                Map<String, List<String>> stPhrase = this.stPhrase();
                for(Map.Entry<String, List<String>> entry : stPhrase.entrySet()) {
                    this.addCharToSet(tSet, entry.getValue());
                }

                // Simplified => Traditional, single characters
                Map<String, List<String>> stChar = this.stChar();
                for(Map.Entry<String, List<String>> entry : stChar.entrySet()) {
                    this.addCharToSet(tSet, entry.getValue());
                }

                // Plain-text dictionary
                List<String> tcLines = StreamUtil.readAllLines("/data/dictionary/tc.txt");
                for(String line : tcLines) {
                    tSet.addAll(StringUtil.toCharStringList(line));
                }
            }
        }
    }

    return tSet;
}
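
To illustrate the problem and one possible direction for a fix, here is a minimal standalone sketch. It is not the library's code: the class `TraditionalCharFilter` and its methods are hypothetical, and it assumes the Traditional => Simplified character map has the shape "Traditional character -> list of Simplified forms". The idea is to keep a character only when that map lists a Simplified form that differs from the character itself, instead of adding every character of every phrase.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the library.
class TraditionalCharFilter {

    // Splits text into single-character strings, by code point so that
    // characters outside the BMP are not cut in half.
    static List<String> toChars(String text) {
        List<String> chars = new ArrayList<>();
        text.codePoints().forEach(cp -> chars.add(new String(Character.toChars(cp))));
        return chars;
    }

    // Mirrors the behavior described in this issue: every character of
    // every phrase is added, including characters shared by both scripts.
    static Set<String> naiveChars(Collection<String> phrases) {
        Set<String> result = new HashSet<>();
        for (String phrase : phrases) {
            result.addAll(toChars(phrase));
        }
        return result;
    }

    // Keeps a character only if tsChar (assumed: Traditional char ->
    // Simplified forms) lists at least one form different from itself.
    static Set<String> filteredChars(Collection<String> phrases,
                                     Map<String, List<String>> tsChar) {
        Set<String> result = new HashSet<>();
        for (String phrase : phrases) {
            for (String ch : toChars(phrase)) {
                List<String> simplified = tsChar.get(ch);
                if (simplified != null
                        && simplified.stream().anyMatch(s -> !s.equals(ch))) {
                    result.add(ch);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy data: "中" is written the same way in both scripts.
        Map<String, List<String>> tsChar = new HashMap<>();
        tsChar.put("國", Collections.singletonList("国"));
        List<String> phrases = Collections.singletonList("中國");

        System.out.println(naiveChars(phrases).contains("中"));            // true
        System.out.println(filteredChars(phrases, tsChar).contains("中")); // false
        System.out.println(filteredChars(phrases, tsChar).contains("國")); // true
    }
}
```

With the toy data, the naive set contains "中" even though it is identical in Simplified Chinese, which is the kind of misjudgment reported here; the filtered set keeps only "國". Whether this filter is right for characters that map to several forms (including themselves) is a design decision for the maintainers.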