Change in Polish encodings of udhr corpus reader #3038

hamiltonianflow · 2022-08-25T00:49:11Z

Modifies the UdhrCorpusReader class to change the encoding used for the Polish language files Polish-Latin2 and Polish_Polski-Latin2 from ISO-8859-2 to cp1250, so that they are read properly.

Fixes #3037
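For illustration, here is a minimal sketch of why the codec matters. It uses only the Python standard library, not the NLTK reader itself. The sample phrase and the `decode_with` helper are assumptions made for this example and are not part of the PR. Several Polish letters (for instance ą, ś, ź) sit at different byte values in cp1250 and ISO-8859-2, so decoding cp1250 bytes as ISO-8859-2 raises no error but quietly yields the wrong characters.

```python
# Sample Polish phrase chosen for illustration (an assumption, not read from the corpus).
SAMPLE = "Wszyscy ludzie rodzą się wolni"


def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with the named codec (hypothetical helper)."""
    return raw.decode(encoding)


# Simulate a file that was actually saved as cp1250.
raw = SAMPLE.encode("cp1250")

as_cp1250 = decode_with(raw, "cp1250")      # the encoding this PR switches to
as_latin2 = decode_with(raw, "iso-8859-2")  # the encoding used before this PR

print(as_cp1250 == SAMPLE)  # True: round-trips correctly
print(as_latin2 == SAMPLE)  # False: "ą" (0xB9 in cp1250) is decoded as "š"
```

Letters such as ę, ł and ż share the same byte value in both encodings, which is presumably why a mis-declared file can still look mostly right at a glance.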

tomaarsen · 2022-09-01T08:27:30Z

Thank you for finding this! Your fix is simple and proper too. I've added a simple test in 0526ad5 to verify that this will continue to work in the future (and I bunged up the git history of this PR somewhat... Apologies)

SaudKadiri and others added 5 commits August 23, 2022 21:51

Replacing black with a more generic color (blue)

be18a4c

The change will make the link look more like a link and also ensure that it's visible in both: dark and light mode

Change in Polish encodings of udhr corpus reader

1b1fd38

Also set Foreground color to blue of Downloader links

0e2d611

Merge branch 'fix/polish-encoding-in-udhr-reader-change' of https://g…

cfd0c5e

…ithub.com/hamiltonianflow/nltk into hamiltonianflow-fix/polish-encoding-in-udhr-reader-change

Add test re. Polish UDHR encoding

0526ad5

tomaarsen merged commit 13cea29 into nltk:develop Sep 1, 2022
