
build(deps): bump charset-normalizer from 2.1.1 to 3.0.0 #1331

Closed
wants to merge 1 commit

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Oct 20, 2022

Bumps charset-normalizer from 2.1.1 to 3.0.0.

Release notes

Sourced from charset-normalizer's releases.

Version 3.0.0

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the mess-detector results are logged in detail
  • Support for an alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add a language_threshold parameter to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio (a usage sketch follows this list)
  • normalizer --version now specifies whether the current version provides the extra speedup (i.e. a mypyc-compiled wheel)
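
A minimal sketch of how these additions could be combined, assuming the 3.0.0 keyword parameters named above (cp_isolation, explain, language_threshold); the sample text, codepages and threshold value are illustrative, not taken from the release notes:

from charset_normalizer import from_bytes

# Illustrative payload: Cyrillic text encoded with a legacy single-byte codepage.
payload = "Всеки човек има право на образование.".encode("cp1251")

results = from_bytes(
    payload,
    cp_isolation=["cp1251", "latin_1"],  # at most two entries so explain=True logs detailed mess-detector results
    explain=True,                        # ask the detector to explain its decisions via logging
    language_threshold=0.15,             # minimum expected coherence ratio (value chosen for illustration)
)

best_guess = results.best()
if best_guess is not None:
    print(best_guess.encoding, best_guess.language)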

Changed

  • Build with static metadata (not pyproject.toml yet)
  • Make language detection stricter
  • Optional: the md.py module can be compiled with mypyc for an extra speedup, up to 4x faster than v2.1

Fixed

  • CLI with the --normalize option failed when a full file path was given
  • TooManyAccentuatedPlugin produced false positives in mess detection when too few alphabetic characters were fed to it
  • Sphinx warnings when generating the documentation

Removed

  • The coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • The coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 no longer appears as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  • Breaking: Top-level function normalize (see the migration sketch after this list)
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2
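
A hedged migration sketch for the breaking removals above. It assumes only the parts of the 3.0.0 API that are not removed (the from_path function listed under Added, and best() on the returned match container); the file name is purely illustrative:

from charset_normalizer import from_path

# 2.x code that relied on the removed top-level normalize() or on removed class
# aliases such as CharsetNormalizerMatches can switch to the from_* functions and
# pick the best candidate from the returned container.
results = from_path("legacy-data.txt")  # illustrative path
match = results.best()                  # best() lives on the result container

if match is None:
    raise ValueError("no encoding could be detected")

print(match.encoding)  # e.g. 'utf_8'
print(str(match))      # the payload decoded with the best-matching encoding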

This is the last version line (3.0.x) to support Python 3.6. We plan to drop it in 3.1.x.

Version 3.0.0rc1

This is the last pre-release. If everything goes well, I will publish the stable tag.

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the mess-detector results are logged in detail
  • Support for an alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add a language_threshold parameter to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

Changed

  • Build with static metadata using 'build' frontend
  • Make language detection stricter

Fixed

  • CLI with the --normalize option failed when a full file path was given
  • TooManyAccentuatedPlugin produced false positives in mess detection when too few alphabetic characters were fed to it

Removed

... (truncated)

Changelog

Sourced from charset-normalizer's changelog.

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the mess-detector results are logged in detail
  • Support for an alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add a language_threshold parameter to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
  • normalizer --version now specifies whether the current version provides the extra speedup (i.e. a mypyc-compiled wheel)

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter
  • Optional: the md.py module can be compiled with mypyc for an extra speedup, up to 4x faster than v2.1

Fixed

  • CLI with the --normalize option failed when a full file path was given
  • TooManyAccentuatedPlugin produced false positives in mess detection when too few alphabetic characters were fed to it
  • Sphinx warnings when generating the documentation

Removed

  • The coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • The coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 no longer appears as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  • Breaking: Top-level function normalize
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the mess-detector results are logged in detail
  • Support for an alternative language frequency set in charset_normalizer.assets.FREQUENCIES
  • Add a language_threshold parameter to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter

Fixed

  • CLI with the --normalize option failed when a full file path was given
  • TooManyAccentuatedPlugin produced false positives in mess detection when too few alphabetic characters were fed to it

Removed

  • The coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • The coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead

3.0.0b2 (2022-08-21)

Added

... (truncated)

Upgrade guide

Sourced from charset-normalizer's upgrade guide.

Guide to upgrade your code from v1 to v2

  • If you are using the legacy detect function, that is it: you have nothing to change (see the sketch below).
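
For reference, a minimal sketch of that legacy call, assuming the chardet-style dict it returns; the sample text is illustrative:

from charset_normalizer import detect

# The legacy helper keeps a chardet-like interface, so existing call sites work unchanged.
result = detect("我没有埋怨,磋砣的只是一些时间。".encode("utf_32"))
print(result["encoding"], result["language"], result["confidence"])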

Detection

Before

from charset_normalizer import CharsetNormalizerMatches
results = CharsetNormalizerMatches.from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

After

from charset_normalizer import from_bytes
results = from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

Methods that once were staticmethods of the class CharsetNormalizerMatches are now plain functions; from_fp, from_bytes and from_path are concerned.
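
A short, hedged sketch of the plain-function form using from_fp, which is named above but not shown in the before/after snippets; the file path is illustrative, and from_fp is assumed to take a binary file object:

from charset_normalizer import from_fp

# from_fp is now a top-level function rather than a staticmethod of a Matches class.
with open("unknown-encoding.txt", "rb") as fp:  # illustrative path, opened in binary mode
    results = from_fp(fp)

best_guess = results.best()
if best_guess is not None:
    print(best_guess.encoding)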

Staticmethods scheduled to be removed in version 3.0

Commits
  • 0ec52ef Version 3.0.0 (#223)
  • db134f3 Update python-publish.yml
  • 690f74c 🔧 pass --no-isolation through CIBW_CONFIG_SETTINGS --build-option
  • 20996c3 ⬆️ cibuildwheel v2.11.1 (fix-tag)
  • 24f366c ⬆️ cibuildwheel v2.11.1
  • 33b7327 🔧 update universal-wheel stage (missing build pkg)
  • 544595d Merge pull request #209 from Ousret/3.0
  • 6367d53 📝 Missing CHANGELOG entry and add language_threshold to docs::advanced...
  • b15f416 📝 Update CHANGELOG.md
  • f8e1153 📝 Adjust speedup docs section
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the dependencies label (Pull requests that update a dependency file) on Oct 20, 2022
@github-actions

Looks like a major version upgrade! Skipping auto-merge.

@snarfed
Owner

snarfed commented Oct 26, 2022

fixed by psf/requests#6261, currently unreleased

Bumps [charset-normalizer](https://github.com/Ousret/charset_normalizer) from 2.1.1 to 3.0.0.
- [Release notes](https://github.com/Ousret/charset_normalizer/releases)
- [Changelog](https://github.com/Ousret/charset_normalizer/blob/master/CHANGELOG.md)
- [Upgrade guide](https://github.com/Ousret/charset_normalizer/blob/master/UPGRADE.md)
- [Commits](Ousret/charset_normalizer@2.1.1...3.0.0)

---
updated-dependencies:
- dependency-name: charset-normalizer
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/charset-normalizer-3.0.0 branch from 324325a to 2c66d91 on October 26, 2022 22:14
@github-actions

Looks like a major version upgrade! Skipping auto-merge.

@dependabot
Contributor Author

dependabot bot commented on behalf of github Nov 18, 2022

Superseded by #1350.

@dependabot dependabot bot closed this Nov 18, 2022
@dependabot dependabot bot deleted the dependabot/pip/charset-normalizer-3.0.0 branch November 18, 2022 22:00