Message 163706 - Python tracker

The problem is that the standard allows some charrefs to end without a ';', but not all of them.

So both "&Eacuteric" and "&Eacute;ric" will be parsed as "Éric", but only "&alpha;centauri" will result in "αcentauri" -- "&alphacentauri" will be returned unchanged.
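
For illustration only: html.unescape, which was added later in Python 3.4 and is not the code under discussion in this message, shows the same asymmetry. The sample strings are the ones from the paragraph above.

```python
import html

# "Eacute" is one of the names the standard accepts with or without ';'.
with_semicolon = html.unescape("&Eacute;ric")
without_semicolon = html.unescape("&Eacuteric")

# "alpha" is only recognised when the ';' is present.
alpha_ok = html.unescape("&alpha;centauri")
alpha_unchanged = html.unescape("&alphacentauri")

print(with_semicolon, without_semicolon, alpha_ok, alpha_unchanged)
```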

I'm now working on #15156 to use this dict in HTMLParser, and detecting the ';'-less entities is not easy. A possible solution is to keep the names that are accepted without ';' in a separate (private) dict and expose a function like HTMLParser.unescape that implements all the necessary logic.

Regarding ChainMap, the html5 dict should be a superset of the html4 one.
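
The superset claim can be checked against the dicts in html.entities. This is a rough sketch that compares names only, on the assumption that an HTML 4 name from name2codepoint corresponds to the same name plus ';' in html5. It does not compare the characters the names map to.

```python
from html.entities import html5, name2codepoint

# HTML 4 names are stored without ';'; html5 keys include it.
html4_names = {name + ";" for name in name2codepoint}

# Names present in the HTML 4 dict but absent from the html5 one.
missing = sorted(html4_names - html5.keys())
print(missing)
```

If the list is empty, a ChainMap over the two dicts would add no names that html5 does not already provide.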