Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce subdivisions aliases #1662

Merged
merged 38 commits into from Feb 13, 2024
Merged

Conversation

sphh
Copy link
Contributor

@sphh sphh commented Jan 23, 2024

Proposed change

Adds aliases for subdivision regions. Closes #1660

Type of change

  • New country/market holidays support (thank you!)
  • Supported country/market holidays update (calendar discrepancy fix, localization)
  • Existing code/documentation/test/process quality improvement (best practice, cleanup, refactoring, optimization)
  • Dependency update (version deprecation/upgrade)
  • Bugfix (non-breaking change which fixes an issue)
  • Breaking change (a code change causing existing functionality to break)
  • New feature (new python-holidays functionality in general)

Checklist

  • I've followed the contributing guidelines
  • I've run make pre-commit, it didn't generate any changes
  • I've run make test, all tests passed locally

Updating of the documentation and adding test cases still needs to be done. I just wanted to check, if you are happy with my implementation.

Aliases for subdivisions can be added in the `subdivisions_aliases`
class attribute, a dictionary with the alias as key and the ISO3166-2
subdivision code as value. More than one alias can be added by adding
more entries into the dictionary.

The alias will be resolved only when populating the holidays for the
subregion.
Adds a single character, a common abbreviation, the full German and
English names as aliases for each state.
Adds the patron saint days (and one special other holiday) as bank
holidays for each of the nine Austrian's subdivisions.
@arkid15r
Copy link
Collaborator

Thanks for the PR @sphh!

As I mentioned earlier tests are very important in order to keep the library stable.
First you need to make sure your code doesn't break anything (see the PR checklist).
Then considering significance of the change I'd expect the new functionality to be covered with tests (see the hints I gave at the corresponding issue). Let me know if you need help with that.

@sphh
Copy link
Contributor Author

sphh commented Jan 23, 2024

Yes, I am with you. But as we have already started at the back (API has to be defined and tests should be written first), could you please tell me, if the API in added is something you can live with:

  1. A class level attribute subdivisions_aliases with the alias as key and the official code as value?
  2. Storing the subdivision parameter in self.subdiv as it has been given by the user and if it is an alias only resolving it at the very last moment?

If that is ok with you and your ideas, I can do the tedious tasks of adding tests (if you don't want to do it) and updating the documentation (again if you as assumed native speaker are not better suited to do it).

@arkid15r
Copy link
Collaborator

Yes, I am with you. But as we have already started at the back (API has to be defined and tests should be written first), could you please tell me, if the API in added is something you can live with:

  1. A class level attribute subdivisions_aliases with the alias as key and the official code as value?
  2. Storing the subdivision parameter in self.subdiv as it has been given by the user and if it is an alias only resolving it at the very last moment?

If that is ok with you and your ideas, I can do the tedious tasks of adding tests (if you don't want to do it) and updating the documentation (again if you as assumed native speaker are not better suited to do it).

Yes, those 2 points sound good to me. Looking forward to reviewing the complete version of the code!

Thank you for implementing this functionality!

@sphh
Copy link
Contributor Author

sphh commented Jan 24, 2024

That's great. Thanks for confirming!

I have another question, regarding the table in the section Available countries, which is down to personal liking: Where do you want to have the additional column with the subdivision aliases? Between the ‘Subdivisions’ and ‘Supported Languages’ columns or as last column. Personally I can see it in both places …

@arkid15r
Copy link
Collaborator

That's great. Thanks for confirming!

I have another question, regarding the table in the section Available countries, which is down to personal liking: Where do you want to have the additional column with the subdivision aliases? Between the ‘Subdivisions’ and ‘Supported Languages’ columns or as last column. Personally I can see it in both places …

This one needs more attention. I don't think we need to add a separate column for this. I expect the subdivisions aliases to be a very useful from UX perspective but not widely used feature. Maybe it'd be a good idea to keep it simple and add subdiv alias right to its ISO3166-2 value in brackets, e.g. 1 (B), 2 (K).... I would expect 1 alias per subdiv (maybe 2 at top in really weird cases).

@sphh
Copy link
Contributor Author

sphh commented Jan 24, 2024

Sounds like a sensible plan!

@sphh
Copy link
Contributor Author

sphh commented Jan 25, 2024

The last commits update the README, adds or updates tests and adds one utility function.

The tests flag up two problems:

  1. make pre-commit complains about missing documents. Honestly I have no clue what to do since I have not touched these documents.
  2. make test complains about missing Ukrainian translation. Here I am out of my wits, because I do not speak or read Ukrainian.

I updated the test for subdivisions in table in the README, but there is no test to check, if the given aliases actually exist. Can I leave that to you? I frankly admit, that I am definitely not good at writing tests …

Copy link
Collaborator

@KJhellico KJhellico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sphh, this is an impressive contribution! Please look at some suggestions (Ukrainian translation mainly).

holidays/countries/austria.py Outdated Show resolved Hide resolved
holidays/countries/austria.py Outdated Show resolved Hide resolved
holidays/countries/austria.py Outdated Show resolved Hide resolved
holidays/countries/austria.py Outdated Show resolved Hide resolved
holidays/locale/uk/LC_MESSAGES/AT.po Outdated Show resolved Hide resolved
holidays/locale/uk/LC_MESSAGES/AT.po Outdated Show resolved Hide resolved
holidays/locale/uk/LC_MESSAGES/AT.po Outdated Show resolved Hide resolved
holidays/locale/uk/LC_MESSAGES/AT.po Outdated Show resolved Hide resolved
holidays/locale/uk/LC_MESSAGES/AT.po Outdated Show resolved Hide resolved
tests/countries/test_austria.py Outdated Show resolved Hide resolved
sphh and others added 7 commits January 26, 2024 12:23
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com>
@arkid15r
Copy link
Collaborator

arkid15r commented Feb 2, 2024

Mmh, not even the ISO would then rate as an official source: “The International Organization for Standardization (ISO […]) is an independent, non-governmental, international standard development organization” (my emphasis, https://en.wikipedia.org/wiki/International_Organization_for_Standardization).

There are many national standard organisations: DIN for Germany, bsi for UK, ANSI for the US and (ÖNORM](https://en.wikipedia.org/wiki/Austrian_Standards_International) just to name a few. And there is CEN for the EU. See also https://en.wikipedia.org/wiki/Standards_organization.

So based on this reasoning the iso.org is kind of equal to cacert.org? I'm not going to argue with you on this. However, checking with gov related sites has been our standard verification practice. The idea was to find those aliases used on official Austria sites. I've found this example and it's different from the one above.

I don't like the idea of putting unverified abbreviations into the alias list. The long story short: if there is a proposal to add an alias that is different from the subdivision's full name It'd really beneficial to see the source or/and example of official usage.

This is my current understanding of existing mapping which should be added together with the full names (see the code suggestion):

 Burgenland - Bgld.
 Carinthia (Kärnten) - Ktn.
 Lower Austria (Niederösterreich) - NÖ
 Upper Austria (Oberösterreich) - OÖ
 Salzburg - Salzbg. or Sbg.
 Styria (Steiermark) - Stmk.
 Tyrol (Tirol) - T.
 Vorarlberg - Vbg. or Vor.
 Vienna (Wien) - W.

For Austria we already have 3. But that might be a special case because of the unfortunate naming in the ISO 3166-2 standards

The current PR code contains just 1 alias per subdivision with a pending suggestion to change them to their full names.

@sphh
Copy link
Contributor Author

sphh commented Feb 3, 2024

So based on this reasoning the iso.org is kind of equal to cacert.org?

No, cacert.org is just quoting an ÖNORM, which is the national equivalent to the ISO. As is DIN, ANSI, BS or any other national standard.

I've found this example and it's different from the one above.

Yes, it is. But then they define it at the beginning and thus they can use whatever they like…

I don't like the idea of putting unverified abbreviations into the alias list. The long story short: if there is a proposal to add an alias that is different from the subdivision's full name It'd really beneficial to see the source or/and example of official usage.

And here comes the ÖNORM in!

I would say, the aliases should contain the full name, a one (or two-letter) designator without a period at the end and an abbreviation, hence

Burgenland - B - Bgld.
Carinthia (Kärnten) - K - Ktn.
Lower Austria (Niederösterreich) - N - NÖ
Upper Austria (Oberösterreich) - O - OÖ
Salzburg - S - Sbg.
Styria (Steiermark) - S - Stmk.
Tyrol (Tirol) - T
Vorarlberg - V- Vbg.
Vienna (Wien) - W

The English full name can be provided by the translation.

@arkid15r
Copy link
Collaborator

arkid15r commented Feb 5, 2024

And here comes the ÖNORM in!

You keep mentioning this as solution for the problem. Can I see a link to their standard regarding AT state names or whatever you use for your reasoning?

I'm having a hard time to find examples of real use cases for the aliases we've been discussing for a while. Is there anything that can help answer, for example, this question: "why B / Bgld for Burgenland?"

The English full name can be provided by the translation.

I don't think we want to do the aliases localization. We have ISO subdivision codes for everyone.
Do you think it's something we should necessarily implement?

@sphh
Copy link
Contributor Author

sphh commented Feb 6, 2024

And here comes the ÖNORM in!

You keep mentioning this as solution for the problem.

Yes, because the ÖNORM (or DIN or BS or ANSI) are the national versions of the ISO. I reference this, because in the beginning you insisted that everything has to be mentioned in standards. May I ask, why the national standards are not acceptable, but the international is?

Can I see a link to their standard regarding AT state names or whatever you use for your reasoning?

If you want, you can buy their standards. Or you trust cacert.org to cite it correctly. Or see below.

BTW have you tried to search for ‘AT-0’, the official ISO denomination? Did you get any results?

I am still not sure, what you need, so that you can accept an alias?

IMHO aliases are there to make life easier and there are many common aliases, which are in common use but not written down in laws or standards …

I'm having a hard time to find examples of real use cases for the aliases we've been discussing for a while. Is there anything that can help answer, for example, this question: "why B / Bgld for Burgenland?"

Here is my collection of supporting documents:

Would these documents alleviate your reservations?

The English full name can be provided by the translation.

I don't think we want to do the aliases localization. We have ISO subdivision codes for everyone. Do you think it's something we should necessarily implement?

I only mentioned this, because you mentioned the English and the local German versions in one go. Please do not combine these into one alias!

  - use confirmed subdivisoin aliases
  - update README
  - update l10n (comments and translation)
  - update snapshots
  - fix docs tests
Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this PR based on the information provided by @sphh. Thank you, that helped a lot with the subdiv aliases decision.

Let's work a bit more on this and get the change merged! I think we're very close. I'm a bit concerned with test_docs.py tests complexity though -- it needs to be simplified later.

@sphh, @KJhellico please review this code once again when you have a chance.

@sphh
Copy link
Contributor Author

sphh commented Feb 8, 2024

I've updated this PR based on the information provided by @sphh. Thank you, that helped a lot with the subdiv aliases decision.

Thanks!

But why have the single letter aliases (B, K, N, O, S, St and V) been removed? I thought, it was decided, that they are ok …

Let's work a bit more on this and get the change merged! I think we're very close. I'm a bit concerned with test_docs.py tests complexity though -- it needs to be simplified later.

Any reason why you don't take my approach with the single regexp? That makes the Python code simpler admittedly at the cost of a more complex regexp. Personally I find that approach easier to follow (especially if there is a comment, explaining what that regexp does). But I guess it is down to personal preferences (comments explaining the code always helps).

@arkid15r
Copy link
Collaborator

arkid15r commented Feb 8, 2024

But why have the single letter aliases (B, K, N, O, S, St and V) been removed? I thought, it was decided, that they are ok …

I hadn't found any references to those aliases so I removed them from the initial version.

Any reason why you don't take my approach with the single regexp? That makes the Python code simpler admittedly at the cost of a more complex regexp. Personally I find that approach easier to follow (especially if there is a comment, explaining what that regexp does). But I guess it is down to personal preferences (comments explaining the code always helps).

No, no reason. I picked up from the most recent code version. Feel free to change it back.

Please also note I forgot to revert HolidayBase::get_subdivision_aliases to use your implementation.

@sphh
Copy link
Contributor Author

sphh commented Feb 8, 2024

I hadn't found any references to those aliases so I removed them from the initial version.

Would a copy of the official dictionary showing these abbreviations suffice? If yes, I'll create it.

No, no reason. I picked up from the most recent code version. Feel free to change it back.

I guess, you will find your commit, which changed it, faster than I do. Frankly, I already spent much too much time on all this for such a simple PR …

Please also note I forgot to revert HolidayBase::get_subdivision_aliases to use your implementation.

Very good, I noticed that (but has that been changed some time ago?)

@arkid15r
Copy link
Collaborator

arkid15r commented Feb 8, 2024

Would a copy of the official dictionary showing these abbreviations suffice? If yes, I'll create it.

I created the list based on the materials you've provided as well as my own research results. Yes, the dictionary was one of the sources I used. My point is still the same -- if an alias is in a wide use in AT I'd like to have an evidence of that in any type of (semi-) official documents, gov sites, other sources. I have a broader question: why 2 aliases (full name + short name) isn't enough?

Another question is that the list already contains two one-letter aliases -- how to deal with those cases? I'd like to have consistency at this level too. Ideally It'd be great to see a final state of the alias list from your perspective first.

I guess, you will find your commit, which changed it, faster than I do. Frankly, I already spent much too much time on all this for such a simple PR …

The current code looks okay to me.

Very good, I noticed that (but has that been changed some time ago?)

Not sure what you mean. I see that the current implementation differs from your original code.

@sphh
Copy link
Contributor Author

sphh commented Feb 8, 2024

I created the list based on the materials you've provided as well as my own research results. Yes, the dictionary was one of the sources I used. My point is still the same -- if an alias is in a wide use in AT I'd like to have an evidence of that in any type of (semi-) official documents, gov sites, other sources. I have a broader question: why 2 aliases (full name + short name) isn't enough?

Because all aliases are in use: The full name (obviously), the abbreviations and the very short single letter abbreviation. For two subdivisions the abbreviation and the single letter are the same. That happens. You could also question, why NÖ and OÖ have all capital letters and the other abbreviations are upper- and lowercase. I am afraid, that's how living languages work (sadly no logic there at all …).

Another question is that the list already contains two one-letter aliases -- how to deal with those cases? I'd like to have consistency at this level too. Ideally It'd be great to see a final state of the alias list from your perspective first.

These are the aliases which were initially included in my pull request:
AT-1: Burgenland, Bgld, B
AT-2: Kärnten, Ktn, K
AT-3: Niederösterreich, NÖ, N
AT-4: Oberösterreich, OÖ, O
AT-5: Salzburg, Sbg, S
AT-6: Steiermark, Stmk, St
AT-7: Tirol, T
AT-8: Vorarlberg, Vbg, V
AT-9: Wien, W

If that's ok, I prepare the supporting document showing from the official dictionary showing the single (and that single twin) letter abbreviations.

@sphh
Copy link
Contributor Author

sphh commented Feb 9, 2024

This is the supporting document for the single letter abbreviations:
single_letter_abbreviations.pdf

  - update AT alias list
  - update HolidayBase::get_subdivision_aliases output
  - update README.rst
@arkid15r
Copy link
Collaborator

This is the supporting document for the single letter abbreviations: single_letter_abbreviations.pdf

Thanks for compiling this! I ended up with the following with my own attempt to figure this out:

The two-letter abbreviations I provided earlier (e.g., Bg, K, NÖ, OÖ, S, Stmk, V, W) are more commonly used and widely recognized in Austria than single-letter abbreviations. They are typically used in postal addresses, official documents, and other written communication for brevity and clarity.

While single-letter abbreviations (e.g., B, K, N, O, S, St, V, W) might also be used informally or in specific contexts, they are less common compared to the two-letter abbreviations and may not be as universally understood.

In summary, the two-letter abbreviations are more frequently used and standardized for representing Austrian federal states in various forms of communication.

I still think 2 aliases would be sufficient for majority (maybe all) of cases. However, one of our lower priority goals is better user experience. So I believe this may be actually beneficial (according to at least one user from the target audience -- @sphh).

I hope this is the final list of aliases everybody agrees on.

I also updated README.rst and HolidayBase::get_subdivision_aliases code.

PTAL!

@sphh
Copy link
Contributor Author

sphh commented Feb 12, 2024

I ended up with the following with my own attempt to figure this out:

The two-letter abbreviations I provided earlier (e.g., Bg, K, NÖ, OÖ, S, Stmk, V, W) are more commonly used and widely recognized in Austria than single-letter abbreviations. They are typically used in postal addresses, official documents, and other written communication for brevity and clarity.
While single-letter abbreviations (e.g., B, K, N, O, S, St, V, W) might also be used informally or in specific contexts, they are less common compared to the two-letter abbreviations and may not be as universally understood.
In summary, the two-letter abbreviations are more frequently used and standardized for representing Austrian federal states in various forms of communication.

I am afraid, that is not my perception how they are used here in Austria. It might be confusing, that there are more than one abbreviations for one single state, but do not worry, all of these abbreviations are used in one way or the other in official and informal documents. Thanks for including them all! I really trust this is beneficial – at least for all Austrians.

I hope this is the final list of aliases everybody agrees on.

I am more than happy with that list of aliases from the commit 6e85ec6 (Why wouldn't I, because these are exactly the aliases from my first commit…) Thanks.

@arkid15r
Copy link
Collaborator

I am more than happy with that list of aliases from the commit 6e85ec6 (Why wouldn't I, because these are exactly the aliases from my first commit…) Thanks.

Not exactly the same but very close to it. Thanks for working with me on this @sphh 👍

@KJhellico could you review it when you get a chance?

KJhellico
KJhellico previously approved these changes Feb 13, 2024
Copy link
Collaborator

@KJhellico KJhellico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just one suggestion, for l10n comments consistency.

holidays/countries/austria.py Outdated Show resolved Hide resolved
Copy link

sonarcloud bot commented Feb 13, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
2 Security Hotspots

See analysis details on SonarCloud

@arkid15r arkid15r added this pull request to the merge queue Feb 13, 2024
Merged via the queue into vacanza:beta with commit 9f470d9 Feb 13, 2024
25 of 26 checks passed
@arkid15r arkid15r mentioned this pull request Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement subdivision aliases support
4 participants