
Introduce uppercase-string #3613


Merged
merged 24 commits into from
Nov 20, 2024

Conversation

pmjones
Contributor

@pmjones pmjones commented Nov 8, 2024

This PR introduces uppercase-string in all places where lowercase-string is recognized, except for SprintfFunctionDynamicReturnTypeExtension.

The omission is because sprintf() format strings cannot be typed reliably as uppercase-string because most of the formatting specifiers are lowercase. For example, the format string 'FOO %s BAR' is treated as a plain string rather than uppercase-string, because %s is lowercase (even though it might represent an uppercase string).

It appears that lowercase-string exhibits similar behavior. For example, consider the format string foo %F. It formats as the lowercase string 'foo' and a non-locale-aware float; the returned string will be the same after strtolower(), so it should be valid as a lowercase-string.

However, in nsrt/lowercase-string-sprintf.php, if you add assertType('lowercase-string', sprintf('foo %F', $lowercase));, it will fail with:

Expected: lowercase-string
Actual:   non-falsy-string

That's because %F is uppercase, even though it represents a float.
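A runtime illustration of the two cases above, as a minimal sketch that is not part of the PR. The sample arguments (`1.5`, `'BAZ'`) are made up, and the float case is only shown for a finite value.

```php
<?php
declare(strict_types=1);

// Sample inputs chosen for illustration only.
$lower = sprintf('foo %F', 1.5);       // uppercase specifier, lowercase result
$upper = sprintf('FOO %s BAR', 'BAZ'); // lowercase specifier, uppercase result

// The case of the specifier says little about the case of the result:
// a finite float renders as digits and '.', which have no case.
var_dump($lower === strtolower($lower));
var_dump($upper === strtoupper($upper));
```

Both comparisons hold at runtime, which is the gap between what the format string looks like and what the type inference could in principle conclude.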

If you consider these to be problems, they might be resolved by adding a new type, printf-string, that can ignore the formatting specifiers when determining isLowercase() and isUppercase(), but I imagine that would have rather broad implications.

Let me know how you'd like to proceed.

@pmjones pmjones changed the base branch from 2.0.x to 1.12.x November 8, 2024 19:16
Member

@ondrejmirtes ondrejmirtes left a comment


This is very complete and very nice, thank you! Just one thing I noticed.

@ondrejmirtes
Member

One more thing - Windows tests are failing https://github.com/phpstan/phpstan-src/actions/runs/11749776276/job/32736693637?pr=3613

@pmjones
Contributor Author

pmjones commented Nov 9, 2024

One more thing - Windows tests are failing https://github.com/phpstan/phpstan-src/actions/runs/11749776276/job/32736693637?pr=3613

Yes; it's related to this change where two reasons are given instead of one, and the PHP_EOL between them causes the Windows failures.
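One way to picture the failure, as a sketch under assumptions: the helper `normalizeEol` and the sample messages are hypothetical and are not PHPStan code. A message joined with PHP_EOL contains "\r\n" on Windows, so it no longer matches an expectation written with "\n" unless line endings are normalized first. This only illustrates the cause; it is not the fix discussed in this thread.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: convert "\r\n" and lone "\r" line endings to "\n". */
function normalizeEol(string $text): string
{
    // The pairs are replaced in order, so "\r\n" is handled before "\r".
    return str_replace(["\r\n", "\r"], "\n", $text);
}

// Two reasons joined with the platform line ending.
$actual = implode(PHP_EOL, ['first reason', 'second reason']);
$expected = "first reason\nsecond reason";

// True on every platform once line endings are normalized.
var_dump(normalizeEol($actual) === $expected);
```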

After thinking about it some more, I believe it should remain as one reason (not two) and am trying to track down where the second reason is coming from. Any help there would be appreciated!

Member

@ondrejmirtes ondrejmirtes left a comment


I found a few more things.

@ondrejmirtes
Member

After you rebase on top of 1.12.x with this commit ccfb4ab, the rule error tip should appear only once.

@pmjones
Contributor Author

pmjones commented Nov 10, 2024

After you rebase on top of 1.12.x with this commit ccfb4ab, the rule error tip should appear only once.

That did the trick; thanks!

Contributor

@VincentLanglet VincentLanglet left a comment


This PR should be copied too. #3510

IntegerRangeType::toString and IntegerType::toString are both lowercase-string and uppercase-string.

A broader look at how lowercase-string&uppercase-string is handled also seems needed.

Because if this intersection is lost, the following code

/** @param lowercase-string $string */
public function acceptsOnlyLowercase($string) {}

/** @var int $int */
$int = ...;

acceptsOnlyLowercase((string) $int); // Will report an error

would start reporting an error, which it currently does not; cf.
https://phpstan.org/r/e722e920-af05-4c7b-8290-1205df68c4f8
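A runtime view of why an integer cast belongs to both types, as a small sketch. `looksLowercase` and `looksUppercase` are hypothetical helpers that apply the strtolower()/strtoupper() criterion used earlier in this thread; they are not PHPStan methods.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: true when lowercasing leaves the string unchanged. */
function looksLowercase(string $s): bool
{
    return $s === strtolower($s);
}

/** Hypothetical helper: true when uppercasing leaves the string unchanged. */
function looksUppercase(string $s): bool
{
    return $s === strtoupper($s);
}

foreach ([0, 42, -7, PHP_INT_MAX] as $int) {
    $s = (string) $int;
    // Digits and '-' have no case, so both checks pass for every integer.
    var_dump(looksLowercase($s) && looksUppercase($s));
}
```

A value that satisfies both checks is exactly what the lowercase-string&uppercase-string intersection has to keep representing.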

@ondrejmirtes
Member

In this case we need to verify that TypeCombinator::intersect() of lowercase-string and uppercase-string leads to the expected result. Each of the two types also needs to report that a value of that type may be of the other type as well.

Should be done in TypeCombinatorTest.

pmjones added a commit to pmjones/phpstan-src that referenced this pull request Nov 18, 2024

Addresses phpstan#3613 (comment) and phpstan#3613 (comment)

The relevant lowercase-string tests do not seem to be affected ... ?
pmjones added a commit to pmjones/phpstan-src that referenced this pull request Nov 18, 2024

Contributor

@VincentLanglet VincentLanglet left a comment


The current ToDo-list seems to be:

pmjones added a commit to pmjones/phpstan-src that referenced this pull request Nov 18, 2024
Addresses phpstan#3613 (comment), phpstan#3613 (comment), phpstan#3613 (comment)

This intentionally causes tests to fail so @VincentLanglet et al. can debug.
pmjones added a commit to pmjones/phpstan-src that referenced this pull request Nov 18, 2024
Addresses phpstan#3613 (comment)

Instead of a 4-part if/elseif/else, this builds both lowercase and/or uppercase; then if neither was true, builds what would have been the final else case.
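The control-flow shape described in that commit message can be sketched as follows. This is an assumption-laden illustration: `accessoriesFor` is a hypothetical function that returns type names as plain strings, whereas the PR itself builds PHPStan type objects.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in: names the case-related types a string value fits.
 *
 * @return list<string>
 */
function accessoriesFor(string $value): array
{
    $accessories = [];

    // Two independent checks instead of a 4-part if/elseif/else.
    if ($value === strtolower($value)) {
        $accessories[] = 'lowercase-string';
    }
    if ($value === strtoupper($value)) {
        $accessories[] = 'uppercase-string';
    }

    // Neither matched: this is what would have been the final else branch.
    if ($accessories === []) {
        $accessories[] = 'string';
    }

    return $accessories;
}
```

With independent checks, the "both" case (for example '123') falls out naturally instead of needing its own branch.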
Contributor

@VincentLanglet VincentLanglet left a comment


We also need lowercase-string|uppercase-string to be simplified to string

@pmjones
Contributor Author

pmjones commented Nov 19, 2024

@VincentLanglet "This PR should be copied too. https://github.com/phpstan/phpstan-src/pull/3510"

That one is merged into 1.12.x, and I have rebased since that PR, so I think it has been included now. Let me know if I am mistaken.

@pmjones
Contributor Author

pmjones commented Nov 19, 2024

@VincentLanglet "We also need lowercase-string|uppercase-string to be simplified to string" -- I admit I do not know where to accomplish this. Hints/tips?

@VincentLanglet
Contributor

@VincentLanglet "This PR should be copied too. https://github.com/phpstan/phpstan-src/pull/3510"

That one is merged into 1.12.x, and I have rebased since that PR, so I think it has been included now. Let me know if I am mistaken.

You need to update

  • IntegerType::toString, it returns
return new IntersectionType([
	new StringType(),
	new AccessoryLowercaseStringType(),
	new AccessoryUppercaseStringType(),
	new AccessoryNumericStringType(),
]);
  • IntegerRangeType::toString, it returns
$finiteTypes = $this->getFiniteTypes();
if ($finiteTypes !== []) {
	return TypeCombinator::union(...$finiteTypes)->toString();
}
$isZero = (new ConstantIntegerType(0))->isSuperTypeOf($this);
if ($isZero->no()) {
	return new IntersectionType([
		new StringType(),
		new AccessoryLowercaseStringType(),
		new AccessoryUppercaseStringType(),
		new AccessoryNumericStringType(),
		new AccessoryNonFalsyStringType(),
	]);
}

return new IntersectionType([
	new StringType(),
	new AccessoryLowercaseStringType(),
	new AccessoryUppercaseStringType(),
	new AccessoryNumericStringType(),
]);

@VincentLanglet
Contributor

VincentLanglet commented Nov 19, 2024

@VincentLanglet "We also need lowercase-string|uppercase-string to be simplified to string" -- I admit I do not know where to accomplish this. Hints/tips?

Might need to be validated with @ondrejmirtes, but I think it's in TypeCombinator::union or TypeCombinator::compareTypesInUnion

But it does not seem easy to do, because:
  • lowercase-string|uppercase-string needs to be simplified to string
  • literal-string&lowercase-string|literal-string&uppercase-string needs to become literal-string (same for the other accessory types)
  • literal-string&lowercase-string|uppercase-string should stay untouched (I think).

Depending on how hard this is to handle and how important it is to solve, it might be postponed to another PR.

Apart from my previous comment, I don't see anything else to add on this PR; good job.
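The three cases listed above can be captured by one rule: collapse a two-member union only when the members differ in nothing but the case accessory. The sketch below models that rule on plain arrays. It is a hypothetical model and not how TypeCombinator is implemented: a type is a list of accessory names, `[]` stands for plain string, and `simplifyCaseUnion` is an invented name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical model of the simplification rule.
 *
 * @param list<string> $a accessory names of the first union member
 * @param list<string> $b accessory names of the second union member
 * @return list<string>|null simplified accessories, or null to leave the union untouched
 */
function simplifyCaseUnion(array $a, array $b): ?array
{
    $case = ['lowercase', 'uppercase'];

    // Split each member into its case accessories and everything else.
    $caseA = array_values(array_intersect($a, $case));
    $caseB = array_values(array_intersect($b, $case));
    $restA = array_values(array_diff($a, $case));
    $restB = array_values(array_diff($b, $case));
    sort($restA);
    sort($restB);

    // Each member must carry exactly one case accessory, and they must differ.
    $oppositeCase = count($caseA) === 1 && count($caseB) === 1 && $caseA !== $caseB;

    // Collapse only when the remaining accessories are identical.
    return ($oppositeCase && $restA === $restB) ? $restA : null;
}

var_dump(simplifyCaseUnion(['lowercase'], ['uppercase']));                       // [] means string
var_dump(simplifyCaseUnion(['literal', 'lowercase'], ['literal', 'uppercase'])); // literal only
var_dump(simplifyCaseUnion(['literal', 'lowercase'], ['uppercase']));            // null: untouched
```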

@ondrejmirtes ondrejmirtes merged commit ca2c937 into phpstan:1.12.x Nov 20, 2024
452 checks passed
@ondrejmirtes
Member

I reviewed the whole PR again and I like it a lot! Thank you very much.

If you wanted more challenges in the type system now that you are proficient in it, there's no shortage of them I can throw at you :)

@staabm
Contributor

staabm commented Nov 20, 2024

great job!

@VincentLanglet
Contributor

I reviewed the whole PR again and I like it a lot! Thank you very much.

If you wanted more challenges in the type system now that you are proficient in it, there's no shortage of them I can throw at you :)

I had a comment left about IntegerType::toString which is both lowercase and uppercase.

I made the PR #3651

@pmjones
Contributor Author

pmjones commented Nov 20, 2024

Thanks all, glad it turned out well!

@thg2k
Contributor

thg2k commented Nov 22, 2024

Do you think it would be doable to introduce something like html-string and sql-string? It wouldn't be too different from uppercase-string/lowercase-string. The idea: mysqli_real_escape_string(string) returns sql-string and htmlspecialchars(string) returns html-string; concatenation would then preserve the semantics (does it?), and so on. Then mysqli_query() would take sql-string and all output would take html-string. I'm just gathering some preliminary input; I'd open a separate issue to discuss that.

@ondrejmirtes
Member

@thg2k I plan to do "nominal string type". Which means you'll be able to do whatever-string without any special meaning to PHPStan. But whatever-string will only accept another whatever-string. And it will be the user's job to write their @phpstan-assert PHPDocs to narrow the types.

@ondrejmirtes
Member

The implementation would look very similar to this PR.
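What such user-written code might look like, as a purely hypothetical sketch: `html-string` is only a name taken from the question above and has no meaning to PHPStan as of this thread, and `escapeHtml`, `assertPlainText` and `output` are invented helpers. The PHPDoc tags are annotations only; the code runs the same without them.

```php
<?php
declare(strict_types=1);

/**
 * Produces a value of the hypothetical nominal type.
 *
 * @return html-string
 */
function escapeHtml(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/**
 * Narrows $value to the hypothetical type, or throws.
 *
 * @phpstan-assert html-string $value
 */
function assertPlainText(string $value): void
{
    // A string that escaping leaves unchanged contains no markup characters.
    if ($value !== htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8')) {
        throw new InvalidArgumentException('Value contains unescaped HTML characters.');
    }
}

/** @param html-string $html */
function output(string $html): void
{
    echo $html;
}
```

Under the plan described above, `output()` would accept only values that came from `escapeHtml()` or passed `assertPlainText()`, and deciding what counts as safe stays entirely in user code.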


6 participants