Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[test][tvos] Disable InvalidIVSizes overflow case for tvos #76725

Merged
merged 2 commits into from
Oct 11, 2022

Conversation

mdh1418
Copy link
Member

@mdh1418 mdh1418 commented Oct 6, 2022

This test has flakily caused app crashes on tvOS arm64 lanes. dotnet/arcade#11167
When the app crashes, it was observed that of the last tests ran, there commonly was only 2 InvalidIVSizes see
https://gist.github.com/mdh1418/563ce4066e16dfee055e0903e2c70a1e
https://gist.github.com/mdh1418/ed11c7ba361c3fdb8906e57034c58f90
https://gist.github.com/mdh1418/c831b807dc0d949bc01cdeee61a1d795

Whereas on a successful test suite run, there are 3.
https://gist.github.com/mdh1418/912914871d580475751d460719624224

Its suspected that the last test case is problematic, so disabling to make CI cleaner

@mdh1418 mdh1418 added the os-tvos Apple tvOS label Oct 6, 2022
@ghost
Copy link

ghost commented Oct 6, 2022

Tagging subscribers to 'os-tvos': @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

Issue Details

This test has flakily caused app crashes on tvOS arm64 lanes. dotnet/arcade#11167
When the app crashes, it was observed that of the last tests ran, there commonly was only 2 InvalidIVSizes see
https://gist.github.com/mdh1418/563ce4066e16dfee055e0903e2c70a1e
https://gist.github.com/mdh1418/ed11c7ba361c3fdb8906e57034c58f90
https://gist.github.com/mdh1418/c831b807dc0d949bc01cdeee61a1d795

Whereas on a successful test suite run, there are 3.
https://gist.github.com/mdh1418/912914871d580475751d460719624224

Its suspected that the last test case is problematic, so disabling to make CI cleaner

Author: mdh1418
Assignees: -
Labels:

os-tvos

Milestone: -

@ghost
Copy link

ghost commented Oct 6, 2022

Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

This test has flakily caused app crashes on tvOS arm64 lanes. dotnet/arcade#11167
When the app crashes, it was observed that of the last tests ran, there commonly was only 2 InvalidIVSizes see
https://gist.github.com/mdh1418/563ce4066e16dfee055e0903e2c70a1e
https://gist.github.com/mdh1418/ed11c7ba361c3fdb8906e57034c58f90
https://gist.github.com/mdh1418/c831b807dc0d949bc01cdeee61a1d795

Whereas on a successful test suite run, there are 3.
https://gist.github.com/mdh1418/912914871d580475751d460719624224

Its suspected that the last test case is problematic, so disabling to make CI cleaner

Author: mdh1418
Assignees: -
Labels:

area-System.Security, os-tvos

Milestone: -

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 6, 2022

/azp run runtime-ioslike

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 6, 2022

I'm planning to rerun the tvOS arm64 lane multiple times to see whether or not the System.Security.Cryptograph.Tests suite crashes

@steveisok
Copy link
Member

@bartonjs this seems to happen somewhat randomly. The logs aren't capturing the crash with any detail. Can you speculate as to what might be happening?

@vcsjones
Copy link
Member

vcsjones commented Oct 6, 2022

If this is indeed the culprit and we need to unblock CI here can we use an ActiveIssue and open an issue instead of a conditional to figure out why it's failing?

I can look at this later.

EDIT: oh it's a theory. I guess you can't put an ActiveIssue on just one case. I still think we should open a tracking issue and put that link in the skip exception.

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 6, 2022

@vcsjones I'm not certain which test is the actual one causing the crash so I'll create an issue and add it after running it a few times to ensure that its no longer crashing if that test case is skipped

Edit: Actually, I'll add the issue and link it in because I forgot to include the XUnitExtensions import anyways.

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 6, 2022

/azp run runtime-ioslike

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@bartonjs
Copy link
Member

bartonjs commented Oct 6, 2022

Can you speculate as to what might be happening?

I don't think there's anything wrong with the crypto code, but rather the runtime or the test harness/runner.

All the test does is:

  • Create a gigantic array (536870928 bytes).
    • If that throws OutOfMemoryException, OK.
  • Call Aes.Create().
    • Maybe we've pushed the limit here and we're OOMing, but the object isn't very big, so it seems unlikely. In that case I'd expect the test runner to still catch/log the OOM.
  • Pass the gigantic array to CreateEncryptor.
    • That clones the array (another 536870928 bytes, making for a total test demand of 1,073,741,856 bytes), then it's shown to be the wrong size and discarded. (literally using Object.Clone(), if that matters)
      • If this threw OOM the test would pass.
      • Then the cloned array goes out of scope and can be GC'd.
  • Pass the gigantic array to CreateDecryptor
    • Same as with CreateEncryptor, we get the peak demand of just over 1GB, and an OOM would pass.

So it seems likely that something during the execution of this test is falling over due to a low memory condition.

Yeah, we could change the product code to notice that the array is the wrong size before cloning it, but that doesn't seem to address that we have either a CLR problem where the system doesn't tolerate low memory properly or a test harness problem where we're not capable of capturing this condition/state.

@build-analysis build-analysis bot mentioned this pull request Oct 7, 2022
2 tasks
@mdh1418
Copy link
Member Author

mdh1418 commented Oct 7, 2022

/azp run runtime-ioslike

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 7, 2022

/azp run runtime-ioslike

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 7, 2022

Of the 3 runs of runtime-ioslike which runs tvOS arm64's System.Security.Cryptography.Tests suite, all have not crashed and the test is successfully ignored.

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 7, 2022

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@akoeplinger
Copy link
Member

I think we've seen similar issues in other test cases where gigantic arrays / OOM are involved, it looks like a more general Mono issue on low-memory devices. I'm fine with this as a quick fix but we should probably investigate more about the root cause.

@mdh1418
Copy link
Member Author

mdh1418 commented Oct 10, 2022

The remaining failures on runtime-extra-platforms are
#76830
dotnet/arcade#11210
dotnet/xharness#964
#76501

@steveisok steveisok merged commit 7215b38 into dotnet:main Oct 11, 2022
@steveisok steveisok deleted the tvos_skip_flaky_InvalidIVSizes branch October 11, 2022 16:16
@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants