
Merge TDSParserStateObject.StateSnapshot packet list management and reimplement #2164

Merged

merged 1 commit into dotnet:main from combine-32 on Oct 25, 2023

Conversation

Wraith2
Contributor

@Wraith2 Wraith2 commented Sep 20, 2023

In the async state snapshot, netfx and netcore currently use different approaches. Netfx uses a List<PacketData> and an int counter to identify the current location as it moves forward. Netcore uses a singly linked list and, when it needs to know the current position, calculates it by iterating the list. Both approaches have different perf issues: the list approach causes repeated array expansions when accumulating packets, and the singly linked list causes an unpleasant cpu perf scaling issue when it iterates to calculate the position.

This PR replaces both methods with a shared doubly linked list. The double linking allows easy access to the start of the list for iteration, the end of the list for accumulation, and easy navigation in either direction, which will be useful later when continue mode is added.
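For illustration, a minimal sketch of a shared doubly linked packet list along the lines described above; the type and member names here are hypothetical, not the ones used in the PR:

```csharp
// Hedged sketch only: illustrates the doubly linked packet list idea,
// not the actual SqlClient implementation.
internal sealed class PacketNode
{
    public byte[] Buffer;      // captured packet bytes
    public int DataLength;     // number of valid bytes in Buffer
    public PacketNode Prev;    // towards the first captured packet
    public PacketNode Next;    // towards the most recently captured packet
}

internal sealed class PacketList
{
    public PacketNode First;   // O(1) access to the head for replay
    public PacketNode Last;    // O(1) access to the tail for accumulation

    // Append a newly received packet at the tail of the list.
    public void Append(PacketNode node)
    {
        node.Prev = Last;
        node.Next = null;
        if (Last != null)
        {
            Last.Next = node;
        }
        else
        {
            First = node;
        }
        Last = node;
    }
}
```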

This PR also makes some other changes in this area:

  1. fixes the stack trace checking feature so that, if it is ever used, it will work. I can't see us using this feature currently.
  2. changes some function names to treat the snapshot more like an enumerable and uses MoveNext to advance (see the sketch after this list).
  3. adds debugger visualization support for packets (this will not be covered by tests since it is VS-only) and the ability to identify each packet uniquely in debug mode.
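A minimal sketch of the enumerator-style advance mentioned in item 2, building on the hypothetical types above; the actual method and type names in the PR may differ:

```csharp
// Hedged sketch: a cursor that walks the captured packets forward with MoveNext,
// the way an enumerator would. Illustrative only.
internal sealed class SnapshotCursor
{
    private readonly PacketList _list;
    private PacketNode _current;

    public SnapshotCursor(PacketList list) => _list = list;

    public PacketNode Current => _current;

    // Advance to the next captured packet; returns false when the replay is exhausted.
    public bool MoveNext()
    {
        _current = _current == null ? _list.First : _current.Next;
        return _current != null;
    }
}
```

On each replay attempt the cursor would be reset to the start and walked forward with MoveNext until parsing succeeds or the buffered packets run out.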

The changes yield good perf results for async strings. They effectively reduce the scaling factor that is applied by the repeated replay of the packet list and keep the memory footprint almost identical to the current async version. Note that there is still an unfavourable scaling factor for async over sync; this does not linearize the operation.

BDN perf results: async-cm31 is the current main branch and async-cm32 is this PR. The table lists sizes from 1 to 20 MB in interesting increments.

| size (MB) | sync | async-cm31 | async-cm32 |
|---:|---:|---:|---:|
| 1 | 5.692 | 10.382 | 9.228 |
| 5 | 19.78 | 293.99 | 96.88 |
| 10 | 39.5 | 2,297.18 | 495.26 |
| 15 | 53.92 | 9,409.01 | 1,401.61 |
| 20 | 80.46 | 28,579.30 | 2,867.15 |


@codecov

codecov bot commented Sep 20, 2023

Codecov Report

Patch coverage is 96.66% of modified lines.

| Files Changed | Coverage |
|---|---|
| ...c/Microsoft/Data/SqlClient/TdsParserStateObject.cs | 96.15% |
| ...oft/Data/SqlClient/TdsParserStateObject.netcore.cs | 100.00% |
| ...osoft/Data/SqlClient/TdsParserStateObject.netfx.cs | 100.00% |


@panoskj
Contributor

panoskj commented Sep 24, 2023

Good job, this is a much needed change.

I assume cm31 is the netcore version, but I am really curious about how the netfx version would perform in your benchmark.

You also mentioned 1-20 MB tables, but in reality it is the column/cell size that matters; it would be worth mentioning that too.

Finally, is there any reason you didn't use System.Collections.Generic.LinkedList?

For more context:

What happens in the async version is that, each time a packet arrives, the parser attempts to parse the next column. If the column cannot be parsed successfully yet (e.g. a large string column for which some packets have not been received), the packet is kept in a list (the list affected by this PR), so that the whole process can be repeated as soon as the next packet arrives.

Obviously a lot of work is repeated, which is why the parsing algorithm doesn't scale. A scalable implementation would either save its progress for each packet or estimate how many packets are needed in advance (to try parsing them just once).
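As a rough illustration of the repeated work (my arithmetic, using the 8 KB packets assumed in the next paragraph): an 8 MB column arrives as 1024 packets, and replaying from the first packet on every arrival touches about 1 + 2 + ... + 1024 = 1024 × 1025 / 2 ≈ 525,000 packets in total, versus 1024 if each packet were parsed only once.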

Anyway, if we try to read an 8 MB column, the current algorithm has to split it into 1024 packets of 8 KB each (assuming the 8 KB default packet size). For the 1024 packets of this example, an array/list of just 8 KB (assuming 8 bytes per reference) would be enough. So the problem of the netfx version that @Wraith2 mentioned (repeated array expansions) could be rectified by pre-allocating a little bit more memory (which would be nothing compared to the total size of the buffered packets). This array/list could also be re-used, since it only needs to be created once (apparently netfx didn't re-use it, until parts of #198 were ported to netfx recently by #2144). This approach (netfx) should minimize GC pressure as well, if PacketData becomes a struct.
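A minimal sketch of the pre-allocation idea above, assuming the 8 KB packet size from this example; the types and names are illustrative, not the actual SqlClient ones:

```csharp
using System.Collections.Generic;

// Hedged sketch: a struct packet record plus a list sized up front for the expected
// column size, so the list never has to grow (and re-copy) mid-read.
internal struct PacketData
{
    public byte[] Buffer;
    public int DataLength;
}

internal static class SnapshotStorage
{
    private const int PacketSize = 8 * 1024; // assumed default packet size

    public static List<PacketData> CreateFor(long expectedColumnBytes)
    {
        // Round up so the capacity covers a final partial packet.
        int packetCount = (int)((expectedColumnBytes + PacketSize - 1) / PacketSize);
        return new List<PacketData>(packetCount);
    }
}
```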

Long story short, it may be a bit hard to follow my reasoning here, but I am really curious how the netfx version would perform in your benchmark, especially if you pre-allocate some more memory for the list and make PacketData a struct.

@Wraith2
Contributor Author

Wraith2 commented Sep 24, 2023

The sizes I listed were for a single field in a table, so 1-20 MiB single strings.

How would the netfx version fare perf-wise? Probably only very slightly slower than the doubly linked list, with the speed difference being caused by GC. The repeated array allocations were the reason it was changed from an array to the linked list; the perf disadvantage of that wasn't immediately apparent. Remember that people haven't been complaining about the netfx version, only netcore.

Why didn't I use LinkedList<T>? Because I don't need most of the functions it provides.

@DavoudEshtehari DavoudEshtehari added this to the 5.2.0-preview4 milestone Oct 24, 2023
@DavoudEshtehari DavoudEshtehari added the 💡 Enhancement (New feature request) and ➕ Code Health (Changes related to source code improvements) labels Oct 24, 2023
@JRahnama JRahnama merged commit b86ddf1 into dotnet:main Oct 25, 2023
132 checks passed
@Wraith2 Wraith2 deleted the combine-32 branch October 25, 2023 10:45
@AlmightyLks

AlmightyLks commented Dec 7, 2023

When can 5.2.0-preview4 be expected to release? Or rather, the full 5.2.0 release 😄

@David-Engel
Contributor

@AlmightyLks preview 4 is out. The 5.2 GA estimate can be viewed on the milestones.
