Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small parameter encoding improvement #1296

Merged
merged 2 commits into from
Mar 30, 2023
Merged

Conversation

rusher
Copy link
Contributor

@rusher rusher commented Mar 20, 2023

Some types of parameters encoded as "text" or "length encoded text" are guaranteed to be ASCII. Using the ASCII encoder provides a significant performance improvement.

@bgrainger
Copy link
Member

Using the ASCII encoder provides a significant performance improvement.

Do you have a benchmark that quantifies the size of the improvement?

Note that Encoding.UTF8.GetEncoder() is cached in the existing code, but Encoding.ASCII.GetEncoder() isn't (in the PR), making me wonder if this PR creates additional garbage (that could impact performance)?

@@ -126,6 +129,30 @@ public void Write(ReadOnlySpan<char> chars, bool flush)
}
}

public void WriteAscii(ReadOnlySpan<char> chars, bool flush)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever called with flush: false? If not, the parameter could be removed and the code simplified.

Reallocate(chars.Length);
while (chars.Length > 0)
{
encoder.Convert(chars, m_output.Span, false, out var charsUsed, out var bytesUsed, out var completed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flush parameter could very likely be true here, because we're assuming the input string contains only ASCII characters, which convert one-to-one to bytes.

var encoder = Encoding.ASCII.GetEncoder();
if (m_output.Length < chars.Length)
Reallocate(chars.Length);
while (chars.Length > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this loop should be unnecessary, because ASCII characters convert to bytes one-to-one and the output buffer is resized to be large enough on the line above.

@bgrainger
Copy link
Member

FWIW, you can git commit --amend and git push --force to update the commit on an existing PR (if you want to make an edit to a PR that's already open).

@bgrainger
Copy link
Member

I ran the following BenchmarkDotNet test:

[Params("short", "a long ASCII string that's longer")]
public string Text { get; set; }

static Encoder s_utf8 = System.Text.Encoding.UTF8.GetEncoder();
static Encoder s_ascii = System.Text.Encoding.ASCII.GetEncoder();
static byte[] s_bytes = new byte[200];

[Benchmark(Baseline = true)]
public int CachedUtf8() => s_utf8.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);

[Benchmark]
public int CachedAscii() => s_ascii.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);

[Benchmark]
public int Ascii()
{
	var encoder = System.Text.Encoding.ASCII.GetEncoder();
	return encoder.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);
}
BenchmarkDotNet=v0.13.5, OS=Windows 10 (10.0.19045.2670/22H2/2022Update)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.100-preview.2.23157.25
  [Host] : .NET 7.0.4 (7.0.423.11508), X64 RyuJIT AVX2
Method Text Mean Error StdDev Ratio RatioSD
CachedUtf8 a lon(...)onger [33] 21.74 ns 0.393 ns 0.349 ns 1.00 0.00
CachedAscii a lon(...)onger [33] 20.32 ns 0.433 ns 0.724 ns 0.93 0.05
Ascii a lon(...)onger [33] 29.80 ns 0.570 ns 0.533 ns 1.37 0.03
CachedUtf8 short 17.96 ns 0.388 ns 0.518 ns 1.00 0.00
CachedAscii short 15.05 ns 0.332 ns 0.382 ns 0.84 0.03
Ascii short 23.96 ns 0.478 ns 1.233 ns 1.39 0.07

It looks like using ASCII encoding would be 5-10% faster for the encoding itself, and likely even faster if the complicated looping logic (for UTF-8 surrogate characters) was omitted from the ASCII version. However, this only occurs if the Encoder is cached; if a new one is created every time, using ASCII is slower.

@bgrainger
Copy link
Member

I tested another approach of just converting the chars to bytes:

[Benchmark]
public int Loop()
{
	var span = Text.AsSpan();
	var output = s_bytes.AsSpan();
	var i = 0;
	foreach (var ch in span)
		output[i++] = ch is >= (char)0x20 and <= (char)0x7F ? (byte) ch : (byte) '?';
	return i;
}

It's significantly faster for short strings, but surprisingly slow for longer ones:

Method Text Mean Error StdDev Ratio RatioSD
CachedUtf8 a lon(...)onger [33] 19.251 ns 0.2851 ns 0.2527 ns 1.00 0.00
CachedAscii a lon(...)onger [33] 19.381 ns 0.4126 ns 0.5218 ns 1.00 0.03
Ascii a lon(...)onger [33] 28.755 ns 0.5946 ns 0.9431 ns 1.48 0.06
Loop a lon(...)onger [33] 51.139 ns 0.8243 ns 0.7711 ns 2.66 0.05
CachedUtf8 short 17.090 ns 0.2262 ns 0.2005 ns 1.00 0.00
CachedAscii short 15.865 ns 0.3395 ns 0.5184 ns 0.93 0.03
Ascii short 22.774 ns 0.3866 ns 0.3616 ns 1.33 0.02
Loop short 6.848 ns 0.1077 ns 0.0955 ns 0.40 0.01

If we expected that most of the strings would be short, then perhaps it would be better on average?

@bgrainger
Copy link
Member

If we avoid the range check (by assuming the input is truly all ASCII), then performance improves:

[Benchmark]
public int Loop()
{
	var span = Text.AsSpan();
	var output = s_bytes.AsSpan();
	var i = 0;
	foreach (var ch in span)
		output[i++] = unchecked((byte) ch);
	return i;
}
Method Text Mean Error StdDev Ratio RatioSD
CachedUtf8 a lon(...)onger [33] 19.838 ns 0.4264 ns 0.8113 ns 1.00 0.00
CachedAscii a lon(...)onger [33] 19.108 ns 0.2547 ns 0.2258 ns 0.97 0.04
Ascii a lon(...)onger [33] 26.106 ns 0.3199 ns 0.2835 ns 1.32 0.05
Loop a lon(...)onger [33] 24.771 ns 0.5208 ns 0.5572 ns 1.25 0.06
CachedUtf8 short 18.620 ns 0.2729 ns 0.2553 ns 1.00 0.00
CachedAscii short 14.202 ns 0.1102 ns 0.0977 ns 0.76 0.01
Ascii short 21.940 ns 0.2139 ns 0.2001 ns 1.18 0.02
Loop short 4.114 ns 0.1111 ns 0.1697 ns 0.22 0.01

@rusher
Copy link
Contributor Author

rusher commented Mar 22, 2023

Right, there is indeed some corrections to add :

  • ASCII encoder needs to be cached.
  • there is no use for flush parameter
  • since we are sure that string is ascii AND output byte array size is > to string size, no need to loop.

btw, i've check that using latin1 can make even better performance :
this is because compare to ascii encoder, dotnet then permits to use SSE2 hardware, permitting more effective copy, by bunch of 8 bytes :

benchmark result

|      Method |                 Text |      Mean |     Error |
|------------ |--------------------- |----------:|----------:|
|  CachedUtf8 |                 1234 | 10.927 ns | 0.0709 ns |
| CachedAscii |                 1234 | 10.020 ns | 0.0872 ns |
| CachedLatin |                 1234 |  9.668 ns | 0.1090 ns |
|        Loop |                 1234 |  3.041 ns | 0.0366 ns |
|             |                      |           |           |
|  CachedUtf8 | 1970-(...)00000 [26] | 13.105 ns | 0.0968 ns |
| CachedAscii | 1970-(...)00000 [26] | 12.456 ns | 0.0602 ns |
| CachedLatin | 1970-(...)00000 [26] | 11.637 ns | 0.0718 ns |
|        Loop | 1970-(...)00000 [26] | 19.641 ns | 0.1691 ns |

(tested with code :

    [Params("1234", "1970-01-01 00:00:00.000000")]
    public string Text { get; set; }
    

    static Encoder s_utf8 = System.Text.Encoding.UTF8.GetEncoder();
    static Encoder s_ascii = System.Text.Encoding.ASCII.GetEncoder();
    static Encoder s_latin = System.Text.Encoding.Latin1.GetEncoder();
    static byte[] s_bytes = new byte[200];

    [Benchmark(Baseline = true)]
    public int CachedUtf8() => s_utf8.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);
    
    [Benchmark]
    public int CachedAscii() => s_ascii.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);
    
    
    [Benchmark]
    public int CachedLatin() => s_latin.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);

    //
    // [Benchmark]
    // public int Ascii()
    // {
    //     var encoder = System.Text.Encoding.ASCII.GetEncoder();
    //     return encoder.GetBytes(Text.AsSpan(), s_bytes.AsSpan(), true);
    // }
    
    [Benchmark]
    public int Loop()
    {
        var span = Text.AsSpan();
        var output = s_bytes.AsSpan();
        var i = 0;
        foreach (var ch in span)
            output[i++] = unchecked((byte) ch);
        return i;
    }

In this specific case, it might be even a better solution, so I'll change methods name and implementation accordingly.

This will be use mostly for decimal and datetime, so i think ASCII/latin1 encoder is preferable to loop.

@bgrainger
Copy link
Member

The ASCII encoder also appears to be vectorised: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs#L1155

Since the timing is close (I don't know why Latin1 benchmarks as slightly faster?) I would prefer to use the ASCII encoder as it matches the expected range of the input data better (i.e., there won't be any non-ASCII characters that need to be encoded as Latin1).

@rusher
Copy link
Contributor Author

rusher commented Mar 23, 2023

change submitted.

(for info: latin1 difference compare to ASCII https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Text/Latin1Utility.cs#L537 )

@@ -226,7 +237,8 @@ private void Reallocate(int additional = 0)
m_output = new(m_buffer, usedLength, m_buffer.Length - usedLength);
}

private Encoder? m_encoder;
private static Encoder m_utf8Encoder = Encoding.UTF8.GetEncoder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type isn't thread-safe and can't be shared across multiple instances of ByteBufferWriter. In particular, "An Encoder object maintains state information between successive calls to GetBytes or Convert methods so that it can correctly encode character sequences that span blocks."

Data would very likely be corrupted if m_utf8Encoder was used simultaneously by multiple threads.

@@ -226,7 +237,8 @@ private void Reallocate(int additional = 0)
m_output = new(m_buffer, usedLength, m_buffer.Length - usedLength);
}

private Encoder? m_encoder;
private static Encoder m_utf8Encoder = Encoding.UTF8.GetEncoder();
private static Encoder m_asciiEncoder = Encoding.ASCII.GetEncoder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is used to perform a "one-shot" conversion (from UTF-16 to ASCII) I don't think the documentation provides any guarantee that it's safe to do that simultaneously from multiple threads. Therefore, this should also not be static.

public static void WriteLengthEncodedAsciiString(this ByteBufferWriter writer, ReadOnlySpan<char> value)
{
writer.WriteLengthEncodedInteger((ulong) value.Length);
writer.Write(value, flush: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
writer.Write(value, flush: true);
writer.WriteAscii(value);

Did you mean WriteAscii here?

Comment on lines 240 to 241
private static Encoder m_utf8Encoder = Encoding.UTF8.GetEncoder();
private static Encoder m_asciiEncoder = Encoding.ASCII.GetEncoder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static Encoder m_utf8Encoder = Encoding.UTF8.GetEncoder();
private static Encoder m_asciiEncoder = Encoding.ASCII.GetEncoder();
private Encoder? m_utf8Encoder;
private Encoder? m_asciiEncoder;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, m_asciiEncoder probably isn't necessary... the Encoding.ASCII.GetBytes static method is probably sufficient since we're encoding the whole string at once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoding.ASCII.GetBytes is also much faster than a cached Encoder object:

Method Text Mean Error StdDev Ratio RatioSD
CachedUtf8 1234 16.584 ns 0.3541 ns 0.4349 ns 1.00 0.00
CachedAscii 1234 14.951 ns 0.1972 ns 0.1845 ns 0.91 0.03
Ascii 1234 5.975 ns 0.1181 ns 0.0986 ns 0.36 0.01
Loop 1234 3.002 ns 0.0853 ns 0.0798 ns 0.18 0.01
CachedUtf8 1970-(...)00000 [26] 19.331 ns 0.4160 ns 0.3891 ns 1.00 0.00
CachedAscii 1970-(...)00000 [26] 17.843 ns 0.2783 ns 0.2324 ns 0.92 0.02
Ascii 1970-(...)00000 [26] 9.683 ns 0.1879 ns 0.1569 ns 0.50 0.01
Loop 1970-(...)00000 [26] 20.023 ns 0.4302 ns 0.4955 ns 1.03 0.03

@rusher
Copy link
Contributor Author

rusher commented Mar 24, 2023

all goods remarks... I wouldn't have think a simple change like this would have taken so much exchanges.
I've confirm that Encoding.ASCII.GetBytes(ReadOnlySpan<char>, Span<bute>); is by far the best choice and force push the PR accordingly. I think this is now ok.

…re guaranteed to be ASCII. Using the ASCII encoder provides a significant performance improvement.

Signed-off-by: rusher <diego.dupin@gmail.com>
Co-authored-by: Bradley Grainger <bgrainger@gmail.com>
@bgrainger bgrainger merged commit d04e24a into mysql-net:master Mar 30, 2023
@bgrainger
Copy link
Member

Thanks for the improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants