k12: improve API and support multithreaded computation
Use an options-style API, so that the common case is very simple:

    h := k12.NewDraft10()

but we can provide options elegantly:

    h := k12.NewDraft10(
        k12.WithContext([]byte("some context")),
        k12.WithWorkers(runtime.NumCPU()),
    )
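
For readers unfamiliar with the pattern, here is a minimal, self-contained
sketch of how an options-style constructor is commonly built in Go. Only the
names WithContext and WithWorkers come from this commit; the config struct,
its fields, newConfig, and the clamping of worker counts below 1 are
assumptions made for illustration and are not taken from the k12 package.

```go
package main

import (
	"fmt"
	"runtime"
)

// config holds the settings an Option can adjust. The struct and its field
// names are hypothetical; the real k12 state is not shown in this commit.
type config struct {
	context []byte
	workers int
}

// Option mutates a config. This is the usual "functional options" shape.
type Option func(*config)

// WithContext sets the customization string.
func WithContext(ctx []byte) Option {
	return func(c *config) { c.context = ctx }
}

// WithWorkers sets the number of workers. Values below 1 are clamped to 1,
// which is an assumption of this sketch.
func WithWorkers(n int) Option {
	return func(c *config) {
		if n < 1 {
			n = 1
		}
		c.workers = n
	}
}

// newConfig applies the given options on top of single-worker defaults.
func newConfig(opts ...Option) config {
	c := config{workers: 1}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	plain := newConfig() // common case: no options at all
	tuned := newConfig(
		WithContext([]byte("some context")),
		WithWorkers(runtime.NumCPU()),
	)
	fmt.Println(plain.workers, string(tuned.context), tuned.workers)
}
```

The variadic parameter is what keeps the zero-option call short, and new
options can be added later without changing the constructor's signature.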

Allows multithreaded computation via the WithWorkers() option.
On an M2 Pro this scales well with a few workers, but isn't able to
utilize all cores effectively. In fact, on the 32M and 3276M inputs it
performs better with 8 workers than with all 12:

    BenchmarkK12_100B-12          	 5191293	       227.9 ns/op	 438.71 MB/s
    BenchmarkK12_10K-12           	  104174	     11310 ns/op	 884.14 MB/s
    BenchmarkK12_100K-12          	   27028	     44298 ns/op	2257.42 MB/s
    BenchmarkK12_3M-12            	    1185	   1001610 ns/op	3271.53 MB/s
    BenchmarkK12_32M-12           	     121	   9887721 ns/op	3314.01 MB/s
    BenchmarkK12_327M-12          	      12	  98895496 ns/op	3313.40 MB/s
    BenchmarkK12_3276M-12         	       2	 993438417 ns/op	3298.44 MB/s
    BenchmarkK12x2_32M-12         	     204	   5839409 ns/op	5611.53 MB/s
    BenchmarkK12x2_327M-12        	      20	  57119269 ns/op	5736.77 MB/s
    BenchmarkK12x2_3276M-12       	       2	 572368062 ns/op	5724.99 MB/s
    BenchmarkK12x4_32M-12         	     375	   3187552 ns/op	10279.99 MB/s
    BenchmarkK12x4_327M-12        	      42	  28103838 ns/op	11659.62 MB/s
    BenchmarkK12x4_3276M-12       	       2	 552996646 ns/op	11851.07 MB/s
    BenchmarkK12x8_32M-12         	     463	   2617319 ns/op	12519.68 MB/s
    BenchmarkK12x8_327M-12        	      63	  17751681 ns/op	18459.10 MB/s
    BenchmarkK12x8_3276M-12       	       4	 297381292 ns/op	22037.70 MB/s
    BenchmarkK12xCPUs_32M-12      	     434	   2785918 ns/op	11762.01 MB/s
    BenchmarkK12xCPUs_327M-12     	      69	  17143785 ns/op	19113.63 MB/s
    BenchmarkK12xCPUs_3276M-12    	       4	 325314010 ns/op	20145.46 MB/s

We only reach 22 GB/s (with 8 workers) instead of the lower bound of
33 GB/s (10 × 3.3 GB/s single-threaded) expected with 10 performance cores.
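
The scaling gap can be made concrete with a little arithmetic on the 3276M
rows of the benchmark table. The sketch below is only a calculation over
those reported MB/s figures; the helper names speedup and efficiency are
made up for this example.

```go
package main

import "fmt"

// speedup reports how many times faster a multi-worker run is than the
// single-worker baseline, given both throughputs in MB/s.
func speedup(baseline, parallel float64) float64 {
	return parallel / baseline
}

// efficiency is the speedup divided by the worker count (1.0 is perfect).
func efficiency(baseline, parallel float64, workers int) float64 {
	return speedup(baseline, parallel) / float64(workers)
}

func main() {
	const baseline = 3298.44 // BenchmarkK12_3276M, MB/s

	// Throughputs of the 3276M runs, copied from the table above.
	runs := []struct {
		workers int
		mbps    float64
	}{
		{2, 5724.99},
		{4, 11851.07},
		{8, 22037.70},
		{12, 20145.46}, // xCPUs on a 12-thread machine
	}

	for _, r := range runs {
		fmt.Printf("%2d workers: %.2fx speedup, %.0f%% efficiency\n",
			r.workers,
			speedup(baseline, r.mbps),
			100*efficiency(baseline, r.mbps, r.workers))
	}

	// Ten cores each matching the single-worker rate would give the
	// 33 GB/s figure quoted above.
	fmt.Printf("ideal with 10 cores: %.1f GB/s\n", 10*baseline/1000)
}
```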

Adds {Max,Next}WriteSize to suggest to the caller how large to make
their Write() calls.
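
As a rough illustration of how a caller might act on such a hint, the sketch
below splits a buffer into writes no larger than the suggested size. The
commit message does not give the signatures, so the sizeHinter interface
(with NextWriteSize() int), the fakeHasher stand-in, and writeAll are all
hypothetical and not the k12 API.

```go
package main

import "fmt"

// sizeHinter is a hypothetical interface: an io.Writer-like state that can
// also suggest how many bytes the next Write should carry.
type sizeHinter interface {
	Write(p []byte) (int, error)
	NextWriteSize() int
}

// fakeHasher stands in for a hash state. It records the size of every write
// and always suggests the same chunk size.
type fakeHasher struct {
	chunk  int
	writes []int
}

func (f *fakeHasher) Write(p []byte) (int, error) {
	f.writes = append(f.writes, len(p))
	return len(p), nil
}

func (f *fakeHasher) NextWriteSize() int { return f.chunk }

// writeAll feeds data to h in pieces no larger than the current hint.
// A non-positive hint is treated as "no preference".
func writeAll(h sizeHinter, data []byte) error {
	for len(data) > 0 {
		n := h.NextWriteSize()
		if n <= 0 || n > len(data) {
			n = len(data)
		}
		if _, err := h.Write(data[:n]); err != nil {
			return err
		}
		data = data[n:]
	}
	return nil
}

func main() {
	h := &fakeHasher{chunk: 4}
	if err := writeAll(h, make([]byte, 10)); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(h.writes)
}
```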
bwesterb committed Jun 4, 2023
1 parent 4da7865 commit 1cec358
Showing 3 changed files with 448 additions and 115 deletions.
