Merge pull request #6 from apache/sphinx-docs
First attempt at sphinx documentation
AlexanderSaydakov committed Sep 12, 2023
2 parents 7b3843a + 621b268 commit 6d8b636
Showing 15 changed files with 336 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .asf.yaml
@@ -0,0 +1,4 @@
github:
  homepage: https://datasketches.apache.org
  ghp_branch: gh-pages
  ghp_path: /docs
32 changes: 32 additions & 0 deletions .github/workflows/sphinx.yml
@@ -0,0 +1,32 @@
name: Sphinx

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  build-documentation:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Install Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: Install Datasketches and Sphinx
        run: python -m pip install . sphinx==7.2.4 sphinx-rtd-theme
      - name: Run Sphinx
        run: cd docs; make html
      - name: Pages Deployment
        uses: peaceiris/actions-gh-pages@v3.9.3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs/build/html
          destination_dir: docs/${{ github.ref_name }}
          enable_jekyll: false
          allow_empty_commit: false
          force_orphan: false
          publish_branch: gh-pages
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
50 changes: 50 additions & 0 deletions docs/README.md
@@ -0,0 +1,50 @@
Follow these steps to build the documentation.
1. Clone the repository to an appropriate location: `git clone https://github.com/apache/datasketches-python.git`
2. Switch to the correct branch: `git checkout python-docs`.
3. In the project root, activate the virtual environment: `source python-docs-venv/bin/activate`

If the virtual environment fails to activate, you may need to install `virtualenv`,
then create the environment and install the packages manually as shown below.
(Note: these commands assume `python` is aliased to `python3`; use whichever is appropriate for your installation.)
```
python -m venv python-docs-venv # create a new virtual env named python-docs-venv
source python-docs-venv/bin/activate
python -m pip install sphinx
python -m pip install sphinx-rtd-theme
```
4. In the project root, run `python3 -m pip install .` to build the Python bindings.
5. Build and open the documentation:
```
cd docs
make html
open build/html/index.html
```
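
Where `make` is not available, the build step can be approximated from Python. The sketch below is only an illustration: it assumes `sphinx-build` is on the `PATH` and is run from the docs directory, and it reuses the `source` and `build` directory names and the `-M` flag from the Makefile. The helper names `sphinx_command` and `build_docs` are hypothetical and not part of this repository.

```python
import subprocess


def sphinx_command(target, sourcedir="source", builddir="build",
                   sphinxbuild="sphinx-build"):
    """Return the argument list that `make <target>` passes to Sphinx."""
    # Mirrors the Makefile rule: $(SPHINXBUILD) -M <target> SOURCEDIR BUILDDIR
    return [sphinxbuild, "-M", target, sourcedir, builddir]


def build_docs(target="html"):
    """Run the Sphinx build; raises CalledProcessError if the build fails."""
    subprocess.run(sphinx_command(target), check=True)
```

Calling `build_docs()` from the docs directory should then behave like `make html`, with output under `build/html`.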

## Problems
The `density_sketch` and `tuple_sketch` are not yet included.
I have left their files out to avoid cluttering the PR with things that may not work.
You can include the density sketch by creating a `density_sketch.rst` file in the same location as
the other `X.rst` sketch files and copying in the following:

```
Density Sketch
--------------

.. autoclass:: datasketches.density_sketch
   :members:
   :undoc-members:

.. autoclass:: datasketches.GaussianKernel
   :members:
```
Additionally, you will need to add the following to `index.rst`:
```
Density Estimation
##################

.. toctree::
   :maxdepth: 1

   density_sketch
```
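
Since every `X.rst` page follows the same pattern, the stub text could also be generated rather than copied by hand. The following is a minimal sketch under stated assumptions: the function name `sketch_rst` is hypothetical, and it emits both `:members:` and `:undoc-members:` for every class, which differs slightly from the hand-written example above (where `GaussianKernel` has only `:members:`).

```python
def sketch_rst(title, class_names, module="datasketches"):
    """Build the text of an X.rst page: title, underline, one autoclass block per class."""
    # The underline must be at least as long as the title for valid reStructuredText.
    lines = [title, "-" * len(title), ""]
    for name in class_names:
        lines += [
            f".. autoclass:: {module}.{name}",
            "   :members:",
            "   :undoc-members:",
            "",
        ]
    return "\n".join(lines)


# Print the stub for the density sketch page described above.
print(sketch_rst("Density Sketch", ["density_sketch", "GaussianKernel"]))
```

The printed text can be saved as `density_sketch.rst` next to the other sketch pages; the `index.rst` entry still has to be added manually.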

35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
36 changes: 36 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,36 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

import sys
import os

# Add the package and source directories to sys.path so that Sphinx can find the source code.
sys.path.insert(0, os.path.abspath("../../datasketches"))
sys.path.insert(0, os.path.abspath("../../src"))


project = 'datasketches'
copyright = '2023'
author = 'Apache Software Foundation'
release = '0.1'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = ["sphinx.ext.autodoc", "sphinx.ext.autosummary"]

templates_path = ['_templates']
exclude_patterns = []



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
6 changes: 6 additions & 0 deletions docs/source/count_min_sketch.rst
@@ -0,0 +1,6 @@
CountMin Sketch
---------------

.. autoclass:: _datasketches.count_min_sketch
   :members:
   :undoc-members:
7 changes: 7 additions & 0 deletions docs/source/cpc.rst
@@ -0,0 +1,7 @@
Compressed Probabilistic Counting (CPC)
---------------------------------------
The *Compressed Probabilistic Counting* sketch is a space-efficient method for estimating cardinalities of sets.

.. autoclass:: _datasketches.cpc_sketch
   :members:
   :undoc-members:
6 changes: 6 additions & 0 deletions docs/source/frequent_items.rst
@@ -0,0 +1,6 @@
Frequent Items
--------------

.. autoclass:: _datasketches.frequent_items_sketch
   :members:
   :undoc-members:
7 changes: 7 additions & 0 deletions docs/source/hyper_log_log.rst
@@ -0,0 +1,7 @@
HyperLogLog (HLL)
-----------------
The HyperLogLog (HLL) sketch is a space-efficient method for estimating cardinalities of sets.

.. autoclass:: _datasketches.hll_sketch
   :members:
   :undoc-members:
73 changes: 73 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,73 @@
.. datasketches documentation master file, created by
   sphinx-quickstart on Tue Jul 25 11:04:59 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Apache DataSketches
=================================================

**DataSketches** are highly-efficient algorithms to analyze big data quickly.


Counting Distincts
##################
..
   maxdepth: 1 means only the heading is printed in the contents

.. toctree::
   :maxdepth: 1

   hyper_log_log
   cpc
   theta

Frequency Estimation
##########################

.. toctree::
   :maxdepth: 1

   count_min_sketch


Frequent Items
##########################
This problem may also be known as **heavy hitters** or **TopK**

.. toctree::
   :maxdepth: 1

   frequent_items

Quantile Estimation
###################

.. toctree::
   :maxdepth: 1

   kll
   req
   quantiles_depr

.. note::

   This project is under active development.


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`


.. .. automodule:: datasketches
.. :members:
.. .. automodule:: _datasketches
.. :members:
..
..
.. distinct_count
19 changes: 19 additions & 0 deletions docs/source/kll.rst
@@ -0,0 +1,19 @@
KLL Sketch
----------

.. autoclass:: _datasketches.kll_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_doubles_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_items_sketch
   :members:
   :undoc-members:

19 changes: 19 additions & 0 deletions docs/source/quantiles_depr.rst
@@ -0,0 +1,19 @@
Quantiles Sketch (Deprecated)
-----------------------------
This is a deprecated quantiles sketch that is included for cross-language compatibility.

.. autoclass:: _datasketches.quantiles_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_doubles_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_items_sketch
   :members:
   :undoc-members:
14 changes: 14 additions & 0 deletions docs/source/req.rst
@@ -0,0 +1,14 @@
Relative Error Quantiles (REQ) Sketch
-------------------------------------

.. autoclass:: _datasketches.req_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.req_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.req_items_sketch
   :members:
   :undoc-members:
8 changes: 8 additions & 0 deletions docs/source/theta.rst
@@ -0,0 +1,8 @@
Theta Sketch
------------
The *Theta Sketch* is a space-efficient method for estimating cardinalities of sets.
It also handles set operations (such as union, intersection, and difference) while maintaining good accuracy.

.. autoclass:: _datasketches.theta_sketch
   :members:
   :undoc-members:
