First attempt at sphinx documentation #6

Merged
merged 6 commits into from
Sep 12, 2023
Changes from 4 commits
4 changes: 4 additions & 0 deletions .asf.yaml
@@ -0,0 +1,4 @@
github:
  homepage: https://datasketches.apache.org
  ghp_branch: gh-pages
  ghp_path: /docs
32 changes: 32 additions & 0 deletions .github/workflows/sphinx.yml
@@ -0,0 +1,32 @@
name: Sphinx

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  build-documentation:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3
    - name: Install Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.x'
    - name: Install Datasketches and Sphinx
      run: python -m pip install . sphinx==7.2.4 sphinx-rtd-theme

Review comment (Contributor): With luck we'll be able to remove the pinned version, or else use the fix in the issue about this: sphinx-doc/sphinx#11662 (comment)

    - name: Run Sphinx
      run: cd docs; make html
    - name: Pages Deployment
      uses: peaceiris/actions-gh-pages@v3.9.3
      with:
        github_token: ${{ secrets.GITHUB_TOKEN }}
        publish_dir: ./docs/build/html
        destination_dir: docs/${{ github.ref_name }}
        enable_jekyll: false
        allow_empty_commit: false
        force_orphan: false
        publish_branch: gh-pages
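
The install step in this workflow pins `sphinx==7.2.4`, and the review comments in this PR attribute the missing pages to Sphinx 7.2.5. A minimal sketch of a guard that flags that release follows. It assumes, based only on those comments, that 7.2.5 is the sole affected version; the helper name `sphinx_pin_ok` is hypothetical and not part of the PR.

```python
# Hypothetical helper, not part of the PR.
# Assumption taken from the review comments: only 7.2.5 is affected.
AFFECTED_VERSIONS = {"7.2.5"}


def sphinx_pin_ok(installed_version: str) -> bool:
    """Return True when the given Sphinx release is not in the affected set."""
    return installed_version.strip() not in AFFECTED_VERSIONS


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        found = version("sphinx")
    except PackageNotFoundError:
        print("Sphinx is not installed")
    else:
        status = "ok" if sphinx_pin_ok(found) else "reported as problematic in this PR"
        print(found, status)
```

Such a check could run before `make html` so that a stale environment fails early rather than silently producing incomplete pages.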
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
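
The catch-all rule forwards any target to `sphinx-build -M`. As an illustration, the sketch below assembles the command that `make html` would run with this Makefile's defaults. The helper name `make_mode_command` is hypothetical, and the shell's quoting of the source and build directories is not modeled.

```python
import shlex


def make_mode_command(target, sourcedir="source", builddir="build",
                      sphinxbuild="sphinx-build", sphinxopts="", o=""):
    """Build the argument list the Makefile's catch-all rule hands to the shell."""
    argv = [sphinxbuild, "-M", target, sourcedir, builddir]
    # SPHINXOPTS and O are unquoted in the recipe, so the shell splits them on whitespace.
    argv += shlex.split(sphinxopts) + shlex.split(o)
    return argv


if __name__ == "__main__":
    print(" ".join(make_mode_command("html")))
```

Extra options given as `make html O="..."` end up at the tail of the same command line.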
50 changes: 50 additions & 0 deletions docs/README.md
@@ -0,0 +1,50 @@
Follow these steps to build the documentation.
1. Clone the repository in an appropriate location: `git clone https://github.com/apache/datasketches-python.git`
2. Switch to the correct branch: `git checkout python-docs`.
3. In the project root, run `source python-docs-venv/bin/activate`.

If there are problems activating the virtual environment, you may need to install `virtualenv`
and install the packages manually, as below.
(Note: my environment has `python` aliased to `python3`, so use whichever is appropriate for your installation.)
```
python -m venv python-docs-venv # create a new virtual env named python-docs-venv
source python-docs-venv/bin/activate
python -m pip install sphinx
python -m pip install sphinx-rtd-theme
```
4. In the project root, run `python3 -m pip install .` to build the Python bindings.
5. Build and open the documentation:
```
cd docs
make html
open build/html/index.html
```

## Problems
The `density_sketch` and `tuple_sketch` are not yet included.

Review comment (Contributor): This seems to be an interaction with Sphinx 7.2.5; 7.2.4 should work, and that's why it's pinned in the YAML.

We should probably look into including the packages in conf.py as a workaround, only because that seems like it should be mostly harmless.

I have not included the files, to avoid cluttering the PR with things that may not work.
You can include them by creating a `density_sketch.rst` file in the same location as
all of the other `X.rst` files for the sketches and copying in the following:

```
Density Sketch
--------------

.. autoclass:: datasketches.density_sketch
   :members:
   :undoc-members:

.. autoclass:: datasketches.GaussianKernel
   :members:
```
Additionally, you will need to add the following to `index.rst`:
```
Density Estimation
##################

.. toctree::
   :maxdepth: 1

   density_sketch
```
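
The same recipe presumably applies to `tuple_sketch` or any other missing page. A minimal sketch that generates such a stub follows, assuming the page layout shown in this README. The helper `autoclass_page` is hypothetical, and the class names passed in must match what the installed `datasketches` package actually exports.

```python
def autoclass_page(title, classes, undoc_members=True):
    """Return reStructuredText for a page with one autoclass directive per class."""
    lines = [title, "-" * len(title), ""]
    for cls in classes:
        lines.append(f".. autoclass:: {cls}")
        lines.append("   :members:")
        if undoc_members:
            lines.append("   :undoc-members:")
        lines.append("")  # blank line closes each directive block
    return "\n".join(lines)


if __name__ == "__main__":
    print(autoclass_page("Density Sketch", ["datasketches.density_sketch"]))
```

Writing the returned string to `density_sketch.rst` next to the other sketch pages reproduces the first directive of the stub above; the page still has to be listed in a `toctree` in `index.rst`.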

35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
36 changes: 36 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,36 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

import sys
import os

# need to fix the paths so that sphinx can find the source code.
sys.path.insert(0, os.path.abspath("../../datasketches"))
sys.path.insert(0, os.path.abspath("../../src"))


project = 'datasketches'
copyright = ''
author = 'Apache Software Foundation'
release = '0.1'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = ["sphinx.ext.autodoc","sphinx.ext.autosummary"]

templates_path = ['_templates']
exclude_patterns = []



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
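
Sphinx executes `conf.py` with the configuration directory as the working directory, so the two relative `sys.path` entries in this file resolve against `docs/source`. A minimal sketch of that resolution follows; `resolved_paths` is a hypothetical helper, and the repository root `/repo` is a made-up example location.

```python
import posixpath  # POSIX-style paths keep the example identical on every platform


def resolved_paths(conf_dir, relative=("../../datasketches", "../../src")):
    """Resolve conf.py's relative sys.path entries against the directory holding conf.py."""
    return [posixpath.normpath(posixpath.join(conf_dir, rel)) for rel in relative]


if __name__ == "__main__":
    for path in resolved_paths("/repo/docs/source"):
        print(path)
```

With `conf.py` two levels below the repository root, both entries land on top-level directories of the checkout.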
6 changes: 6 additions & 0 deletions docs/source/count_min_sketch.rst
@@ -0,0 +1,6 @@
CountMin Sketch
---------------

.. autoclass:: _datasketches.count_min_sketch
   :members:
   :undoc-members:
7 changes: 7 additions & 0 deletions docs/source/cpc.rst
@@ -0,0 +1,7 @@
Compressed Probabilistic Counting (CPC)
---------------------------------------
The *Compressed Probabilistic Counting* sketch is a space-efficient method for estimating cardinalities of sets.

.. autoclass:: _datasketches.cpc_sketch
   :members:
   :undoc-members:
6 changes: 6 additions & 0 deletions docs/source/frequent_items.rst
@@ -0,0 +1,6 @@
Frequent Items
--------------

.. autoclass:: _datasketches.frequent_items_sketch
   :members:
   :undoc-members:
7 changes: 7 additions & 0 deletions docs/source/hyper_log_log.rst
@@ -0,0 +1,7 @@
HyperLogLog (HLL)
-----------------
The HyperLogLog (HLL) sketch is a space-efficient method for estimating cardinalities of sets.

.. autoclass:: _datasketches.hll_sketch
   :members:
   :undoc-members:
73 changes: 73 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,73 @@
.. datasketches documentation master file, created by
   sphinx-quickstart on Tue Jul 25 11:04:59 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Apache DataSketches
=================================================

**DataSketches** are highly efficient algorithms to analyze big data quickly.


Counting Distincts
##################
..
   maxdepth: 1 means only the heading is printed in the contents

.. toctree::
   :maxdepth: 1

   hyper_log_log
   cpc
   theta

Frequency Estimation
##########################

.. toctree::
   :maxdepth: 1

   count_min_sketch


Frequent Items
##########################
This problem may also be known as **heavy hitters** or **TopK**

.. toctree::
   :maxdepth: 1

   frequent_items

Quantile Estimation
###################

.. toctree::
   :maxdepth: 1

   kll
   req
   quantiles_depr

.. note::

   This project is under active development.


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`


.. .. automodule:: datasketches
.. :members:

.. .. automodule:: _datasketches
.. :members:

..
..

.. distinct_count
19 changes: 19 additions & 0 deletions docs/source/kll.rst
@@ -0,0 +1,19 @@
KLL Sketch
----------

.. autoclass:: _datasketches.kll_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_doubles_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.kll_items_sketch
   :members:
   :undoc-members:

19 changes: 19 additions & 0 deletions docs/source/quantiles_depr.rst
@@ -0,0 +1,19 @@
Quantiles Sketch (Deprecated)
-----------------------------
This is a deprecated quantiles sketch that is included for cross-language compatibility.

.. autoclass:: _datasketches.quantiles_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_doubles_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.quantiles_items_sketch
   :members:
   :undoc-members:
14 changes: 14 additions & 0 deletions docs/source/req.rst
@@ -0,0 +1,14 @@
Relative Error Quantiles (REQ) Sketch
-------------------------------------

.. autoclass:: _datasketches.req_ints_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.req_floats_sketch
   :members:
   :undoc-members:

.. autoclass:: _datasketches.req_items_sketch
   :members:
   :undoc-members:
8 changes: 8 additions & 0 deletions docs/source/theta.rst
@@ -0,0 +1,8 @@
Theta Sketch
------------
The *Theta Sketch* is a space-efficient method for estimating cardinalities of sets.
It can also easily handle set operations (such as union, intersection, difference) while maintaining good accuracy.

.. autoclass:: _datasketches.theta_sketch
   :members:
   :undoc-members:
25 changes: 25 additions & 0 deletions requirements.txt
@@ -0,0 +1,25 @@
alabaster==0.7.13
Babel==2.12.1
certifi==2023.7.22
charset-normalizer==3.2.0
datasketches @ file:///Users/charlied/personal_dev/datasketches-python
docutils==0.18.1
idna==3.4
imagesize==1.4.1
Jinja2==3.1.2
MarkupSafe==2.1.3
numpy==1.25.2
packaging==23.1
Pygments==2.16.1
requests==2.31.0
snowballstemmer==2.2.0
Sphinx==7.2.5
sphinx-rtd-theme==1.3.0
sphinxcontrib-applehelp==1.0.7
sphinxcontrib-devhelp==1.0.5
sphinxcontrib-htmlhelp==2.0.4
sphinxcontrib-jquery==4.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.6
sphinxcontrib-serializinghtml==1.1.9
urllib3==2.0.4