DOC Fix dropdown-related warnings #27418

Merged 3 commits on Sep 20, 2023
120 changes: 65 additions & 55 deletions doc/modules/compose.rst
@@ -54,9 +54,8 @@ The last estimator may be any type (transformer, classifier, etc.).
Usage
-----

Build a pipeline
................

The :class:`Pipeline` is built using a list of ``(key, value)`` pairs, where
the ``key`` is a string containing the name you want to give this step and ``value``
@@ -70,6 +69,10 @@ is an estimator object::
>>> pipe
Pipeline(steps=[('reduce_dim', PCA()), ('clf', SVC())])

|details-start|
**Shorthand version using :func:`make_pipeline`**
|details-split|

The utility function :func:`make_pipeline` is a shorthand
for constructing pipelines;
it takes a variable number of estimators and returns a pipeline,
@@ -81,14 +84,26 @@ filling in the names automatically::

|details-end|
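As a runnable sketch of the shorthand described above (the step names are filled in automatically from the lowercased estimator class names):

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# make_pipeline derives each step name from its estimator's class name
pipe = make_pipeline(PCA(), SVC())
print([name for name, _ in pipe.steps])  # ['pca', 'svc']
```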

Access pipeline steps
.....................

The estimators of a pipeline are stored as a list in the ``steps`` attribute.
A sub-pipeline can be extracted using the slicing notation commonly used
for Python sequences such as lists or strings (although only a step of 1 is
permitted). This is convenient for performing only some of the transformations
(or their inverse):

>>> pipe[:1]
Pipeline(steps=[('reduce_dim', PCA())])
>>> pipe[-1:]
Pipeline(steps=[('clf', SVC())])

|details-start|
**Accessing a step by name or position**
|details-split|


A specific step can also be accessed by index or name by indexing (with ``[idx]``) the
pipeline::

>>> pipe.steps[0]
('reduce_dim', PCA())
@@ -97,36 +112,61 @@ Pipeline::
>>> pipe['reduce_dim']
PCA()

`Pipeline`'s `named_steps` attribute allows accessing steps by name with tab
completion in interactive environments::

>>> pipe.named_steps.reduce_dim is pipe['reduce_dim']
True

|details-end|

Tracking feature names in a pipeline
....................................

To enable model inspection, :class:`~sklearn.pipeline.Pipeline` has a
``get_feature_names_out()`` method, just like all transformers. You can use
pipeline slicing to get the feature names going into each step::

>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_selection import SelectKBest
>>> iris = load_iris()
>>> pipe = Pipeline(steps=[
... ('select', SelectKBest(k=2)),
... ('clf', LogisticRegression())])
>>> pipe.fit(iris.data, iris.target)
Pipeline(steps=[('select', SelectKBest(...)), ('clf', LogisticRegression(...))])
>>> pipe[:-1].get_feature_names_out()
array(['x2', 'x3'], ...)

|details-start|
**Customize feature names**
|details-split|

You can also provide custom feature names for the input data using
``get_feature_names_out``::

>>> pipe[:-1].get_feature_names_out(iris.feature_names)
array(['petal length (cm)', 'petal width (cm)'], ...)

|details-end|

.. _pipeline_nested_parameters:

Access to nested parameters
...........................

It is common to adjust the parameters of an estimator within a pipeline. Such a
parameter is nested because it belongs to a particular sub-step. Parameters of
the estimators in the pipeline are accessible using the
``<estimator>__<parameter>`` syntax::

>>> pipe.set_params(clf__C=10)
Pipeline(steps=[('reduce_dim', PCA()), ('clf', SVC(C=10))])
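The same double-underscore path works for reading a nested parameter back via ``get_params`` (a minimal sketch):

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])
pipe.set_params(clf__C=10)

# get_params flattens nested parameters under the same <step>__<param> keys
print(pipe.get_params()["clf__C"])  # 10
```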

|details-start|
**When does it matter?**
|details-split|

This is particularly important for doing grid searches::

>>> from sklearn.model_selection import GridSearchCV
@@ -143,36 +183,11 @@ ignored by setting them to ``'passthrough'``::
... clf__C=[0.1, 10, 100])
>>> grid_search = GridSearchCV(pipe, param_grid=param_grid)
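Putting it together, fitting the grid search tunes the nested parameters end to end (a sketch on the iris dataset; the grid is shrunk here to keep the run fast):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])
param_grid = dict(reduce_dim__n_components=[2, 3], clf__C=[0.1, 10])
grid_search = GridSearchCV(pipe, param_grid=param_grid)
grid_search.fit(X, y)

# best_params_ reports the winning values under the nested keys
print(sorted(grid_search.best_params_))  # ['clf__C', 'reduce_dim__n_components']
```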


.. topic:: See Also:

* :ref:`composite_grid_search`
|details-end|

.. topic:: Examples:

@@ -184,11 +199,6 @@ You can also provide custom feature names for the input data using
* :ref:`sphx_glr_auto_examples_compose_plot_compare_reduction.py`
* :ref:`sphx_glr_auto_examples_miscellaneous_plot_pipeline_display.py`


.. _pipeline_cache:

9 changes: 3 additions & 6 deletions doc/modules/feature_extraction.rst
@@ -225,7 +225,7 @@ it is advisable to use a power of two as the ``n_features`` parameter;
otherwise the features will not be mapped evenly to the columns.

.. topic:: References:

* `MurmurHash3 <https://github.com/aappleby/smhasher>`_.

|details-end|
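For example, with :class:`FeatureHasher` (a sketch; ``n_features=2**10`` follows the power-of-two advice above):

```python
from sklearn.feature_extraction import FeatureHasher

# A power-of-two n_features lets hashed indices map evenly onto the columns
hasher = FeatureHasher(n_features=2**10, input_type="string")
X = hasher.transform([["dog", "cat", "dog"], ["bird"]])
print(X.shape)  # (2, 1024)
```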
@@ -398,9 +398,8 @@ last document::

.. _stop_words:

Using stop words
----------------

Stop words are words like "and", "the", "him", which are presumed to be
uninformative in representing the content of a text, and which may be
@@ -431,8 +430,6 @@ identify and warn about some kinds of inconsistencies.
In *Proc. Workshop for NLP Open Source Software*.

.. _tfidf:

Tf–idf term weighting