.. _parallelism: Parallel Processing =================== .. py:currentmodule:: lenskit.parallel LensKit supports various forms of parallel execution, each with an environment variable controlling its: - :doc:`Batch operations ` using either a thread pool (on free-threaded builds of Python 3.14 and later) or Ray (with the ``lenskit[ray]`` extra installed). - Parallel model training. For most models provided by LensKit, such parallelism uses Rayon in the Rust extension module. - Parallel inference, using Rayon in the Rust extension module. - Parallel computation in the various backends (BLAS, MKL, Torch, etc.), for both training and inference. .. versionchanged:: 2026.1 LensKit no longer uses multiprocessing or PyTorch JIT parallelism, and the configuration options in the next section have been revised. .. tip:: LensKit supports multithreaded operation with free-threaded Python 3.14 or newer. You can install such a Python with ``uv``: .. code:: console uv venv -p 3.14t If your Python is not free-threaded and you want to parallelize batch inference, you will need to also install Ray, and performance will likely not be quite as good. .. _parallel-config: Configuring Parallelism ~~~~~~~~~~~~~~~~~~~~~~~ LensKit provides several controls to configure its use of parallelism, each of which has a corresponding environment variable and parameter in the ``[parallel]`` section of :file:`lenskit.toml`, defined by :class:`~lenskit.config.ParallelSettings`. The default settings are based on the number of CPUs available on your system, capped to reduce the risk of runaway processes hogging an entire machine. Due to multiple multithreading mechanisms, a precise limit is difficult to specify, but with default settings LensKit's usage will stay under about 32 CPU cores, and will rarely use that many (depending on the components used). LensKit also tries to account for operating system CPU limits when detecting the number of available cores. Component implementations and support code that wants to make sure that parallelism is properly configured should call :py:func:`ensure_parallel_init` prior to performing any possibly-parallel computation. This function resolves the parallelism configuration (including any default values) and sets appropriate thread pool limits on compute backends. The configuration settings and their default values are: .. currentmodule:: lenskit.config .. _ray: https://docs.ray.io .. envvar:: LK_NUM_THREADS The number of threads LensKit will use in its own multithreaded code. This primarily controls the size of the Rayon thread pool used by the Rust accelerator, but will also be respected by future training code built on free-threading or other thread pool mechanisms. The default is 8 or the number of available CPU cores, whichever is lower. Configuration option: :attr:`~ParallelSettings.num_threads`. .. note:: Only the k-NN models currently use this at inference time. Several models use it at training time. .. envvar:: LK_NUM_BATCH_JOBS The number of concurrent batch inference tasks. This used by the pipeline batch-run code (see :ref:`batch`) to size the inference thread pool or to limit the number of concurrently-executing Ray tasks. The default is :attr:`~ParallelSettings.num_threads` if Python is free-threaded or Ray is enabled (see :envvar:`LK_USE_RAY`), or 1 on non-free-threaded Python without Ray. Configuration option: :attr:`~ParallelSettings.num_batch_jobs`. .. envvar:: LK_USE_RAY If set to 1, use Ray_ instead of a thread pool for batch inference. Requires Ray to be installed (use the Ray extra, ``lenskit[ray]``, to depend on a compatible version of ray). Configuration option: :attr:`~ParallelSettings.use_ray`. .. envvar:: LK_NUM_BACKEND_THREADS The number of threads to be used by backend compute engines such as BLAS. The configured or derived value is passed to :func:`torch.set_num_threads` (to control PyTorch internal parallelism), and to the underlying BLAS layer (via `threadpoolctl`_). The default is to use up to 4 backend threads per training thread, depending on the capacity of the machine:: max(min(NCPUS // LK_NUM_THREADS, 4), 1) Explicitly set this variable to -1 to keep LensKit from trying to configure compute backend thread pools and leave them with their default settings. Configuration option: :attr:`~ParallelSettings.num_backend_threads`. .. note:: If the ``OPENBLAS_NUM_THREADS`` or ``MKL_NUM_THREADS`` environment variables are set, LensKit does not configure the BLAS thread pool. It still tries to configure the PyTorch intra-op thread count. .. _threadpoolctl: https://github.com/joblib/threadpoolctl Debugging Parallelism and Performance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``lenskit doctor`` CLI command inspects the configured environment, including parallelism configuration.