Skip to content

simet.services.seeding

simet.services.seeding

SeedingService

Utilities for end-to-end reproducibility across Python, NumPy, and PyTorch.

This service sets seeds for Python's random, NumPy, and PyTorch (CPU & CUDA), configures CuDNN for deterministic behavior, and applies environment variables commonly recommended for reproducible runs.

set_global_seed staticmethod

set_global_seed(seed=42)

Set global RNG seeds and enable deterministic modes where possible.

Steps performed

1) Seed Python's random, NumPy, and PyTorch (CPU + CUDA, all devices). 2) Configure CuDNN for determinism (deterministic=True, benchmark=False). 3) Set environment variables: - PYTHONHASHSEED → ensure stable hashing in Python. - CUBLAS_WORKSPACE_CONFIG=":4096:8" → enforce determinism in cuBLAS. 4) Enable PyTorch deterministic algorithms: - Prefer torch.use_deterministic_algorithms(True). - Fallbacks and warn-only mode are attempted for older versions or when unsupported ops are encountered.

Parameters:

Name Type Description Default
seed int

The seed value to apply across libraries. Defaults to 42.

42
Notes
  • Performance trade-off: Deterministic CuDNN and algorithms can slow down training/inference and increase memory usage.
  • Coverage limits: Some PyTorch ops are inherently nondeterministic or do not have deterministic implementations. In such cases, PyTorch may raise a RuntimeError. This method catches it and attempts warn_only=True when available; otherwise it logs a warning.
  • Multi-GPU: Both torch.cuda.manual_seed and torch.cuda.manual_seed_all are set to cover multiple devices.
  • Version compatibility: On very old PyTorch versions, torch.use_deterministic_algorithms may be missing; a fallback to torch.set_deterministic(True) is attempted (deprecated in newer versions).
Example

SeedingService.set_global_seed(123)

All subsequent RNG calls should now be repeatable (within limits).
Source code in simet/services/seeding.py
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
@staticmethod
def set_global_seed(seed: int = 42) -> None:
    """Set global RNG seeds and enable deterministic modes where possible.

    Steps performed:
        1) Seed Python's `random`, NumPy, and PyTorch (CPU + CUDA, all devices).
        2) Configure CuDNN for determinism (`deterministic=True`, `benchmark=False`).
        3) Set environment variables:
           - `PYTHONHASHSEED` → ensure stable hashing in Python.
           - `CUBLAS_WORKSPACE_CONFIG=":4096:8"` → enforce determinism in cuBLAS.
        4) Enable PyTorch deterministic algorithms:
           - Prefer `torch.use_deterministic_algorithms(True)`.
           - Fallbacks and warn-only mode are attempted for older versions or
             when unsupported ops are encountered.

    Args:
        seed (int): The seed value to apply across libraries. Defaults to 42.

    Notes:
        - **Performance trade-off**: Deterministic CuDNN and algorithms can
          slow down training/inference and increase memory usage.
        - **Coverage limits**: Some PyTorch ops are inherently nondeterministic
          or do not have deterministic implementations. In such cases,
          PyTorch may raise a `RuntimeError`. This method catches it and
          attempts `warn_only=True` when available; otherwise it logs a warning.
        - **Multi-GPU**: Both `torch.cuda.manual_seed` and
          `torch.cuda.manual_seed_all` are set to cover multiple devices.
        - **Version compatibility**: On very old PyTorch versions,
          `torch.use_deterministic_algorithms` may be missing; a fallback to
          `torch.set_deterministic(True)` is attempted (deprecated in newer versions).

    Example:
        >>> SeedingService.set_global_seed(123)
        >>> # All subsequent RNG calls should now be repeatable (within limits).
    """
    logger.info(f"Setting global seed to {seed}")

    # Python's built-in random
    random.seed(seed)
    logger.debug(f"Python random seed set to {seed}")

    # NumPy
    np.random.seed(seed)
    logger.debug(f"NumPy random seed set to {seed}")

    # PyTorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # for multi-GPU setups
    logger.debug(f"PyTorch random seed set to {seed}")

    # PyTorch deterministic operations (may impact performance)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    logger.debug("PyTorch set to use deterministic algorithms where possible")

    # Set environment variables for additional reproducibility
    os.environ["PYTHONHASHSEED"] = str(seed)
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    logger.debug("Environment variables for reproducibility set")

    # Enable PyTorch deterministic algorithms
    try:
        torch.use_deterministic_algorithms(True)
        logger.debug("PyTorch set to use deterministic algorithms")
    except AttributeError:
        # Fallback for older PyTorch versions
        logger.warning(
            "torch.use_deterministic_algorithms not available, falling back to torch.set_deterministic"
        )
        torch.set_deterministic(True)  # type: ignore
    except RuntimeError as e:
        # Some operations don't support deterministic mode
        logger.warning(f"Could not enable full deterministic mode: {e}")
        try:
            torch.use_deterministic_algorithms(True, warn_only=True)
            logger.debug(
                "PyTorch set to use deterministic algorithms with warnings only"
            )
        except (AttributeError, TypeError):
            logger.warning(
                "Could not set warn_only mode for deterministic algorithms"
            )
            pass

    logger.info(f"Global seed set to {seed} for reproducible results")