Analysis Driver API

Hamilton Driver wrapper for composable analysis calculations.

Overview

The chronos_lab.analysis.driver module provides the AnalysisDriver class, a wrapper around Apache Hamilton's Driver that simplifies running analysis calculations with shared configuration, caching, and execution management.

Key Features:

  • Zero-config defaults - AnalysisDriver() works out of the box
  • Flexible execution - Multithreading or multiprocessing for symbol-level parallelization
  • Persistent caching - Hamilton's cache for expensive computations

API Reference

chronos_lab.analysis.driver.AnalysisDriver

AnalysisDriver(*, enable_cache: bool = False, cache_path: str = None, local_executor_type: Optional[str] = 'synchronous', remote_executor_type: str = 'multithreading', max_parallel_tasks: int = 5, enable_telemetry: bool = False)

Manages Hamilton Driver instances for different calculation types with shared caching and execution configuration. Each calculation type gets its own Driver (built once, reused on subsequent calls). All calculations share the same cache directory for maximum efficiency.
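
The build-once, reuse-afterwards behavior described above can be pictured as a small memoizing registry. This is only an illustrative sketch: DriverRegistry, get, and the build callback are hypothetical names that do not come from chronos_lab or Hamilton, and the real class may organize this differently.

```python
from typing import Any, Callable, Dict


class DriverRegistry:
    """Hypothetical illustration of 'one driver per calculation type, built once'."""

    def __init__(self, build: Callable[[str], Any]) -> None:
        self._build = build                 # factory invoked only on a cache miss
        self._drivers: Dict[str, Any] = {}  # calculation type -> built driver
        self.build_count = 0                # counts how often the factory actually ran

    def get(self, calculation_type: str) -> Any:
        # Build lazily on the first request, then hand back the same object.
        if calculation_type not in self._drivers:
            self._drivers[calculation_type] = self._build(calculation_type)
            self.build_count += 1
        return self._drivers[calculation_type]


# Stand-in factory: a real one would assemble a Hamilton Driver here.
registry = DriverRegistry(build=lambda name: {"calculation": name})
first = registry.get("detect_anomalies")
second = registry.get("detect_anomalies")
```

The second lookup returns the object created by the first, so the factory runs once per calculation type.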

Initialize AnalysisDriver with shared configuration.

Parameters:

  • enable_cache (bool, default: False) - Enable Hamilton caching for expensive computations.
  • cache_path (str, default: None) - Directory path for cache storage. If None, HAMILTON_CACHE_PATH from settings is used; if that setting is also unset, a ValueError is raised.
  • local_executor_type (Optional[str], default: 'synchronous') - Local executor type.
  • remote_executor_type (str, default: 'multithreading') - Remote executor type for parallel processing. Options: 'multithreading' or 'multiprocessing'.
  • max_parallel_tasks (int, default: 5) - Maximum number of parallel tasks for symbol-level processing.
  • enable_telemetry (bool, default: False) - Enable Hamilton telemetry data collection.
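
The cache_path fallback described above can be written out as a short function. This is a sketch of the documented rule only, not the library's code: resolve_cache_path and settings_value are hypothetical names, the paths are placeholders, and whether the real check runs only when enable_cache is True is not stated in this reference.

```python
from typing import Optional


def resolve_cache_path(cache_path: Optional[str], settings_value: Optional[str]) -> str:
    """Mirror the documented fallback: explicit path, then setting, else ValueError."""
    if cache_path is not None:
        return cache_path
    if settings_value:
        # Stands in for the HAMILTON_CACHE_PATH value read from settings.
        return settings_value
    raise ValueError("cache_path is None and HAMILTON_CACHE_PATH is not set")


# Placeholder paths, chosen only for illustration.
explicit = resolve_cache_path("/tmp/hamilton_cache", settings_value=None)
fallback = resolve_cache_path(None, settings_value="/var/cache/hamilton")
```

With chronos_lab installed, the corresponding constructor call would pass the same values by keyword, for example enable_cache=True together with an explicit cache_path.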

detect_anomalies

detect_anomalies(ohlcv: Optional[DataFrame] = None, ohlcv_from_source: str = 'disabled', ohlcv_from_config: Dict[str, Any] = None, ohlcv_features_list: List[str] = None, use_adjusted: bool = True, isolation_forest_config: Dict[str, Any] = None, to_dataset: str = 'disabled', to_dataset_config: Dict[str, Any] = None, to_arcticdb: str = 'disabled', to_arcticdb_config: Dict[str, Any] = None, driver_config: Dict[str, Any] = None) -> Dict[str, Any]

Detect anomalies in OHLCV time series data using Isolation Forest.

Executes a Hamilton DAG that standardizes OHLCV data, computes features, applies Isolation Forest anomaly detection, and optionally persists results to datasets or ArcticDB. Supports multiple data sources (Yahoo Finance, Intrinio, ArcticDB) or direct DataFrame input.

Parameters:

  • ohlcv (Optional[DataFrame], default: None) - Pre-loaded OHLCV DataFrame with MultiIndex (date, symbol). Required if ohlcv_from_source is 'disabled'.
  • ohlcv_from_source (str, default: 'disabled') - Data source for OHLCV retrieval. Options: 'disabled' (use the ohlcv parameter), 'yfinance', 'intrinio', or 'arcticdb'.
  • ohlcv_from_config (Dict[str, Any], default: None) - Configuration dictionary passed to the data source function. Required when ohlcv_from_source is not 'disabled'.
  • ohlcv_features_list (List[str], default: None) - Feature names to compute from OHLCV data. Options: 'returns', 'volume_change', 'high_low_range', 'volatility'. When None, ['returns', 'volume_change', 'high_low_range'] is used.
  • use_adjusted (bool, default: True) - Whether to use adjusted OHLCV columns (adj_close, etc.) if available.
  • isolation_forest_config (Dict[str, Any], default: None) - Configuration dictionary for sklearn's IsolationForest. When None, {'contamination': 0.02, 'random_state': 42, 'n_estimators': 200, 'max_samples': 250} is used.
  • to_dataset (str, default: 'disabled') - Whether to save anomaly results to a dataset. Options: 'disabled' or 'enabled'.
  • to_dataset_config (Dict[str, Any], default: None) - Configuration for dataset output. When None, {'dataset_name': 'ohlcv_anomalies', 'ddb_dataset_ttl': 7} is used.
  • to_arcticdb (str, default: 'disabled') - Whether to save results to ArcticDB. Options: 'disabled' or 'enabled'.
  • to_arcticdb_config (Dict[str, Any], default: None) - Configuration for ArcticDB output. When None, {'backend': 'LMDB', 'library_name': 'analysis', 'symbol_prefix': '', 'symbol_suffix': '_ohlcv_anomaly'} is used.
  • driver_config (Dict[str, Any], default: None) - Additional configuration passed to the Hamilton Driver builder. When None, {} is used.
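
A call with a pre-loaded frame might be assembled as in the sketch below. The helper names iforest_config and run_on_frame are hypothetical. The reference does not say whether a partial isolation_forest_config is merged with the defaults or replaces them, so the sketch always passes a complete dictionary; the contamination value of 0.05 is an arbitrary example.

```python
from typing import Any, Dict, List, Optional

# Defaults as listed in the parameter descriptions above.
DOCUMENTED_IFOREST_DEFAULTS: Dict[str, Any] = {
    "contamination": 0.02,
    "random_state": 42,
    "n_estimators": 200,
    "max_samples": 250,
}


def iforest_config(**overrides: Any) -> Dict[str, Any]:
    """Return a complete config: documented defaults with selected values replaced."""
    return {**DOCUMENTED_IFOREST_DEFAULTS, **overrides}


def run_on_frame(driver: Any, ohlcv: Any,
                 features: Optional[List[str]] = None) -> Dict[str, Any]:
    """Call detect_anomalies on a pre-loaded frame; `driver` is an AnalysisDriver."""
    return driver.detect_anomalies(
        ohlcv=ohlcv,                   # DataFrame with MultiIndex (date, symbol)
        ohlcv_from_source="disabled",  # use the frame passed above, no download
        ohlcv_features_list=features or ["returns", "volume_change", "high_low_range"],
        isolation_forest_config=iforest_config(contamination=0.05),
    )
```

Passing the full dictionary keeps the result independent of how the library treats partial configurations.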

Returns:

  • Dict[str, Any] - Dictionary of execution results with keys 'analysis_result' (DataFrame with anomaly scores and flags), 'analysis_to_dataset' (dataset save status), and 'analysis_to_arcticdb' (ArcticDB save status).
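
The returned dictionary can be split into its three documented entries as sketched here. The helper unpack_result is a hypothetical name. The reference does not state whether a key is present when the corresponding output is disabled, so the sketch uses .get and treats a missing key as None; the example values are placeholders, not real output.

```python
from typing import Any, Dict, Tuple


def unpack_result(result: Dict[str, Any]) -> Tuple[Any, Any, Any]:
    """Split the result dictionary into its three documented entries."""
    return (
        result.get("analysis_result"),
        result.get("analysis_to_dataset"),
        result.get("analysis_to_arcticdb"),
    )


# Placeholder values only; a real run returns a DataFrame and save statuses.
example = {"analysis_result": "<DataFrame>", "analysis_to_dataset": None}
frame, dataset_status, arcticdb_status = unpack_result(example)
```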

Raises:

  • ValueError - If ohlcv is None while ohlcv_from_source is 'disabled', if ohlcv_from_source is not a supported source, or if ohlcv_from_config is missing when ohlcv_from_source is not 'disabled'.
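
The three error conditions can be reproduced as a pre-flight check, which is useful for failing early before a driver is built. This sketch restates the documented rules and is not the library's own validation; check_inputs and fails are hypothetical names, and the order in which the real method tests these conditions is an assumption.

```python
from typing import Any, Dict, Optional

# Source names taken from the ohlcv_from_source description above.
SUPPORTED_SOURCES = {"disabled", "yfinance", "intrinio", "arcticdb"}


def check_inputs(ohlcv: Any, ohlcv_from_source: str = "disabled",
                 ohlcv_from_config: Optional[Dict[str, Any]] = None) -> None:
    """Raise ValueError in the three documented cases; return None otherwise."""
    if ohlcv_from_source not in SUPPORTED_SOURCES:
        raise ValueError(f"unsupported ohlcv_from_source: {ohlcv_from_source!r}")
    if ohlcv_from_source == "disabled":
        if ohlcv is None:
            raise ValueError("provide ohlcv or set ohlcv_from_source")
    elif ohlcv_from_config is None:
        raise ValueError("ohlcv_from_config is required for this source")


def fails(**kwargs: Any) -> bool:
    """Report whether check_inputs rejects the given arguments."""
    try:
        check_inputs(**kwargs)
    except ValueError:
        return True
    return False
```

For example, fails(ohlcv=None, ohlcv_from_source="yfinance") is True because no ohlcv_from_config accompanies the source.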