ArcticDB Wrapper API

Low-level wrapper for direct ArcticDB operations.

Advanced API

This is a low-level API intended for advanced use cases. For most scenarios, use the high-level functions:

  • ohlcv_from_arcticdb() for reading data
  • ohlcv_to_arcticdb() for writing data

Overview

The chronos_lab.arcdb module provides the ArcDB class for direct control over ArcticDB operations, custom batch processing, and access to underlying ArcticDB APIs.

When to Use This API

  • Custom batch processing workflows
  • Fine-grained control over versioning
  • Direct access to ArcticDB Library API
  • Advanced query operations

When NOT to Use This API

  • Standard data fetching (use ohlcv_from_arcticdb())
  • Standard data storage (use ohlcv_to_arcticdb())
  • Simple read/write operations

Class

chronos_lab.arcdb.ArcDB

ArcDB(*, bucket_name=None, local_path=None, library_name)

Low-level wrapper for ArcticDB time series database operations.

Provides batch storage and retrieval operations with support for multiple storage backends (local LMDB, AWS S3, in-memory). Manages connection lifecycle and provides access to underlying ArcticDB API objects for advanced operations.

NOTE: This is a low-level class. For typical use cases, prefer high-level functions in chronos_lab.sources and chronos_lab.storage modules.

Attributes:

  • _ac: Arctic instance (connection to the storage backend). For advanced operations, see https://docs.arcticdb.io/dev/api/arctic/
  • _lib: Library instance (the specific library within Arctic). For advanced operations, see https://docs.arcticdb.io/dev/api/library/
  • _bucket_name: S3 bucket name (if using the S3 backend)
  • _local_path: Local filesystem path (if using the LMDB backend)
  • _library_name: Name of the ArcticDB library

Examples:

Basic usage with local storage:

>>> from chronos_lab.arcdb import ArcDB
>>> import pandas as pd
>>>
>>> # Initialize connection
>>> ac = ArcDB(library_name='my_data')
>>>
>>> # Store data
>>> data = {'AAPL': aapl_df, 'MSFT': msft_df}
>>> result = ac.batch_store(data, mode='write')
>>>
>>> # Read data
>>> result = ac.batch_read(['AAPL', 'MSFT'])
>>> df = result['payload']

AWS S3 backend:

>>> # Requires AWS CLI configuration and boto3
>>> # export AWS_PROFILE=my-profile (if using named profiles)
>>> ac = ArcDB(
...     library_name='my_data',
...     bucket_name='my-timeseries-bucket'
... )

Advanced API access:

>>> ac = ArcDB(library_name='my_data')
>>> # List all symbols
>>> symbols = ac._lib.list_symbols()
>>> # Get symbol metadata
>>> metadata = ac._lib.get_info('AAPL')
>>> # Direct read with query
>>> df = ac._lib.read('AAPL', date_range=(start_date, end_date)).data

Initialize ArcticDB connection with specified backend.

Establishes connection to ArcticDB using configuration from ~/.chronos_lab/.env or provided parameters. Automatically creates library if it doesn't exist.

Parameters:

  • bucket_name (default: None): AWS S3 bucket name for the S3 backend. If None, uses ARCTICDB_S3_BUCKET from configuration. Takes precedence over local_path.
  • local_path (default: None): Local filesystem path for the LMDB backend. If None, uses ARCTICDB_LOCAL_PATH from configuration. Ignored if bucket_name is set.
  • library_name (required): Name of the ArcticDB library to use or create.

Raises:

  • Exception: If connection initialization fails (logged and re-raised).

Note
  • Backend priority: S3 > Local LMDB > In-memory (see the configuration sketch after this list)
  • S3 backend requires boto3 and AWS CLI configuration
  • Local path is created automatically if it doesn't exist
  • In-memory backend used if neither S3 nor local path configured
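
Backend selection can be sketched concretely. The ~/.chronos_lab/.env entries below use the variable names from the parameter descriptions above; the values, paths, bucket name, and library names in the snippet are placeholders, not part of the API:

    ARCTICDB_S3_BUCKET=my-timeseries-bucket
    ARCTICDB_LOCAL_PATH=/home/me/.chronos_lab/arcticdb

>>> from chronos_lab.arcdb import ArcDB
>>>
>>> # Explicit local LMDB backend; the path is created if it doesn't exist
>>> ac_local = ArcDB(library_name='my_data', local_path='/tmp/arcticdb_demo')
>>>
>>> # Explicit S3 backend; bucket_name takes precedence if local_path is also set
>>> ac_s3 = ArcDB(library_name='my_data', bucket_name='my-timeseries-bucket')
>>>
>>> # Neither passed here nor set in ~/.chronos_lab/.env: falls back to the in-memory backend
>>> ac_mem = ArcDB(library_name='scratch')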

batch_store

batch_store(data_dict, mode='append', **kwargs)

Store multiple symbols in batch with write or append mode.

Writes or appends DataFrames for multiple symbols in a single batch operation. Each symbol is stored as a separate versioned entity in ArcticDB.

Parameters:

  • data_dict (required): Dictionary mapping symbol names (str) to pandas DataFrames. Each DataFrame should have a DatetimeIndex.
  • mode (default: 'append'): Storage mode, either 'append' (add new rows to existing data) or 'write' (overwrite existing data completely).
  • **kwargs (default: {}): Additional keyword arguments passed to ArcticDB write/append operations (e.g., prune_previous_versions=True).

Returns:

Dictionary with status information:

  • 'statusCode': 0 on complete success, 1 if some symbols failed, -1 on complete failure
  • 'skipped_symbols': List of symbols that failed to store

Examples:

Write mode (overwrite):

>>> ac = ArcDB(library_name='yfinance')
>>> data = {
...     'AAPL': aapl_df,
...     'MSFT': msft_df,
...     'GOOGL': googl_df
... }
>>> result = ac.batch_store(
...     data,
...     mode='write',
...     prune_previous_versions=True
... )
>>> print(f"Status: {result['statusCode']}")

Append mode (add new data):

>>> new_data = {'AAPL': new_aapl_df, 'MSFT': new_msft_df}
>>> result = ac.batch_store(new_data, mode='append')
>>> if result['skipped_symbols']:
...     print(f"Failed: {result['skipped_symbols']}")
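
Because each symbol is stored as a separate versioned entity, the underlying Library handle can be used to inspect and read back earlier versions after repeated writes. A minimal sketch; list_versions and the as_of read argument belong to the ArcticDB Library API (see the link under Attributes), and the DataFrame names are illustrative:

>>> ac = ArcDB(library_name='yfinance')
>>> ac.batch_store({'AAPL': aapl_df_v1}, mode='write')
>>> ac.batch_store({'AAPL': aapl_df_v2}, mode='write')   # creates a new version
>>>
>>> # Inspect versions and read a previous one via the underlying Library
>>> versions = ac._lib.list_versions('AAPL')
>>> previous = ac._lib.read('AAPL', as_of=0).data
>>>
>>> # Keep only the latest version when overwriting
>>> ac.batch_store({'AAPL': aapl_df_v3}, mode='write', prune_previous_versions=True)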

batch_read

batch_read(symbol_list, qb_join='inner', **kwargs)

Read and join multiple symbols in a single batch operation.

Reads multiple symbols and joins them into a single DataFrame using ArcticDB's batch read and join functionality. Supports date range filtering and column selection via kwargs.

Parameters:

  • symbol_list (required): List of symbol names (str) to read.
  • qb_join (default: 'inner'): Join strategy for combining symbols, either 'inner' or 'outer'. An inner join includes only dates present in all symbols; an outer join includes all dates, with NaN for missing values.
  • **kwargs (default: {}): Additional keyword arguments passed to the ArcticDB ReadRequest, e.g. date_range (tuple of (start_date, end_date) for filtering) and columns (list of column names to retrieve).

Returns:

Dictionary with read results:

  • 'statusCode': 0 on success, -1 on failure
  • 'payload': Combined DataFrame with all symbols and a 'symbol' column, or None on error

Examples:

Basic batch read:

>>> ac = ArcDB(library_name='yfinance')
>>> result = ac.batch_read(['AAPL', 'MSFT', 'GOOGL'])
>>> if result['statusCode'] == 0:
...     df = result['payload']
...     print(df.head())

Read with date range:

>>> from datetime import datetime
>>> result = ac.batch_read(
...     symbol_list=['AAPL', 'MSFT'],
...     qb_join='outer',
...     date_range=(
...         datetime(2024, 1, 1),
...         datetime(2024, 12, 31)
...     )
... )
>>> df = result['payload']

Read specific columns:

>>> result = ac.batch_read(
...     symbol_list=['AAPL', 'MSFT', 'GOOGL'],
...     columns=['close', 'volume']
... )

Note
  • Returns concatenated DataFrame with 'symbol' column for identification
  • Inner join is more restrictive; use outer join for comprehensive data (see the sketch after this note)
  • Date range filtering is applied before join operation
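
The effect of the two join strategies is easiest to see with two symbols whose date ranges only partially overlap. A minimal sketch; the library name, symbol names, and toy DataFrames are illustrative, and the exact shape of the combined payload follows the description above:

>>> import pandas as pd
>>> ac = ArcDB(library_name='join_demo')
>>> a = pd.DataFrame({'close': range(5)}, index=pd.date_range('2024-01-01', periods=5, freq='D'))
>>> b = pd.DataFrame({'close': range(5)}, index=pd.date_range('2024-01-03', periods=5, freq='D'))
>>> ac.batch_store({'SYM_A': a, 'SYM_B': b}, mode='write')
>>>
>>> # Inner join: only 2024-01-03 to 2024-01-05, the dates present in both symbols
>>> inner = ac.batch_read(['SYM_A', 'SYM_B'], qb_join='inner')['payload']
>>>
>>> # Outer join: all dates from 2024-01-01 to 2024-01-07, with NaN where a symbol has no row
>>> outer = ac.batch_read(['SYM_A', 'SYM_B'], qb_join='outer')['payload']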

batch_update

batch_update(data_dict, **kwargs)

Update existing symbols in batch using concurrent operations.

Updates multiple existing symbols concurrently using ThreadPoolExecutor for improved performance. Use this for modifying existing data ranges.

Parameters:

  • data_dict (required): Dictionary mapping symbol names (str) to pandas DataFrames. Symbols must already exist in the library.
  • **kwargs (default: {}): Additional keyword arguments passed to the ArcticDB update operation.

Returns:

Dictionary with status information:

  • 'statusCode': 0 on complete success, 1 if some symbols failed, -1 on complete failure
  • 'skipped_symbols': List of symbols that failed to update

Examples:

Update existing data:

>>> ac = ArcDB(library_name='yfinance')
>>> updated_data = {
...     'AAPL': corrected_aapl_df,
...     'MSFT': corrected_msft_df
... }
>>> result = ac.batch_update(updated_data)
>>> if result['statusCode'] == 0:
...     print("All symbols updated successfully")
... else:
...     print(f"Failed symbols: {result['skipped_symbols']}")

Note
  • Symbols must exist in the library before updating
  • Updates are performed concurrently using ThreadPoolExecutor
  • For appending new data, use batch_store(mode='append') instead (see the sketch below)
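
The practical difference is which rows end up changed. A minimal sketch; the symbol and DataFrame names are illustrative, and the exact overlap behaviour is governed by ArcticDB's update and append semantics (typically, update replaces rows in the date range covered by the new DataFrame, while append adds rows after the existing data):

>>> ac = ArcDB(library_name='yfinance')
>>>
>>> # Correct rows inside an already-stored date range -> batch_update
>>> result = ac.batch_update({'AAPL': corrected_january_df})
>>>
>>> # Add rows that start after the last stored row -> batch_store in append mode
>>> result = ac.batch_store({'AAPL': february_df}, mode='append')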