ArcticDB Wrapper API¶
Low-level wrapper for direct ArcticDB operations.
Advanced API
This is a low-level API intended for advanced use cases. For most scenarios, use the high-level functions:
- ohlcv_from_arcticdb() for reading data
- ohlcv_to_arcticdb() for writing data
Overview¶
The chronos_lab.arcdb module provides the ArcDB class for direct control over ArcticDB operations, custom batch processing, and access to underlying ArcticDB APIs.
When to Use This API¶
- Custom batch processing workflows
- Fine-grained control over versioning
- Direct access to ArcticDB Library API
- Advanced query operations
When NOT to Use This API¶
- Standard data fetching (use ohlcv_from_arcticdb())
- Standard data storage (use ohlcv_to_arcticdb())
- Simple read/write operations
Class¶
chronos_lab.arcdb.ArcDB ¶
Low-level wrapper for ArcticDB time series database operations.
Provides batch storage and retrieval operations with support for multiple storage backends (local LMDB, AWS S3, in-memory). Manages connection lifecycle and provides access to underlying ArcticDB API objects for advanced operations.
NOTE: This is a low-level class. For typical use cases, prefer high-level functions in chronos_lab.sources and chronos_lab.storage modules.
Attributes:

| Name | Type | Description |
|---|---|---|
| `_ac` | `Arctic` | Arctic instance (connection to the storage backend). For advanced operations, see https://docs.arcticdb.io/dev/api/arctic/ |
| `_lib` | `Library` | Library instance (a specific library within Arctic). For advanced operations, see https://docs.arcticdb.io/dev/api/library/ |
| `_bucket_name` | | S3 bucket name (if using the S3 backend) |
| `_local_path` | | Local filesystem path (if using the LMDB backend) |
| `_library_name` | | Name of the ArcticDB library |
Examples:
Basic usage with local storage:

>>> from chronos_lab.arcdb import ArcDB
>>> import pandas as pd
>>>
>>> # Initialize connection
>>> ac = ArcDB(library_name='my_data')
>>>
>>> # Store data
>>> data = {'AAPL': aapl_df, 'MSFT': msft_df}
>>> result = ac.batch_store(data, mode='write')
>>>
>>> # Read data
>>> result = ac.batch_read(['AAPL', 'MSFT'])
>>> df = result['payload']

AWS S3 backend:

>>> # Requires AWS CLI configuration and boto3
>>> # export AWS_PROFILE=my-profile (if using named profiles)
>>> ac = ArcDB(
...     library_name='my_data',
...     bucket_name='my-timeseries-bucket'
... )

Advanced API access:

>>> ac = ArcDB(library_name='my_data')
>>> # List all symbols
>>> symbols = ac._lib.list_symbols()
>>> # Get symbol metadata
>>> metadata = ac._lib.get_info('AAPL')
>>> # Direct read with query
>>> df = ac._lib.read('AAPL', date_range=(start_date, end_date)).data
Initialize ArcticDB connection with specified backend.
Establishes connection to ArcticDB using configuration from ~/.chronos_lab/.env or provided parameters. Automatically creates library if it doesn't exist.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `bucket_name` | | AWS S3 bucket name for the S3 backend. If None, uses ARCTICDB_S3_BUCKET from configuration. Takes precedence over local_path. | `None` |
| `local_path` | | Local filesystem path for the LMDB backend. If None, uses ARCTICDB_LOCAL_PATH from configuration. Ignored if bucket_name is set. | `None` |
| `library_name` | | Name of the ArcticDB library to use or create. | *required* |
Raises:

| Type | Description |
|---|---|
| `Exception` | If connection initialization fails (logged and re-raised). |
Note
- Backend priority: S3 > Local LMDB > In-memory
- S3 backend requires boto3 and AWS CLI configuration
- Local path is created automatically if it doesn't exist
- In-memory backend used if neither S3 nor local path configured
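The backend-priority rules above can be exercised directly through the constructor parameters documented in this section. A minimal sketch; the bucket name and local path below are placeholders, not values shipped with the package:

>>> from chronos_lab.arcdb import ArcDB
>>>
>>> # S3 backend: bucket_name takes precedence over local_path
>>> ac_s3 = ArcDB(library_name='my_data', bucket_name='my-timeseries-bucket')
>>>
>>> # Local LMDB backend: used when no bucket is given; the path is created if missing
>>> ac_local = ArcDB(library_name='my_data', local_path='/tmp/arcticdb')
>>>
>>> # In-memory backend: the fallback when neither S3 nor a local path is configured
>>> # (no arguments and no ARCTICDB_* settings in ~/.chronos_lab/.env)
>>> ac_mem = ArcDB(library_name='my_data')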
batch_store ¶
Store multiple symbols in batch with write or append mode.
Writes or appends DataFrames for multiple symbols in a single batch operation. Each symbol is stored as a separate versioned entity in ArcticDB.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `data_dict` | | Dictionary mapping symbol names (str) to pandas DataFrames. Each DataFrame should have a DatetimeIndex. | *required* |
| `mode` | | Storage mode, either 'append' (default) or 'write'. 'append' adds new rows to existing data; 'write' overwrites existing data completely. | `'append'` |
| `**kwargs` | | Additional keyword arguments passed to ArcticDB write/append operations (e.g., prune_previous_versions=True). | `{}` |

Returns:

| Type | Description |
|---|---|
| `dict` | Dictionary with status information: 'statusCode' is 0 on complete success, 1 if some symbols failed, -1 on complete failure; 'skipped_symbols' is the list of symbols that failed to store. |
Examples:
Write mode (overwrite):

>>> ac = ArcDB(library_name='yfinance')
>>> data = {
...     'AAPL': aapl_df,
...     'MSFT': msft_df,
...     'GOOGL': googl_df
... }
>>> result = ac.batch_store(
...     data,
...     mode='write',
...     prune_previous_versions=True
... )
>>> print(f"Status: {result['statusCode']}")

Append mode (add new data):

>>> new_data = {'AAPL': new_aapl_df, 'MSFT': new_msft_df}
>>> result = ac.batch_store(new_data, mode='append')
>>> if result['skipped_symbols']:
...     print(f"Failed: {result['skipped_symbols']}")
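Because the return dictionary distinguishes partial from complete failure, one possible follow-up pattern (a sketch, not part of the package) is to retry only the symbols reported in 'skipped_symbols':

>>> result = ac.batch_store(data, mode='write')
>>> if result['statusCode'] == -1:
...     raise RuntimeError("batch_store failed for all symbols")
... elif result['statusCode'] == 1:
...     # Partial failure: retry only the symbols that were skipped
...     retry = {sym: data[sym] for sym in result['skipped_symbols']}
...     result = ac.batch_store(retry, mode='write')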
batch_read ¶
Read and join multiple symbols in batch operation.
Reads multiple symbols and joins them into a single DataFrame using ArcticDB's batch read and join functionality. Supports date range filtering and column selection via kwargs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `symbol_list` | | List of symbol names (str) to read. | *required* |
| `qb_join` | | Join strategy for combining symbols, either 'inner' (default) or 'outer'. Inner join includes only dates present in all symbols; outer join includes all dates, with NaN for missing values. | `'inner'` |
| `**kwargs` | | Additional keyword arguments passed to ArcticDB ReadRequest: date_range (tuple of (start_date, end_date) for filtering) and columns (list of column names to retrieve). | `{}` |

Returns:

| Type | Description |
|---|---|
| `dict` | Dictionary with read results: 'statusCode' is 0 on success, -1 on failure; 'payload' is the combined DataFrame with all symbols and a 'symbol' column, or None on error. |
Examples:
Basic batch read:

>>> ac = ArcDB(library_name='yfinance')
>>> result = ac.batch_read(['AAPL', 'MSFT', 'GOOGL'])
>>> if result['statusCode'] == 0:
...     df = result['payload']
...     print(df.head())

Read with date range:

>>> from datetime import datetime
>>> result = ac.batch_read(
...     symbol_list=['AAPL', 'MSFT'],
...     qb_join='outer',
...     date_range=(
...         datetime(2024, 1, 1),
...         datetime(2024, 12, 31)
...     )
... )
>>> df = result['payload']

Read specific columns:

>>> result = ac.batch_read(
...     symbol_list=['AAPL', 'MSFT', 'GOOGL'],
...     columns=['close', 'volume']
... )
Note
- Returns concatenated DataFrame with 'symbol' column for identification
- Inner join is more restrictive; use outer join for comprehensive data
- Date range filtering is applied before join operation
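To see the practical difference between the two join strategies, a quick sketch (symbols and library name are illustrative):

>>> ac = ArcDB(library_name='yfinance')
>>> inner = ac.batch_read(['AAPL', 'MSFT'], qb_join='inner')['payload']
>>> outer = ac.batch_read(['AAPL', 'MSFT'], qb_join='outer')['payload']
>>> # The outer result keeps dates missing from either symbol, so it is never shorter
>>> print(len(inner), len(outer))
>>> # Rows present only in the outer result carry NaN for the missing values
>>> print(outer.isna().sum())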
batch_update ¶
Update existing symbols in batch using concurrent operations.
Updates multiple existing symbols concurrently using ThreadPoolExecutor for improved performance. Use this for modifying existing data ranges.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `data_dict` | | Dictionary mapping symbol names (str) to pandas DataFrames. Symbols must already exist in the library. | *required* |
| `**kwargs` | | Additional keyword arguments passed to the ArcticDB update operation. | `{}` |

Returns:

| Type | Description |
|---|---|
| `dict` | Dictionary with status information: 'statusCode' is 0 on complete success, 1 if some symbols failed, -1 on complete failure; 'skipped_symbols' is the list of symbols that failed to update. |
Examples:
Update existing data:

>>> ac = ArcDB(library_name='yfinance')
>>> updated_data = {
...     'AAPL': corrected_aapl_df,
...     'MSFT': corrected_msft_df
... }
>>> result = ac.batch_update(updated_data)
>>> if result['statusCode'] == 0:
...     print("All symbols updated successfully")
... else:
...     print(f"Failed symbols: {result['skipped_symbols']}")
Note
- Symbols must exist in the library before updating
- Updates are performed concurrently using ThreadPoolExecutor
- For appending new data, use batch_store(mode='append') instead
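A short sketch of the distinction drawn in this note (the DataFrames are placeholders): batch_update corrects rows inside the stored date range, while batch_store(mode='append') adds rows beyond it.

>>> ac = ArcDB(library_name='yfinance')
>>>
>>> # Correct values inside the already-stored date range
>>> result = ac.batch_update({'AAPL': corrected_aapl_df})
>>>
>>> # Add rows after the end of the stored range
>>> result = ac.batch_store({'AAPL': new_aapl_df}, mode='append')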