Data Sources API
High-level functions for fetching OHLCV time series data from multiple sources.
Overview
The chronos_lab.sources module provides unified interfaces for fetching data from:
- Yahoo Finance: Free market data via yfinance
- Intrinio: Institutional-quality financial data (requires API subscription)
- ArcticDB: Previously stored time series data
All functions return data in consistent formats with UTC timezone-aware timestamps.
Functions
chronos_lab.sources.ohlcv_from_yfinance
ohlcv_from_yfinance(*, symbols: List[str], period: Optional[str] = None, start_date: Optional[str | datetime] = None, end_date: Optional[str | datetime] = None, interval: Optional[str] = '1d', output_dict: Optional[bool] = False, **kwargs) -> Dict[str, pd.DataFrame] | pd.DataFrame | None
Download OHLCV data from Yahoo Finance.
Fetches historical or intraday price data for multiple symbols using the yfinance library. Data is returned with UTC timezone-aware timestamps in a consistent format suitable for analysis or storage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbols` | `List[str]` | List of ticker symbols to download (max 100 symbols per call). | required |
| `period` | `Optional[str]` | Relative time period (e.g., '1d', '5d', '1mo', '3mo', '1y', 'max'). Mutually exclusive with start_date/end_date. | `None` |
| `start_date` | `Optional[str \| datetime]` | Start date as 'YYYY-MM-DD' string or datetime object (inclusive). Mutually exclusive with period. | `None` |
| `end_date` | `Optional[str \| datetime]` | End date as 'YYYY-MM-DD' string or datetime object (exclusive). Defaults to current time if start_date is provided without end_date. | `None` |
| `interval` | `Optional[str]` | Data frequency interval. Intraday options: '1m', '2m', '5m', '15m', '30m', '60m', '90m', '1h'. Daily and longer: '1d', '5d', '1wk', '1mo', '3mo'. Defaults to '1d' (daily). | `'1d'` |
| `output_dict` | `Optional[bool]` | If True, returns dict mapping symbols to DataFrames. If False, returns single MultiIndex DataFrame with (date, symbol) levels. Defaults to False. | `False` |
| `**kwargs` | | Additional keyword arguments passed to yfinance.download(). | `{}` |
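
Because `period` and `start_date`/`end_date` are documented as mutually exclusive, it can help to check the combination before making a call. The sketch below is a hypothetical pre-flight helper (`check_date_args` is not part of chronos_lab). How the library itself reacts to conflicting arguments is not stated here; given the "logged but not raised" convention it may simply return None.

```python
from datetime import datetime
from typing import Optional, Union

DateLike = Union[str, datetime]


def check_date_args(
    period: Optional[str] = None,
    start_date: Optional[DateLike] = None,
    end_date: Optional[DateLike] = None,
) -> None:
    """Raise ValueError if both a relative period and an explicit range are given.

    Hypothetical helper for illustration; not part of chronos_lab.
    """
    has_range = start_date is not None or end_date is not None
    if period is not None and has_range:
        raise ValueError("pass either period or start_date/end_date, not both")


# Valid combinations pass silently.
check_date_args(period="1y")
check_date_args(start_date="2024-01-01", end_date="2024-06-01")
```

Calling `check_date_args(period="1y", start_date="2024-01-01")` raises `ValueError`, which surfaces the mistake earlier than inspecting a None return value.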
Returns:
| Type | Description |
|---|---|
| `Dict[str, DataFrame]` | If output_dict=True: Dictionary mapping symbol strings to DataFrames, where each DataFrame has DatetimeIndex and columns ['open', 'high', 'low', 'close', 'volume', 'symbol', 'interval' (intraday only)]. |
| `DataFrame` | If output_dict=False: Single DataFrame with MultiIndex (date, symbol) and same columns. |
| `None` | Returned if no data could be retrieved or on error. |
Raises:
| Type | Description |
|---|---|
| None | Errors are logged but not raised. Check return value for None. |
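
Since failures surface only as a None return value, callers that prefer exceptions can wrap the call. This is a minimal sketch: `fetch_or_raise` is a hypothetical wrapper, and `fake_fetch` is a stand-in for a real source function so the snippet runs without network access.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_or_raise(fetch: Callable[..., Optional[T]], **kwargs) -> T:
    """Call a fetch function and convert a None result into a RuntimeError."""
    result = fetch(**kwargs)
    if result is None:
        name = getattr(fetch, "__name__", "fetch")
        raise RuntimeError(f"{name} returned None; check the logs for details")
    return result


# Stand-in for a source function: returns None when given no symbols.
def fake_fetch(*, symbols, period=None):
    return None if not symbols else {s: [] for s in symbols}


data = fetch_or_raise(fake_fetch, symbols=["AAPL"], period="1y")
```

With the real function, the call would look like `fetch_or_raise(ohlcv_from_yfinance, symbols=['AAPL'], period='1y')`.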
Examples:
Basic daily data fetch:

    >>> prices = ohlcv_from_yfinance(
    ...     symbols=['AAPL', 'MSFT', 'GOOGL'],
    ...     period='1y'
    ... )
    >>> print(prices.head())
    >>> # Returns MultiIndex DataFrame with (date, symbol) levels

Fetch specific date range:

    >>> prices = ohlcv_from_yfinance(
    ...     symbols=['AAPL', 'MSFT'],
    ...     start_date='2024-01-01',
    ...     end_date='2024-12-31',
    ...     interval='1d'
    ... )

Get 5-minute intraday bars:

    >>> intraday = ohlcv_from_yfinance(
    ...     symbols=['SPY', 'QQQ'],
    ...     period='1d',
    ...     interval='5m'
    ... )
    >>> # Includes 'interval' column for intraday data

Get data as dictionary by symbol:

    >>> prices_dict = ohlcv_from_yfinance(
    ...     symbols=['AAPL', 'MSFT'],
    ...     period='6mo',
    ...     output_dict=True
    ... )
    >>> aapl_df = prices_dict['AAPL']
    >>> # Work with individual symbol DataFrames

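
Because `end_date` is documented as exclusive, a range ending at '2024-12-31' would not include bars for 31 December itself. The sketch below shows one way to compute the value to pass when the last day should be included; `end_date_including` is a hypothetical helper and assumes daily granularity.

```python
from datetime import date, timedelta


def end_date_including(last_day: str) -> str:
    """Return the exclusive end_date string that keeps `last_day` in the range."""
    return (date.fromisoformat(last_day) + timedelta(days=1)).isoformat()


print(end_date_including("2024-12-31"))  # 2025-01-01
```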
Note
- Yahoo Finance has rate limits; avoid excessive requests
- Intraday data availability is limited (typically last 7-60 days depending on interval)
- Max 100 symbols per call to avoid timeout issues
- All timestamps are converted to UTC timezone
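
For universes larger than the 100-symbol limit, the list can be split into batches and fetched one call at a time. A minimal sketch, assuming a hypothetical `chunk_symbols` helper (not part of chronos_lab):

```python
from typing import Iterator, List

MAX_SYMBOLS_PER_CALL = 100  # limit stated in the note above


def chunk_symbols(symbols: List[str], size: int = MAX_SYMBOLS_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of `symbols`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(symbols), size):
        yield symbols[start:start + size]


# Placeholder tickers, used only to show the batch sizes.
batches = list(chunk_symbols([f"SYM{i}" for i in range(250)]))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch could then be passed as `symbols=batch`, checking every result for None and pausing between calls to stay within Yahoo Finance rate limits.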
chronos_lab.sources.ohlcv_from_intrinio
ohlcv_from_intrinio(*, symbols: List[str], period: Optional[str] = None, start_date: Optional[str | datetime] = None, end_date: Optional[str | datetime] = None, interval: Optional[Literal['1m', '5m', '10m', '15m', '30m', '60m', '1h', 'daily', 'weekly', 'monthly', 'quarterly', 'yearly']] = 'daily', api_key: Optional[str] = None, output_dict: Optional[bool] = False, **kwargs) -> Dict[str, pd.DataFrame] | pd.DataFrame | None
Download OHLCV data from the Intrinio API.
Fetches institutional-quality historical or intraday price data using the Intrinio financial data platform. Requires an active Intrinio API subscription. Data is returned with UTC timezone-aware timestamps in a consistent format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbols` | `List[str]` | List of security identifiers (ticker symbols, CUSIPs, or Intrinio IDs). | required |
| `period` | `Optional[str]` | Relative time period (e.g., '1d', '5d', '1mo', '1y'). Mutually exclusive with start_date/end_date. | `None` |
| `start_date` | `Optional[str \| datetime]` | Start date as 'YYYY-MM-DD' string or datetime object (inclusive). Mutually exclusive with period. | `None` |
| `end_date` | `Optional[str \| datetime]` | End date as 'YYYY-MM-DD' string or datetime object (exclusive). Defaults to current time if start_date is provided without end_date. | `None` |
| `interval` | `Optional[Literal['1m', '5m', '10m', '15m', '30m', '60m', '1h', 'daily', 'weekly', 'monthly', 'quarterly', 'yearly']]` | Data frequency interval. Intraday options: '1m', '5m', '10m', '15m', '30m', '60m', '1h'. Historical options: 'daily', 'weekly', 'monthly', 'quarterly', 'yearly'. Defaults to 'daily'. | `'daily'` |
| `api_key` | `Optional[str]` | Intrinio API key. If None, reads from INTRINIO_API_KEY in ~/.chronos_lab/.env configuration file. | `None` |
| `output_dict` | `Optional[bool]` | If True, returns dict mapping symbols to DataFrames. If False, returns single MultiIndex DataFrame with (date, id) levels. Defaults to False. | `False` |
| `**kwargs` | | Additional keyword arguments passed to Intrinio SDK (e.g., frequency, sort_order). | `{}` |
Returns:
| Type | Description |
|---|---|
| `Dict[str, DataFrame]` | If output_dict=True: Dictionary mapping ticker symbols to DataFrames, where each DataFrame has DatetimeIndex and columns ['id', 'open', 'high', 'low', 'close', 'volume', 'interval' (intraday only), 'symbol' (if ticker differs from id)]. |
| `DataFrame` | If output_dict=False: Single DataFrame with MultiIndex (date, id) and same columns. |
| `None` | Returned if no data could be retrieved or on error. |
Raises:
| Type | Description |
|---|---|
| None | Errors are logged but not raised. Check return value for None. |
Examples:
Fetch daily data with API key:

    >>> prices = ohlcv_from_intrinio(
    ...     symbols=['AAPL', 'MSFT'],
    ...     start_date='2024-01-01',
    ...     interval='daily',
    ...     api_key='your_api_key_here'
    ... )
    >>> # Returns MultiIndex DataFrame with (date, id) levels

Fetch data using configuration file:

    >>> # First set INTRINIO_API_KEY in ~/.chronos_lab/.env
    >>> prices = ohlcv_from_intrinio(
    ...     symbols=['AAPL', 'MSFT'],
    ...     period='1y',
    ...     interval='daily'
    ... )

Get intraday 5-minute bars:

    >>> intraday = ohlcv_from_intrinio(
    ...     symbols=['SPY'],
    ...     start_date='2024-01-15',
    ...     end_date='2024-01-16',
    ...     interval='5m'
    ... )
    >>> # Includes 'interval' column for intraday data

Get data as dictionary by symbol:

    >>> prices_dict = ohlcv_from_intrinio(
    ...     symbols=['AAPL', 'MSFT', 'GOOGL'],
    ...     period='3mo',
    ...     interval='daily',
    ...     output_dict=True
    ... )
    >>> aapl_df = prices_dict['AAPL']

Note
- Requires active Intrinio subscription with appropriate data access
- API rate limits apply based on subscription tier
- Intraday data availability depends on subscription level
- All timestamps are converted to UTC timezone
- Symbol identifiers can be tickers, CUSIPs, or Intrinio composite IDs
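
When `api_key` is omitted, the key is read from INTRINIO_API_KEY in ~/.chronos_lab/.env. To confirm that the file actually defines the key before making a call, something like the following may be enough. It is a sketch that assumes plain KEY=VALUE lines; `find_env_value` is hypothetical, the sample values are made up, and the library's own loader may treat quoting and comments differently.

```python
from typing import Optional


def find_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value assigned to `key` in KEY=VALUE text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"")
    return None


# Made-up file contents for illustration only.
sample = (
    "# chronos_lab settings\n"
    "INTRINIO_API_KEY='abc123'\n"
    "ARCTICDB_DEFAULT_LIBRARY_NAME=yfinance\n"
)
api_key = find_env_value(sample, "INTRINIO_API_KEY")
```

In practice the text would come from the real file, for example `(Path.home() / ".chronos_lab" / ".env").read_text()`, and a None result means the key should be passed explicitly via `api_key`.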
chronos_lab.sources.ohlcv_from_arcticdb
ohlcv_from_arcticdb(symbols: List[str], start_date: Optional[Union[str, Timestamp]] = None, end_date: Optional[Union[str, Timestamp]] = None, period: Optional[str] = None, columns: Optional[List[str]] = None, library_name: Optional[str] = None, pivot: bool = False, group_by: Optional[Literal['column', 'symbol']] = 'column') -> Optional[pd.DataFrame]
Retrieve historical or intraday OHLCV data from ArcticDB storage.
Queries previously stored time series data from ArcticDB with flexible date filtering and output formatting options. Supports both long format (MultiIndex) and wide format (pivoted) for different analysis workflows.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbols` | `List[str]` | List of ticker symbols to retrieve (e.g., ['AAPL', 'MSFT', 'GOOGL']). | required |
| `start_date` | `Optional[Union[str, Timestamp]]` | Start date as 'YYYY-MM-DD' string or pd.Timestamp (inclusive). Mutually exclusive with period. | `None` |
| `end_date` | `Optional[Union[str, Timestamp]]` | End date as 'YYYY-MM-DD' string or pd.Timestamp (exclusive). Defaults to current UTC time if not specified. | `None` |
| `period` | `Optional[str]` | Relative time period (e.g., '5d', '3m', '1y'). Mutually exclusive with start_date/end_date. | `None` |
| `columns` | `Optional[List[str]]` | List of specific columns to retrieve (e.g., ['close', 'volume']). The 'symbol' column is always included automatically. If None, all columns are retrieved. | `None` |
| `library_name` | `Optional[str]` | ArcticDB library name to query. If None, uses ARCTICDB_DEFAULT_LIBRARY_NAME from ~/.chronos_lab/.env configuration. | `None` |
| `pivot` | `bool` | If True, reshape to wide format with symbols as columns. If False (default), return long format with MultiIndex (date, symbol). | `False` |
| `group_by` | `Optional[Literal['column', 'symbol']]` | When pivot=True, controls column ordering in MultiIndex. 'column' (default) creates (column, symbol) ordering (e.g., close_AAPL, close_MSFT); 'symbol' creates (symbol, column) ordering (e.g., AAPL_close, AAPL_high). | `'column'` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | If pivot=False: DataFrame with MultiIndex (date, symbol) and columns ['open', 'high', 'low', 'close', 'volume', ...]. |
| `DataFrame` | If pivot=True: DataFrame with DatetimeIndex and MultiIndex columns organized by group_by parameter. |
| `None` | Returned if no data found, invalid parameters, or on error. |
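
The relationship between the long and wide layouts can be reproduced with plain pandas on a toy frame. This sketch only illustrates the two column orderings described for `group_by`; the numbers are made up, and whether the library also flattens the column names (e.g., close_AAPL) is not shown here.

```python
import pandas as pd

# Toy long-format frame shaped like the pivot=False result: MultiIndex (date, symbol).
dates = pd.to_datetime(["2024-01-02", "2024-01-03"], utc=True)
index = pd.MultiIndex.from_product([dates, ["AAPL", "MSFT"]], names=["date", "symbol"])
long_df = pd.DataFrame(
    {"close": [185.0, 370.0, 184.0, 371.0], "volume": [100, 200, 110, 210]},
    index=index,
)

# Wide format with (column, symbol) column levels, analogous to group_by='column'.
by_column = long_df.unstack("symbol")

# Swap the column levels to get (symbol, column), analogous to group_by='symbol'.
by_symbol = by_column.swaplevel(axis=1).sort_index(axis=1)

print(list(by_column.columns))
print(list(by_symbol.columns))
```

The (column, symbol) layout makes it easy to select one field across all symbols (`by_column["close"]`), while the (symbol, column) layout groups every field of one symbol together (`by_symbol["AAPL"]`).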
Raises:
| Type | Description |
|---|---|
| None | Errors are logged but not raised. Check return value for None. |
Examples:
Basic retrieval with relative period:

    >>> from chronos_lab.sources import ohlcv_from_arcticdb
    >>>
    >>> prices = ohlcv_from_arcticdb(
    ...     symbols=['AAPL', 'MSFT', 'GOOGL'],
    ...     period='3m',
    ...     library_name='yfinance'
    ... )
    >>> print(prices.head())
    >>> # Returns MultiIndex (date, symbol) DataFrame

Specify exact date range:

    >>> prices = ohlcv_from_arcticdb(
    ...     symbols=['AAPL', 'MSFT'],
    ...     start_date='2024-01-01',
    ...     end_date='2024-12-31',
    ...     library_name='yfinance'
    ... )

Retrieve only specific columns:

    >>> closes = ohlcv_from_arcticdb(
    ...     symbols=['AAPL', 'MSFT', 'GOOGL'],
    ...     period='1y',
    ...     columns=['close'],
    ...     library_name='yfinance'
    ... )
    >>> # Returns only 'close' and 'symbol' columns

Pivot to wide format for correlation analysis:

    >>> wide_prices = ohlcv_from_arcticdb(
    ...     symbols=['AAPL', 'MSFT', 'GOOGL', 'AMZN'],
    ...     period='1y',
    ...     columns=['close'],
    ...     library_name='yfinance',
    ...     pivot=True,
    ...     group_by='column'
    ... )
    >>> # Creates columns: close_AAPL, close_MSFT, etc.
    >>> returns = wide_prices.pct_change()
    >>> correlation_matrix = returns.corr()

Alternative pivot grouping by symbol:

    >>> wide_prices = ohlcv_from_arcticdb(
    ...     symbols=['AAPL', 'MSFT'],
    ...     period='6mo',
    ...     pivot=True,
    ...     group_by='symbol'
    ... )
    >>> # Creates MultiIndex: (AAPL, close), (AAPL, high), (MSFT, close), etc.

Note
- Period strings: '7d' (days), '4w' (weeks), '3mo'/'3m' (months), '1y' (years)
- All timestamps are UTC timezone-aware
- Data must have been previously stored using ohlcv_to_arcticdb()
- Empty result returns None with warning logged
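
The period string forms listed above can be checked with a small parser before a query is issued. This is a sketch under the assumption that a period is a positive integer followed by one of the listed unit suffixes; `parse_period` is hypothetical and the library's own parsing may accept more or fewer forms.

```python
import re
from typing import Tuple

# Unit suffixes from the note above; 'mo' and 'm' both mean months here.
_UNITS = {"d": "days", "w": "weeks", "mo": "months", "m": "months", "y": "years"}
_PERIOD_RE = re.compile(r"^(\d+)(d|w|mo|m|y)$")


def parse_period(period: str) -> Tuple[int, str]:
    """Split a period string such as '3mo' into (count, unit name)."""
    match = _PERIOD_RE.match(period.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized period string: {period!r}")
    count, unit = match.groups()
    return int(count), _UNITS[unit]


print(parse_period("7d"))   # (7, 'days')
print(parse_period("3m"))   # (3, 'months')
```

Note that in this function's period strings 'm' denotes months, whereas the interval values of the fetch functions use 'm' for minutes (e.g., '5m').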
chronos_lab.sources.securities_from_intrinio
securities_from_intrinio(*, api_key: Optional[str] = None, composite_mic: str = 'USCOMP', codes: List[Literal['EQS', 'ETF', 'DR', 'PRF', 'WAR', 'RTS', 'UNT', 'CEF', 'ETN', 'ETC']] = ['EQS']) -> pd.DataFrame | None
Retrieve a list of securities from the Intrinio API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `api_key` | `Optional[str]` | Intrinio API key. If None, uses default from Intrinio class. | `None` |
| `composite_mic` | `str` | Composite MIC code for the exchange. Defaults to 'USCOMP'. | `'USCOMP'` |
| `codes` | `List[Literal['EQS', 'ETF', 'DR', 'PRF', 'WAR', 'RTS', 'UNT', 'CEF', 'ETN', 'ETC']]` | List of security type codes. Common codes include: 'EQS': Equity Shares (common stocks); 'ETF': Exchange Traded Funds; 'DR': Depositary Receipts (ADRs, GDRs, etc.); 'PRF': Preference Shares (preferred stock); 'WAR': Warrants; 'RTS': Rights; 'UNT': Units; 'CEF': Closed-Ended Funds; 'ETN': Exchange Traded Notes; 'ETC': Exchange Traded Commodities. Defaults to ['EQS']. | `['EQS']` |
Returns:
| Type | Description |
|---|---|
| `DataFrame \| None` | DataFrame with securities indexed by 'id', or None on error. |