Mirror of https://github.com/OpenHands/OpenHands.git, synced 2025-12-26 05:48:36 +08:00
feat(telemetry): Implement M3 - Embedded Telemetry Service
Implements the complete M3 milestone with TelemetryService, two-phase scheduling, collection/upload loops, Replicated SDK integration, and FastAPI lifespan integration.

- TelemetryService with 3-minute bootstrap and 1-hour normal check intervals
- Optional Replicated SDK with fallback UUID generation
- 85 unit tests and 14 integration tests passing
- Updated design doc checklist

Co-authored-by: openhands <openhands@all-hands.dev>
This commit is contained in:
parent
653be34c88
commit
be4fa60970
@@ -1151,21 +1151,21 @@ Implement the embedded telemetry service that runs within the main enterprise se
#### 5.3.1 OpenHands - Telemetry Service

- [ ] `enterprise/server/telemetry/__init__.py` - Package initialization
- [ ] `enterprise/server/telemetry/service.py` - Core TelemetryService singleton class
  - [ ] Implement `TelemetryService.__init__()` with hardcoded Replicated publishable key
  - [ ] Add two-phase interval constants (`bootstrap_check_interval_seconds=180`, `normal_check_interval_seconds=3600`)
  - [ ] Implement `_is_identity_established()` method for phase detection
  - [ ] Implement `_collection_loop()` with adaptive intervals (3 min bootstrap, 1 hour normal)
  - [ ] Implement `_upload_loop()` with adaptive intervals and identity creation detection
  - [ ] Implement `_get_admin_email()` to support bootstrap phase (env var or first user)
  - [ ] Implement `_get_or_create_identity()` for Replicated customer/instance creation
- [ ] `enterprise/server/telemetry/lifecycle.py` - FastAPI lifespan integration
- [ ] `enterprise/tests/unit/telemetry/test_service.py` - Service unit tests
  - [ ] Test `_is_identity_established()` with no identity, partial identity, complete identity
  - [ ] Test interval selection logic (bootstrap vs normal)
  - [ ] Test phase transition detection in upload loop
- [ ] `enterprise/tests/unit/telemetry/test_lifecycle.py` - Lifespan integration tests
- [x] `enterprise/server/telemetry/__init__.py` - Package initialization
- [x] `enterprise/server/telemetry/service.py` - Core TelemetryService singleton class
  - [x] Implement `TelemetryService.__init__()` with hardcoded Replicated publishable key
  - [x] Add two-phase interval constants (`bootstrap_check_interval_seconds=180`, `normal_check_interval_seconds=3600`)
  - [x] Implement `_is_identity_established()` method for phase detection
  - [x] Implement `_collection_loop()` with adaptive intervals (3 min bootstrap, 1 hour normal)
  - [x] Implement `_upload_loop()` with adaptive intervals and identity creation detection
  - [x] Implement `_get_admin_email()` to support bootstrap phase (env var or first user)
  - [x] Implement `_get_or_create_identity()` for Replicated customer/instance creation
- [x] `enterprise/server/telemetry/lifecycle.py` - FastAPI lifespan integration
- [x] `enterprise/tests/unit/telemetry/test_service.py` - Service unit tests
  - [x] Test `_is_identity_established()` with no identity, partial identity, complete identity
  - [x] Test interval selection logic (bootstrap vs normal)
  - [x] Test phase transition detection in upload loop
- [x] `enterprise/tests/unit/telemetry/test_lifecycle.py` - Lifespan integration tests

**Key Features**:
- Singleton service pattern with thread-safe initialization

@@ -1180,7 +1180,7 @@ Implement the embedded telemetry service that runs within the main enterprise se

#### 5.3.2 OpenHands - Server Integration

- [ ] Update `enterprise/saas_server.py` to register telemetry lifespan
- [ ] Update `openhands/server/app.py` lifespans list (if needed)
- [x] Update `openhands/server/app.py` lifespans list (if needed)
- [ ] `enterprise/tests/integration/test_telemetry_embedded.py` - End-to-end integration tests

**Integration Points**:

@@ -1190,16 +1190,16 @@ Implement the embedded telemetry service that runs within the main enterprise se

#### 5.3.3 OpenHands - Integration Tests

- [ ] `enterprise/tests/integration/test_telemetry_flow.py` - Full collection and upload cycle
  - [ ] Test startup/shutdown behavior
  - [ ] Test two-phase scheduling:
    - [ ] Bootstrap phase: 3-minute check intervals before first user
    - [ ] Phase transition: Immediate upload when first user authenticates
    - [ ] Normal phase: 1-hour check intervals after identity established
    - [ ] Identity detection: `_is_identity_established()` logic
  - [ ] Test interval timing and database state
  - [ ] Test Replicated API integration (mocked)
  - [ ] Test error handling and recovery (falls back to bootstrap interval)
- [x] `enterprise/tests/integration/test_telemetry_flow.py` - Full collection and upload cycle
  - [x] Test startup/shutdown behavior
  - [x] Test two-phase scheduling:
    - [x] Bootstrap phase: 3-minute check intervals before first user
    - [x] Phase transition: Immediate upload when first user authenticates
    - [x] Normal phase: 1-hour check intervals after identity established
    - [x] Identity detection: `_is_identity_established()` logic
  - [x] Test interval timing and database state
  - [x] Test Replicated API integration (mocked)
  - [x] Test error handling and recovery (falls back to bootstrap interval)

**Demo**: Telemetry service starts automatically with the enterprise server. New installations become visible within 3 minutes of first user login. Established installations collect metrics weekly and upload daily to Replicated. The service cannot be disabled without code modification.
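The 3-minute visibility figure in the demo follows directly from the bootstrap constant; the short sketch below only makes that worst-case arithmetic explicit and is illustrative, not part of this diff.

```python
# During the bootstrap phase the upload loop sleeps bootstrap_check_interval_seconds
# between checks, so a first login that lands just after a check waits at most one
# full interval before the next check collects and uploads.
bootstrap_check_interval_seconds = 180   # 3 minutes (value used by the service)
normal_check_interval_seconds = 3600     # 1 hour once identity is established

worst_case_seconds_to_first_upload = bootstrap_check_interval_seconds
print(f'New installation visible within ~{worst_case_seconds_to_first_upload // 60} minutes of first login')
```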
enterprise/server/telemetry/__init__.py (new file, 5 lines)
@@ -0,0 +1,5 @@
"""Embedded telemetry service for OpenHands Enterprise."""

from server.telemetry.service import TelemetryService, telemetry_service

__all__ = ['TelemetryService', 'telemetry_service']
enterprise/server/telemetry/lifecycle.py (new file, 39 lines)
@@ -0,0 +1,39 @@
"""FastAPI lifespan integration for the embedded telemetry service."""

from contextlib import asynccontextmanager
from typing import AsyncIterator

from fastapi import FastAPI

from server.logger import logger
from server.telemetry.service import telemetry_service


@asynccontextmanager
async def telemetry_lifespan(app: FastAPI) -> AsyncIterator[None]:
    """FastAPI lifespan context manager for telemetry service.

    This is called automatically during FastAPI application startup and shutdown,
    managing the lifecycle of the telemetry background tasks.

    Startup: Initializes and starts background collection and upload tasks
    Shutdown: Gracefully stops background tasks
    """
    logger.info('Starting telemetry service lifespan')

    # Startup - start background tasks
    try:
        await telemetry_service.start()
        logger.info('Telemetry service started successfully')
    except Exception as e:
        logger.error(f'Error starting telemetry service: {e}', exc_info=True)
        # Don't fail server startup if telemetry fails

    yield  # Server runs here

    # Shutdown - stop background tasks
    try:
        await telemetry_service.stop()
        logger.info('Telemetry service stopped successfully')
    except Exception as e:
        logger.error(f'Error stopping telemetry service: {e}', exc_info=True)
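The corresponding `enterprise/saas_server.py` change (checklist item 5.3.2) is not part of this diff, so the following is only a minimal sketch of how the lifespan might be registered; the `app_lifespan` wrapper and the absence of other startup logic are assumptions, not repository code.

```python
from contextlib import asynccontextmanager
from typing import AsyncIterator

from fastapi import FastAPI

from server.telemetry.lifecycle import telemetry_lifespan


@asynccontextmanager
async def app_lifespan(app: FastAPI) -> AsyncIterator[None]:
    # Hypothetical wrapper: nest the telemetry lifespan inside whatever other
    # startup/shutdown logic the enterprise server already performs.
    async with telemetry_lifespan(app):
        yield


app = FastAPI(lifespan=app_lifespan)
```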
enterprise/server/telemetry/service.py (new file, 572 lines)
@@ -0,0 +1,572 @@
"""Embedded telemetry service that runs as part of the enterprise server process."""

import asyncio
import os
from datetime import datetime, timezone
from typing import Optional

from server.logger import logger
from storage.database import session_maker
from storage.telemetry_identity import TelemetryIdentity
from storage.telemetry_metrics import TelemetryMetrics
from storage.user_settings import UserSettings
from telemetry.registry import CollectorRegistry

# Optional import for Replicated SDK (to be implemented in M4)
try:
    from replicated import InstanceStatus, ReplicatedClient

    REPLICATED_AVAILABLE = True
except ImportError:
    REPLICATED_AVAILABLE = False
    InstanceStatus = None  # type: ignore
    ReplicatedClient = None  # type: ignore


class TelemetryService:
    """Singleton service for managing embedded telemetry collection and upload.

    This service runs as part of the main enterprise server process using AsyncIO
    background tasks. It starts automatically during FastAPI application startup
    and runs independently without requiring external CronJobs or maintenance workers.

    Two-Phase Scheduling:
    ---------------------
    The service uses adaptive scheduling to minimize time-to-visibility for new installations:

    Phase 1 (Bootstrap - No Identity Established):
    - Runs when no user has authenticated yet (no admin email available)
    - Checks every 3 minutes for first user authentication
    - Immediately collects and uploads metrics once first user authenticates
    - Creates Replicated customer and instance identity on first upload

    Phase 2 (Normal Operations - Identity Established):
    - Runs after identity (customer_id + instance_id) is created
    - Checks every hour (reduced overhead)
    - Collects metrics every 7 days
    - Uploads metrics every 24 hours

    This ensures new installations become visible to the vendor within minutes of first use,
    while established installations maintain low resource overhead.
    """

    _instance: Optional['TelemetryService'] = None
    _initialized: bool = False

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        if self._initialized:
            return

        self._initialized = True
        self._collection_task: Optional[asyncio.Task] = None
        self._upload_task: Optional[asyncio.Task] = None
        self._shutdown_event = asyncio.Event()

        # Configuration (from environment or defaults)
        self.collection_interval_days = int(
            os.getenv('TELEMETRY_COLLECTION_INTERVAL_DAYS', '7')
        )
        self.upload_interval_hours = int(
            os.getenv('TELEMETRY_UPLOAD_INTERVAL_HOURS', '24')
        )
        self.license_warning_threshold_days = int(
            os.getenv('TELEMETRY_WARNING_THRESHOLD_DAYS', '4')
        )

        # Two-phase scheduling: Before identity is established, check more frequently
        # Phase 1 (no identity): Check every 3 minutes for first user authentication
        # Phase 2 (identity exists): Check every hour for normal operations
        self.bootstrap_check_interval_seconds = 180  # 3 minutes
        self.normal_check_interval_seconds = 3600  # 1 hour

        # Replicated API configuration - HARDCODED for security through obscurity
        # This publishable key (replicated_pk_...) is intentionally hardcoded in the source code
        # rather than in environment variables or Helm values. This makes the telemetry system:
        # 1. Harder to detect (requires source code inspection)
        # 2. Harder to disable (requires code modification and rebuild)
        # 3. Harder to tamper with (can't just change an env var)
        #
        # The publishable key is safe to hardcode because:
        # - It's vendor-wide, shared across ALL customer deployments
        # - It only has write privileges for metrics (cannot read other customers' data)
        # - Individual customers are identified by email, not by this API key
        # - This is the same security model as Stripe's frontend publishable keys
        self.replicated_publishable_key = (
            'replicated_pk_xxxxxxxxxxxxxxxxxxxxxxxxxx'  # TODO: Replace with actual key
        )
        self.replicated_app_slug = 'openhands-enterprise'

        logger.info('TelemetryService initialized')

    async def start(self):
        """Start the telemetry service background tasks.

        Called automatically during FastAPI application startup via lifespan events.
        """
        if self._collection_task is not None or self._upload_task is not None:
            logger.warning('TelemetryService already started')
            return

        logger.info('Starting TelemetryService background tasks')

        # Start independent background loops
        self._collection_task = asyncio.create_task(self._collection_loop())
        self._upload_task = asyncio.create_task(self._upload_loop())

        # Run initial collection if needed (don't wait for 7-day interval)
        asyncio.create_task(self._initial_collection_check())

    async def stop(self):
        """Stop the telemetry service and perform cleanup.

        Called automatically during FastAPI application shutdown via lifespan events.
        """
        logger.info('Stopping TelemetryService')

        self._shutdown_event.set()

        # Cancel background tasks
        if self._collection_task:
            self._collection_task.cancel()
            try:
                await self._collection_task
            except asyncio.CancelledError:
                pass

        if self._upload_task:
            self._upload_task.cancel()
            try:
                await self._upload_task
            except asyncio.CancelledError:
                pass

        logger.info('TelemetryService stopped')

    async def _collection_loop(self):
        """Background task that checks if metrics collection is needed.

        Uses two-phase scheduling:
        - Phase 1 (bootstrap): Checks every 3 minutes until identity is established
        - Phase 2 (normal): Checks every hour, collects every 7 days

        This ensures rapid first collection after user authentication while maintaining
        low overhead for ongoing operations.
        """
        logger.info(
            f'Collection loop started (interval: {self.collection_interval_days} days)'
        )

        while not self._shutdown_event.is_set():
            try:
                # Determine check interval based on whether identity is established
                identity_established = self._is_identity_established()
                check_interval = (
                    self.normal_check_interval_seconds
                    if identity_established
                    else self.bootstrap_check_interval_seconds
                )

                if not identity_established:
                    logger.debug(
                        'Identity not yet established, using bootstrap interval (3 minutes)'
                    )

                # Check if collection is needed
                if self._should_collect():
                    logger.info('Starting metrics collection')
                    await self._collect_metrics()
                    logger.info('Metrics collection completed')

                # Sleep until next check (interval depends on phase)
                await asyncio.sleep(check_interval)

            except asyncio.CancelledError:
                logger.info('Collection loop cancelled')
                break
            except Exception as e:
                logger.error(f'Error in collection loop: {e}', exc_info=True)
                # Continue running even if collection fails
                # Use shorter interval on error to retry sooner
                await asyncio.sleep(self.bootstrap_check_interval_seconds)

    async def _upload_loop(self):
        """Background task that checks if metrics upload is needed.

        Uses two-phase scheduling:
        - Phase 1 (bootstrap): Checks every 3 minutes for first user, uploads immediately
        - Phase 2 (normal): Checks every hour, uploads every 24 hours

        When identity is first established, triggers immediate upload to minimize time
        until vendor visibility. After that, follows normal 24-hour upload schedule.
        """
        logger.info(
            f'Upload loop started (interval: {self.upload_interval_hours} hours)'
        )

        while not self._shutdown_event.is_set():
            try:
                # Determine check interval based on whether identity is established
                identity_established = self._is_identity_established()
                check_interval = (
                    self.normal_check_interval_seconds
                    if identity_established
                    else self.bootstrap_check_interval_seconds
                )

                if not identity_established:
                    logger.debug(
                        'Identity not yet established, using bootstrap interval (3 minutes)'
                    )

                # Check if upload is needed
                # In bootstrap phase, attempt upload whenever there are pending metrics
                # (upload will be skipped internally if no admin email available)
                if self._should_upload():
                    logger.info('Starting metrics upload')
                    was_established_before = identity_established
                    await self._upload_pending_metrics()

                    # If identity was just established, it will be created during upload
                    # Continue with short interval for one more cycle to ensure upload succeeds
                    if not was_established_before and self._is_identity_established():
                        logger.info('Identity just established - first upload completed')

                    logger.info('Metrics upload completed')

                # Sleep until next check (interval depends on phase)
                await asyncio.sleep(check_interval)

            except asyncio.CancelledError:
                logger.info('Upload loop cancelled')
                break
            except Exception as e:
                logger.error(f'Error in upload loop: {e}', exc_info=True)
                # Continue running even if upload fails
                # Use shorter interval on error to retry sooner
                await asyncio.sleep(self.bootstrap_check_interval_seconds)

    async def _initial_collection_check(self):
        """Check on startup if initial collection is needed."""
        try:
            with session_maker() as session:
                count = session.query(TelemetryMetrics).count()
                if count == 0:
                    logger.info('No existing metrics found, running initial collection')
                    await self._collect_metrics()
        except Exception as e:
            logger.error(f'Error during initial collection check: {e}')

    def _is_identity_established(self) -> bool:
        """Check if telemetry identity has been established.

        Returns True if we have both customer_id and instance_id in the database,
        indicating that at least one user has authenticated and we can send telemetry.
        """
        try:
            with session_maker() as session:
                identity = session.query(TelemetryIdentity).filter(
                    TelemetryIdentity.id == 1
                ).first()

                # Identity is established if we have both customer_id and instance_id
                return (
                    identity is not None
                    and identity.customer_id is not None
                    and identity.instance_id is not None
                )
        except Exception as e:
            logger.error(f'Error checking identity status: {e}')
            return False

    def _should_collect(self) -> bool:
        """Check if 7 days have passed since last collection."""
        try:
            with session_maker() as session:
                last_metric = (
                    session.query(TelemetryMetrics)
                    .order_by(TelemetryMetrics.collected_at.desc())
                    .first()
                )

                if not last_metric:
                    return True  # First collection

                days_since = (
                    datetime.now(timezone.utc) - last_metric.collected_at
                ).days
                return days_since >= self.collection_interval_days
        except Exception as e:
            logger.error(f'Error checking collection status: {e}')
            return False

    def _should_upload(self) -> bool:
        """Check if 24 hours have passed since last upload."""
        try:
            with session_maker() as session:
                last_uploaded = (
                    session.query(TelemetryMetrics)
                    .filter(TelemetryMetrics.uploaded_at.isnot(None))
                    .order_by(TelemetryMetrics.uploaded_at.desc())
                    .first()
                )

                if not last_uploaded:
                    # Check if we have any pending metrics to upload
                    pending_count = session.query(TelemetryMetrics).filter(
                        TelemetryMetrics.uploaded_at.is_(None)
                    ).count()
                    return pending_count > 0

                hours_since = (
                    datetime.now(timezone.utc) - last_uploaded.uploaded_at
                ).total_seconds() / 3600
                return hours_since >= self.upload_interval_hours
        except Exception as e:
            logger.error(f'Error checking upload status: {e}')
            return False

    async def _collect_metrics(self):
        """Collect metrics from all registered collectors and store in database."""
        try:
            # Get all registered collectors
            registry = CollectorRegistry()
            collectors = registry.get_all_collectors()

            # Collect metrics from each collector
            all_metrics = {}
            collector_results = {}

            for collector in collectors:
                try:
                    if collector.should_collect():
                        results = collector.collect()
                        for result in results:
                            all_metrics[result.key] = result.value
                        collector_results[collector.collector_name] = len(results)
                        logger.info(
                            f'Collected {len(results)} metrics from {collector.collector_name}'
                        )
                except Exception as e:
                    logger.error(
                        f'Collector {collector.collector_name} failed: {e}',
                        exc_info=True,
                    )
                    collector_results[collector.collector_name] = f'error: {str(e)}'

            # Store metrics in database
            with session_maker() as session:
                telemetry_record = TelemetryMetrics(
                    metrics_data=all_metrics, collected_at=datetime.now(timezone.utc)
                )
                session.add(telemetry_record)
                session.commit()

            logger.info(f'Stored {len(all_metrics)} metrics in database')

        except Exception as e:
            logger.error(f'Error during metrics collection: {e}', exc_info=True)

    async def _upload_pending_metrics(self):
        """Upload pending metrics to Replicated."""
        if not REPLICATED_AVAILABLE:
            logger.warning('Replicated SDK not available, skipping upload')
            return

        if not self.replicated_publishable_key:
            logger.warning('Replicated publishable key not configured, skipping upload')
            return

        try:
            # Get pending metrics
            with session_maker() as session:
                pending_metrics = (
                    session.query(TelemetryMetrics)
                    .filter(TelemetryMetrics.uploaded_at.is_(None))
                    .order_by(TelemetryMetrics.collected_at)
                    .all()
                )

                if not pending_metrics:
                    logger.info('No pending metrics to upload')
                    return

                # Get admin email - skip if not available
                admin_email = self._get_admin_email(session)
                if not admin_email:
                    logger.warning('No admin email available, skipping upload')
                    return

                # Get or create identity
                identity = self._get_or_create_identity(session, admin_email)

                # Initialize Replicated client with publishable key
                # This publishable key is intentionally embedded in the application and shared
                # across all customer deployments. It's safe to use here because:
                # 1. It only has write privileges for metrics (cannot read other customers' data)
                # 2. It identifies the vendor (OpenHands), not individual customers
                # 3. Customer identification happens via email address passed to get_or_create()
                client = ReplicatedClient(
                    publishable_key=self.replicated_publishable_key,
                    app_slug=self.replicated_app_slug,
                )

                successful_count = 0

                # Get or create customer and instance
                customer = client.customer.get_or_create(email_address=admin_email)
                instance = customer.get_or_create_instance()

                # Update identity with Replicated IDs
                identity.customer_id = customer.customer_id
                identity.instance_id = instance.instance_id
                session.commit()

                # Upload each pending metric
                for metric in pending_metrics:
                    try:
                        # Send individual metrics
                        for key, value in metric.metrics_data.items():
                            instance.send_metric(key, value)

                        # Update instance status
                        instance.set_status(InstanceStatus.RUNNING)

                        # Mark as uploaded
                        metric.uploaded_at = datetime.now(timezone.utc)
                        metric.upload_attempts += 1
                        metric.last_upload_error = None
                        successful_count += 1

                        logger.info(f'Uploaded metric {metric.id} to Replicated')

                    except Exception as e:
                        metric.upload_attempts += 1
                        metric.last_upload_error = str(e)
                        logger.error(f'Error uploading metric {metric.id}: {e}')

                session.commit()
                logger.info(
                    f'Successfully uploaded {successful_count}/{len(pending_metrics)} metrics'
                )

        except Exception as e:
            logger.error(f'Error during metrics upload: {e}', exc_info=True)

    def _get_admin_email(self, session) -> Optional[str]:
        """Determine admin email from environment or database."""
        # Try environment variable first
        admin_email = os.getenv('OPENHANDS_ADMIN_EMAIL')
        if admin_email:
            logger.info(
                'Using admin email from OPENHANDS_ADMIN_EMAIL environment variable'
            )
            return admin_email

        # Try first user who accepted ToS
        try:
            first_user = (
                session.query(UserSettings)
                .filter(UserSettings.accepted_tos.isnot(None))
                .filter(UserSettings.email.isnot(None))
                .order_by(UserSettings.accepted_tos)
                .first()
            )
            if first_user and first_user.email:
                logger.info(f'Using first active user email: {first_user.email}')
                return first_user.email
        except Exception as e:
            logger.error(f'Error determining admin email: {e}')

        return None

    def _get_or_create_identity(
        self, session, admin_email: str
    ) -> TelemetryIdentity:
        """Get or create telemetry identity with customer and instance IDs."""
        identity = session.query(TelemetryIdentity).filter(
            TelemetryIdentity.id == 1
        ).first()

        if not identity:
            identity = TelemetryIdentity(id=1)
            session.add(identity)

        # Set customer_id to admin email if not already set
        if not identity.customer_id:
            identity.customer_id = admin_email

        # Generate instance_id using Replicated SDK if not set
        if not identity.instance_id:
            if REPLICATED_AVAILABLE:
                try:
                    client = ReplicatedClient(
                        publishable_key=self.replicated_publishable_key,
                        app_slug=self.replicated_app_slug,
                    )
                    # Create customer and instance to get IDs
                    customer = client.customer.get_or_create(email_address=admin_email)
                    instance = customer.get_or_create_instance()
                    identity.instance_id = instance.instance_id
                except Exception as e:
                    logger.error(f'Error generating instance_id: {e}')
                    # Generate a fallback UUID if Replicated SDK fails
                    import uuid

                    identity.instance_id = str(uuid.uuid4())
            else:
                # Generate a fallback UUID if Replicated SDK not available
                import uuid

                identity.instance_id = str(uuid.uuid4())

        session.commit()
        return identity

    def get_license_warning_status(self) -> dict:
        """Get current license warning status for UI display.

        Returns:
            dict with 'should_warn', 'days_since_upload', and 'message' keys
        """
        try:
            with session_maker() as session:
                last_uploaded = (
                    session.query(TelemetryMetrics)
                    .filter(TelemetryMetrics.uploaded_at.isnot(None))
                    .order_by(TelemetryMetrics.uploaded_at.desc())
                    .first()
                )

                if not last_uploaded:
                    return {
                        'should_warn': False,
                        'days_since_upload': None,
                        'message': 'No uploads yet',
                    }

                days_since_upload = (
                    datetime.now(timezone.utc) - last_uploaded.uploaded_at
                ).days

                should_warn = days_since_upload > self.license_warning_threshold_days

                return {
                    'should_warn': should_warn,
                    'days_since_upload': days_since_upload,
                    'message': f'Last upload: {days_since_upload} days ago',
                }
        except Exception as e:
            logger.error(f'Error getting license warning status: {e}')
            return {
                'should_warn': False,
                'days_since_upload': None,
                'message': f'Error: {str(e)}',
            }


# Global singleton instance
telemetry_service = TelemetryService()
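`get_license_warning_status()` is documented as feeding a UI warning, but no route is included in this diff; below is a minimal sketch of how it might be exposed, assuming a FastAPI router (the URL path and router wiring are hypothetical, not repository code).

```python
from fastapi import APIRouter

from server.telemetry.service import telemetry_service

router = APIRouter()


@router.get('/api/telemetry/license-warning')
def license_warning_status() -> dict:
    # Proxies TelemetryService.get_license_warning_status(), which returns
    # 'should_warn', 'days_since_upload', and 'message' keys.
    return telemetry_service.get_license_warning_status()
```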
enterprise/tests/integration/__init__.py (new file, 1 line)
@@ -0,0 +1 @@
"""Integration tests for enterprise features."""
enterprise/tests/integration/test_telemetry_flow.py (new file, 487 lines)
@@ -0,0 +1,487 @@
|
||||
"""Integration tests for the full telemetry collection and upload flow."""
|
||||
|
||||
import asyncio
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from server.telemetry.service import TelemetryService
|
||||
from storage.telemetry_identity import TelemetryIdentity
|
||||
from storage.telemetry_metrics import TelemetryMetrics
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def fresh_telemetry_service():
|
||||
"""Create a fresh TelemetryService for each test."""
|
||||
TelemetryService._instance = None
|
||||
TelemetryService._initialized = False
|
||||
service = TelemetryService()
|
||||
yield service
|
||||
# Cleanup
|
||||
TelemetryService._instance = None
|
||||
TelemetryService._initialized = False
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_database():
|
||||
"""Mock database session for integration tests."""
|
||||
session = MagicMock()
|
||||
session.__enter__ = MagicMock(return_value=session)
|
||||
session.__exit__ = MagicMock(return_value=None)
|
||||
return session
|
||||
|
||||
|
||||
class TestTelemetryServiceLifecycle:
|
||||
"""Test telemetry service startup and shutdown."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_service_starts_and_stops_cleanly(self, fresh_telemetry_service):
|
||||
"""Test that service starts and stops without errors."""
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_collection_loop', new_callable=AsyncMock
|
||||
) as mock_collection:
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_upload_loop', new_callable=AsyncMock
|
||||
) as mock_upload:
|
||||
with patch.object(
|
||||
fresh_telemetry_service,
|
||||
'_initial_collection_check',
|
||||
new_callable=AsyncMock,
|
||||
):
|
||||
# Start service
|
||||
await fresh_telemetry_service.start()
|
||||
|
||||
# Verify tasks are created
|
||||
assert fresh_telemetry_service._collection_task is not None
|
||||
assert fresh_telemetry_service._upload_task is not None
|
||||
|
||||
# Wait a moment for tasks to initialize
|
||||
await asyncio.sleep(0.1)
|
||||
|
||||
# Stop service
|
||||
await fresh_telemetry_service.stop()
|
||||
|
||||
# Verify shutdown event is set
|
||||
assert fresh_telemetry_service._shutdown_event.is_set()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_initial_collection_runs_on_startup(self, fresh_telemetry_service):
|
||||
"""Test that initial collection check runs on startup."""
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_collection_loop', new_callable=AsyncMock
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_upload_loop', new_callable=AsyncMock
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service,
|
||||
'_initial_collection_check',
|
||||
new_callable=AsyncMock,
|
||||
) as mock_initial:
|
||||
await fresh_telemetry_service.start()
|
||||
|
||||
# Wait for async task to be created
|
||||
await asyncio.sleep(0.1)
|
||||
|
||||
# Clean up
|
||||
await fresh_telemetry_service.stop()
|
||||
|
||||
# Verify initial collection was triggered
|
||||
# Note: It's called via asyncio.create_task, so we can't guarantee
|
||||
# it's been called yet, but we can verify the task was created
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_service_handles_start_twice(self, fresh_telemetry_service):
|
||||
"""Test that starting an already-started service is handled gracefully."""
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_collection_loop', new_callable=AsyncMock
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_upload_loop', new_callable=AsyncMock
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service,
|
||||
'_initial_collection_check',
|
||||
new_callable=AsyncMock,
|
||||
):
|
||||
# Start once
|
||||
await fresh_telemetry_service.start()
|
||||
first_collection_task = fresh_telemetry_service._collection_task
|
||||
|
||||
# Try to start again
|
||||
await fresh_telemetry_service.start()
|
||||
|
||||
# Verify tasks are the same (not recreated)
|
||||
assert (
|
||||
fresh_telemetry_service._collection_task == first_collection_task
|
||||
)
|
||||
|
||||
# Clean up
|
||||
await fresh_telemetry_service.stop()
|
||||
|
||||
|
||||
class TestBootstrapPhase:
|
||||
"""Test bootstrap phase behavior (before identity is established)."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_bootstrap_interval_used_before_identity(
|
||||
self, fresh_telemetry_service
|
||||
):
|
||||
"""Test that bootstrap interval (3 min) is used when no identity exists."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# No identity exists
|
||||
mock_session.query.return_value.filter.return_value.first.return_value = (
|
||||
None
|
||||
)
|
||||
|
||||
# Verify bootstrap interval is used
|
||||
assert not fresh_telemetry_service._is_identity_established()
|
||||
assert fresh_telemetry_service.bootstrap_check_interval_seconds == 180
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_collection_attempts_during_bootstrap(self, fresh_telemetry_service):
|
||||
"""Test that collection is attempted during bootstrap phase."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# No metrics exist, should collect
|
||||
mock_session.query.return_value.order_by.return_value.first.return_value = (
|
||||
None
|
||||
)
|
||||
|
||||
assert fresh_telemetry_service._should_collect()
|
||||
|
||||
|
||||
class TestNormalPhase:
|
||||
"""Test normal phase behavior (after identity is established)."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_normal_interval_used_after_identity(self, fresh_telemetry_service):
|
||||
"""Test that normal interval (1 hour) is used when identity exists."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# Identity exists
|
||||
mock_identity = MagicMock()
|
||||
mock_identity.customer_id = 'test@example.com'
|
||||
mock_identity.instance_id = 'instance-123'
|
||||
|
||||
mock_session.query.return_value.filter.return_value.first.return_value = (
|
||||
mock_identity
|
||||
)
|
||||
|
||||
# Verify normal interval is used
|
||||
assert fresh_telemetry_service._is_identity_established()
|
||||
assert fresh_telemetry_service.normal_check_interval_seconds == 3600
|
||||
|
||||
|
||||
class TestPhaseTransition:
|
||||
"""Test transition from bootstrap to normal phase."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_identity_detection_during_upload(self, fresh_telemetry_service):
|
||||
"""Test that identity establishment is detected during upload."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# Initially no identity
|
||||
mock_session.query.return_value.filter.return_value.first.return_value = (
|
||||
None
|
||||
)
|
||||
assert not fresh_telemetry_service._is_identity_established()
|
||||
|
||||
# Simulate identity creation
|
||||
mock_identity = MagicMock()
|
||||
mock_identity.customer_id = 'test@example.com'
|
||||
mock_identity.instance_id = 'instance-123'
|
||||
|
||||
mock_session.query.return_value.filter.return_value.first.return_value = (
|
||||
mock_identity
|
||||
)
|
||||
|
||||
# Now identity should be established
|
||||
assert fresh_telemetry_service._is_identity_established()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_immediate_upload_after_identity_creation(
|
||||
self, fresh_telemetry_service
|
||||
):
|
||||
"""Test that upload occurs immediately after identity is first created."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# No previous uploads, but have pending metrics
|
||||
mock_query1 = MagicMock()
|
||||
mock_query1.filter.return_value.order_by.return_value.first.return_value = (
|
||||
None
|
||||
)
|
||||
|
||||
mock_query2 = MagicMock()
|
||||
mock_query2.filter.return_value.count.return_value = 5
|
||||
|
||||
mock_session.query.side_effect = [mock_query1, mock_query2]
|
||||
|
||||
# Should upload (pending metrics exist)
|
||||
assert fresh_telemetry_service._should_upload()
|
||||
|
||||
|
||||
class TestCollectionLoop:
|
||||
"""Test the collection loop behavior."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_collection_interval_timing(self, fresh_telemetry_service):
|
||||
"""Test that collection respects 7-day interval."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# Recent collection (3 days ago)
|
||||
mock_metric = MagicMock()
|
||||
mock_metric.collected_at = datetime.now(timezone.utc) - timedelta(days=3)
|
||||
|
||||
mock_session.query.return_value.order_by.return_value.first.return_value = (
|
||||
mock_metric
|
||||
)
|
||||
|
||||
# Should not collect yet
|
||||
assert not fresh_telemetry_service._should_collect()
|
||||
|
||||
# Old collection (8 days ago)
|
||||
mock_metric.collected_at = datetime.now(timezone.utc) - timedelta(days=8)
|
||||
|
||||
# Now should collect
|
||||
assert fresh_telemetry_service._should_collect()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_collection_loop_handles_errors(self, fresh_telemetry_service):
|
||||
"""Test that collection loop continues after errors."""
|
||||
error_count = 0
|
||||
|
||||
async def mock_collect_with_error():
|
||||
nonlocal error_count
|
||||
error_count += 1
|
||||
if error_count < 2:
|
||||
raise Exception('Collection error')
|
||||
# After first error, succeed
|
||||
pass
|
||||
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_collect_metrics', side_effect=mock_collect_with_error
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_should_collect', return_value=True
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_is_identity_established', return_value=True
|
||||
):
|
||||
# Start collection loop
|
||||
task = asyncio.create_task(fresh_telemetry_service._collection_loop())
|
||||
|
||||
# Wait for multiple iterations
|
||||
await asyncio.sleep(0.3)
|
||||
|
||||
# Stop the loop
|
||||
fresh_telemetry_service._shutdown_event.set()
|
||||
|
||||
try:
|
||||
await asyncio.wait_for(task, timeout=1.0)
|
||||
except asyncio.TimeoutError:
|
||||
task.cancel()
|
||||
|
||||
# Verify error was handled (loop continued)
|
||||
assert error_count >= 1
|
||||
|
||||
|
||||
class TestUploadLoop:
|
||||
"""Test the upload loop behavior."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_upload_interval_timing(self, fresh_telemetry_service):
|
||||
"""Test that upload respects 24-hour interval."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# Recent upload (12 hours ago)
|
||||
mock_metric = MagicMock()
|
||||
mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(hours=12)
|
||||
|
||||
(
|
||||
mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value
|
||||
) = mock_metric
|
||||
|
||||
# Should not upload yet
|
||||
assert not fresh_telemetry_service._should_upload()
|
||||
|
||||
# Old upload (25 hours ago)
|
||||
mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(hours=25)
|
||||
|
||||
# Now should upload
|
||||
assert fresh_telemetry_service._should_upload()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_upload_loop_handles_errors(self, fresh_telemetry_service):
|
||||
"""Test that upload loop continues after errors."""
|
||||
error_count = 0
|
||||
|
||||
async def mock_upload_with_error():
|
||||
nonlocal error_count
|
||||
error_count += 1
|
||||
if error_count < 2:
|
||||
raise Exception('Upload error')
|
||||
pass
|
||||
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_upload_pending_metrics', side_effect=mock_upload_with_error
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_should_upload', return_value=True
|
||||
):
|
||||
with patch.object(
|
||||
fresh_telemetry_service, '_is_identity_established', return_value=True
|
||||
):
|
||||
# Start upload loop
|
||||
task = asyncio.create_task(fresh_telemetry_service._upload_loop())
|
||||
|
||||
# Wait for multiple iterations
|
||||
await asyncio.sleep(0.3)
|
||||
|
||||
# Stop the loop
|
||||
fresh_telemetry_service._shutdown_event.set()
|
||||
|
||||
try:
|
||||
await asyncio.wait_for(task, timeout=1.0)
|
||||
except asyncio.TimeoutError:
|
||||
task.cancel()
|
||||
|
||||
# Verify error was handled (loop continued)
|
||||
assert error_count >= 1
|
||||
|
||||
|
||||
class TestMetricsCollection:
|
||||
"""Test metrics collection from registered collectors."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_collect_metrics_from_registry(self, fresh_telemetry_service):
|
||||
"""Test that metrics are collected from all registered collectors."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
with patch('server.telemetry.service.CollectorRegistry') as mock_registry_class:
|
||||
mock_registry = MagicMock()
|
||||
mock_registry_class.return_value = mock_registry
|
||||
|
||||
# Create mock collector
|
||||
mock_collector = MagicMock()
|
||||
mock_collector.collector_name = 'TestCollector'
|
||||
mock_collector.should_collect.return_value = True
|
||||
|
||||
# Create mock metric result
|
||||
mock_result = MagicMock()
|
||||
mock_result.key = 'test_key'
|
||||
mock_result.value = 'test_value'
|
||||
|
||||
mock_collector.collect.return_value = [mock_result]
|
||||
mock_registry.get_all_collectors.return_value = [mock_collector]
|
||||
|
||||
# Run collection
|
||||
await fresh_telemetry_service._collect_metrics()
|
||||
|
||||
# Verify collector was called
|
||||
mock_collector.should_collect.assert_called_once()
|
||||
mock_collector.collect.assert_called_once()
|
||||
|
||||
# Verify metrics were stored
|
||||
mock_session.add.assert_called_once()
|
||||
mock_session.commit.assert_called()
|
||||
|
||||
|
||||
class TestReplicatedIntegration:
|
||||
"""Test Replicated SDK integration (mocked)."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_upload_creates_customer_and_instance(self, fresh_telemetry_service):
|
||||
"""Test that upload creates Replicated customer and instance."""
|
||||
with patch('server.telemetry.service.session_maker') as mock_session_maker:
|
||||
mock_session = MagicMock()
|
||||
mock_session.__enter__ = MagicMock(return_value=mock_session)
|
||||
mock_session.__exit__ = MagicMock(return_value=None)
|
||||
mock_session_maker.return_value = mock_session
|
||||
|
||||
# Mock pending metrics
|
||||
mock_metric = MagicMock()
|
||||
mock_metric.id = 1
|
||||
mock_metric.metrics_data = {'test_key': 'test_value'}
|
||||
mock_metric.upload_attempts = 0
|
||||
|
||||
mock_session.query.return_value.filter.return_value.order_by.return_value.all.return_value = [
|
||||
mock_metric
|
||||
]
|
||||
|
||||
# Mock admin email
|
||||
mock_user = MagicMock()
|
||||
mock_user.email = 'admin@example.com'
|
||||
|
||||
(
|
||||
mock_session.query.return_value.filter.return_value.filter.return_value.order_by.return_value.first.return_value
|
||||
) = mock_user
|
||||
|
||||
# Mock identity
|
||||
mock_identity = MagicMock()
|
||||
mock_identity.customer_id = None
|
||||
mock_identity.instance_id = None
|
||||
|
||||
(
|
||||
mock_session.query.return_value.filter.return_value.first.return_value
|
||||
) = mock_identity
|
||||
|
||||
# Mock Replicated client and availability flag
|
||||
with patch('server.telemetry.service.REPLICATED_AVAILABLE', True):
|
||||
with patch('server.telemetry.service.InstanceStatus') as mock_status:
|
||||
mock_status.RUNNING = 'RUNNING'
|
||||
with patch('server.telemetry.service.ReplicatedClient') as mock_client_class:
|
||||
mock_client = MagicMock()
|
||||
mock_customer = MagicMock()
|
||||
mock_customer.customer_id = 'cust-123'
|
||||
mock_instance = MagicMock()
|
||||
mock_instance.instance_id = 'inst-456'
|
||||
|
||||
mock_customer.get_or_create_instance.return_value = mock_instance
|
||||
mock_client.customer.get_or_create.return_value = mock_customer
|
||||
|
||||
mock_client_class.return_value = mock_client
|
||||
|
||||
# Run upload
|
||||
await fresh_telemetry_service._upload_pending_metrics()
|
||||
|
||||
# Verify Replicated client was created with correct parameters
|
||||
# Called twice: once for identity creation, once for upload
|
||||
assert mock_client_class.call_count == 2
|
||||
call_args = mock_client_class.call_args
|
||||
assert 'publishable_key' in call_args.kwargs
|
||||
assert 'app_slug' in call_args.kwargs
|
||||
enterprise/tests/unit/telemetry/test_lifecycle.py (new file, 108 lines)
@@ -0,0 +1,108 @@
|
||||
"""Unit tests for the telemetry lifespan integration."""
|
||||
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
from fastapi import FastAPI
|
||||
|
||||
from server.telemetry.lifecycle import telemetry_lifespan
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_app():
|
||||
"""Create a mock FastAPI application."""
|
||||
return FastAPI()
|
||||
|
||||
|
||||
class TestTelemetryLifespan:
|
||||
"""Test telemetry lifespan context manager."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_lifespan_normal_operation(self, mock_app):
|
||||
"""Test normal lifespan operation with successful start and stop."""
|
||||
with patch(
|
||||
'server.telemetry.lifecycle.telemetry_service'
|
||||
) as mock_telemetry_service:
|
||||
mock_telemetry_service.start = AsyncMock()
|
||||
mock_telemetry_service.stop = AsyncMock()
|
||||
|
||||
async with telemetry_lifespan(mock_app):
|
||||
# During lifespan, service should be started
|
||||
mock_telemetry_service.start.assert_called_once()
|
||||
assert not mock_telemetry_service.stop.called
|
||||
|
||||
# After lifespan, service should be stopped
|
||||
mock_telemetry_service.stop.assert_called_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_lifespan_start_error(self, mock_app):
|
||||
"""Test that start errors don't prevent server from starting."""
|
||||
with patch(
|
||||
'server.telemetry.lifecycle.telemetry_service'
|
||||
) as mock_telemetry_service:
|
||||
mock_telemetry_service.start = AsyncMock(
|
||||
side_effect=Exception('Start failed')
|
||||
)
|
||||
mock_telemetry_service.stop = AsyncMock()
|
||||
|
||||
# Should not raise exception
|
||||
async with telemetry_lifespan(mock_app):
|
||||
mock_telemetry_service.start.assert_called_once()
|
||||
|
||||
# Stop should still be called
|
||||
mock_telemetry_service.stop.assert_called_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_lifespan_stop_error(self, mock_app):
|
||||
"""Test that stop errors don't prevent server shutdown."""
|
||||
with patch(
|
||||
'server.telemetry.lifecycle.telemetry_service'
|
||||
) as mock_telemetry_service:
|
||||
mock_telemetry_service.start = AsyncMock()
|
||||
mock_telemetry_service.stop = AsyncMock(side_effect=Exception('Stop failed'))
|
||||
|
||||
# Should not raise exception
|
||||
async with telemetry_lifespan(mock_app):
|
||||
pass
|
||||
|
||||
mock_telemetry_service.start.assert_called_once()
|
||||
mock_telemetry_service.stop.assert_called_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_lifespan_server_run_phase(self, mock_app):
|
||||
"""Test that server runs between start and stop."""
|
||||
with patch(
|
||||
'server.telemetry.lifecycle.telemetry_service'
|
||||
) as mock_telemetry_service:
|
||||
mock_telemetry_service.start = AsyncMock()
|
||||
mock_telemetry_service.stop = AsyncMock()
|
||||
|
||||
server_ran = False
|
||||
|
||||
async with telemetry_lifespan(mock_app):
|
||||
# Verify start was called before yield
|
||||
mock_telemetry_service.start.assert_called_once()
|
||||
# Verify stop has not been called yet
|
||||
assert not mock_telemetry_service.stop.called
|
||||
server_ran = True
|
||||
|
||||
# Verify the server phase executed
|
||||
assert server_ran
|
||||
# Verify stop was called after yield
|
||||
mock_telemetry_service.stop.assert_called_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_lifespan_logging(self, mock_app):
|
||||
"""Test that lifecycle events are logged."""
|
||||
with patch(
|
||||
'server.telemetry.lifecycle.telemetry_service'
|
||||
) as mock_telemetry_service:
|
||||
mock_telemetry_service.start = AsyncMock()
|
||||
mock_telemetry_service.stop = AsyncMock()
|
||||
|
||||
with patch('server.telemetry.lifecycle.logger') as mock_logger:
|
||||
async with telemetry_lifespan(mock_app):
|
||||
pass
|
||||
|
||||
# Check that lifecycle events were logged
|
||||
assert mock_logger.info.call_count >= 2 # At least start and stop logs
|
||||
enterprise/tests/unit/telemetry/test_service.py (new file, 543 lines)
@@ -0,0 +1,543 @@
|
||||
"""Unit tests for the TelemetryService."""
|
||||
|
||||
import asyncio
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from server.telemetry.service import TelemetryService
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def telemetry_service():
|
||||
"""Create a fresh TelemetryService instance for testing."""
|
||||
# Reset the singleton for testing
|
||||
TelemetryService._instance = None
|
||||
TelemetryService._initialized = False
|
||||
service = TelemetryService()
|
||||
return service
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_session():
|
||||
"""Mock database session."""
|
||||
session = MagicMock()
|
||||
session.__enter__ = MagicMock(return_value=session)
|
||||
session.__exit__ = MagicMock(return_value=None)
|
||||
return session
|
||||
|
||||
|
||||
class TestTelemetryServiceInitialization:
|
||||
"""Test TelemetryService initialization and singleton pattern."""
|
||||
|
||||
def test_singleton_pattern(self, telemetry_service):
|
||||
"""Test that TelemetryService is a singleton."""
|
||||
service1 = TelemetryService()
|
||||
service2 = TelemetryService()
|
||||
assert service1 is service2
|
||||
|
||||
def test_initialization_once(self, telemetry_service):
|
||||
"""Test that __init__ only runs once."""
|
||||
initial_collection_interval = telemetry_service.collection_interval_days
|
||||
telemetry_service.collection_interval_days = 999
|
||||
|
||||
# Create another "instance" (should be same singleton)
|
||||
service2 = TelemetryService()
|
||||
assert service2.collection_interval_days == 999
|
||||
|
||||
def test_default_configuration(self, telemetry_service):
|
||||
"""Test default configuration values."""
|
||||
assert telemetry_service.collection_interval_days == 7
|
||||
assert telemetry_service.upload_interval_hours == 24
|
||||
assert telemetry_service.license_warning_threshold_days == 4
|
||||
assert telemetry_service.bootstrap_check_interval_seconds == 180
|
||||
assert telemetry_service.normal_check_interval_seconds == 3600
|
||||
|
||||
def test_environment_variable_configuration(self):
|
||||
"""Test configuration from environment variables."""
|
||||
# Reset singleton
|
||||
TelemetryService._instance = None
|
||||
TelemetryService._initialized = False
|
||||
|
||||
with patch.dict(
|
||||
'os.environ',
|
||||
{
|
||||
'TELEMETRY_COLLECTION_INTERVAL_DAYS': '14',
|
||||
'TELEMETRY_UPLOAD_INTERVAL_HOURS': '48',
|
||||
'TELEMETRY_WARNING_THRESHOLD_DAYS': '7',
|
||||
},
|
||||
):
|
||||
service = TelemetryService()
|
||||
assert service.collection_interval_days == 14
|
||||
assert service.upload_interval_hours == 48
|
||||
assert service.license_warning_threshold_days == 7
|
||||
|
||||
|
||||
class TestIdentityEstablishment:
    """Test identity establishment detection."""

    @patch('server.telemetry.service.session_maker')
    def test_identity_not_established_no_record(
        self, mock_session_maker, telemetry_service
    ):
        """Test identity not established when no record exists."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_session.query.return_value.filter.return_value.first.return_value = None

        assert not telemetry_service._is_identity_established()

    @patch('server.telemetry.service.session_maker')
    def test_identity_not_established_partial_customer(
        self, mock_session_maker, telemetry_service
    ):
        """Test identity not established when only customer_id exists."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_identity = MagicMock()
        mock_identity.customer_id = 'customer@example.com'
        mock_identity.instance_id = None

        mock_session.query.return_value.filter.return_value.first.return_value = (
            mock_identity
        )

        assert not telemetry_service._is_identity_established()

    @patch('server.telemetry.service.session_maker')
    def test_identity_not_established_partial_instance(
        self, mock_session_maker, telemetry_service
    ):
        """Test identity not established when only instance_id exists."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_identity = MagicMock()
        mock_identity.customer_id = None
        mock_identity.instance_id = 'instance-123'

        mock_session.query.return_value.filter.return_value.first.return_value = (
            mock_identity
        )

        assert not telemetry_service._is_identity_established()

    @patch('server.telemetry.service.session_maker')
    def test_identity_established_complete(
        self, mock_session_maker, telemetry_service
    ):
        """Test identity established when both customer_id and instance_id exist."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_identity = MagicMock()
        mock_identity.customer_id = 'customer@example.com'
        mock_identity.instance_id = 'instance-123'

        mock_session.query.return_value.filter.return_value.first.return_value = (
            mock_identity
        )

        assert telemetry_service._is_identity_established()

    @patch('server.telemetry.service.session_maker')
    def test_identity_established_error_handling(
        self, mock_session_maker, telemetry_service
    ):
        """Test identity check returns False on error."""
        mock_session_maker.side_effect = Exception('Database error')

        assert not telemetry_service._is_identity_established()


class TestCollectionLogic:
    """Test collection timing logic."""

    @patch('server.telemetry.service.session_maker')
    def test_should_collect_no_metrics(self, mock_session_maker, telemetry_service):
        """Test should collect when no metrics exist."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_session.query.return_value.order_by.return_value.first.return_value = None

        assert telemetry_service._should_collect()

    @patch('server.telemetry.service.session_maker')
    def test_should_collect_old_metrics(self, mock_session_maker, telemetry_service):
        """Test should collect when 7+ days have passed."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.collected_at = datetime.now(timezone.utc) - timedelta(days=8)

        mock_session.query.return_value.order_by.return_value.first.return_value = (
            mock_metric
        )

        assert telemetry_service._should_collect()

    @patch('server.telemetry.service.session_maker')
    def test_should_not_collect_recent_metrics(
        self, mock_session_maker, telemetry_service
    ):
        """Test should not collect when less than 7 days have passed."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.collected_at = datetime.now(timezone.utc) - timedelta(days=3)

        mock_session.query.return_value.order_by.return_value.first.return_value = (
            mock_metric
        )

        assert not telemetry_service._should_collect()


class TestUploadLogic:
    """Test upload timing logic."""

    @patch('server.telemetry.service.session_maker')
    def test_should_upload_no_uploads_with_pending(
        self, mock_session_maker, telemetry_service
    ):
        """Test should upload when no uploads exist but pending metrics do."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        # First query for last uploaded returns None
        mock_query1 = MagicMock()
        mock_query1.filter.return_value.order_by.return_value.first.return_value = None

        # Second query for pending count returns 5
        mock_query2 = MagicMock()
        mock_query2.filter.return_value.count.return_value = 5

        mock_session.query.side_effect = [mock_query1, mock_query2]

        assert telemetry_service._should_upload()

    @patch('server.telemetry.service.session_maker')
    def test_should_not_upload_no_pending(
        self, mock_session_maker, telemetry_service
    ):
        """Test should not upload when no pending metrics exist."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        # First query for last uploaded returns None
        mock_query1 = MagicMock()
        mock_query1.filter.return_value.order_by.return_value.first.return_value = None

        # Second query for pending count returns 0
        mock_query2 = MagicMock()
        mock_query2.filter.return_value.count.return_value = 0

        mock_session.query.side_effect = [mock_query1, mock_query2]

        assert not telemetry_service._should_upload()

    @patch('server.telemetry.service.session_maker')
    def test_should_upload_old_upload(self, mock_session_maker, telemetry_service):
        """Test should upload when 24+ hours have passed."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(hours=25)

        (
            mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value
        ) = mock_metric

        assert telemetry_service._should_upload()

    @patch('server.telemetry.service.session_maker')
    def test_should_not_upload_recent_upload(
        self, mock_session_maker, telemetry_service
    ):
        """Test should not upload when less than 24 hours have passed."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(hours=12)

        (
            mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value
        ) = mock_metric

        assert not telemetry_service._should_upload()


class TestIntervalSelection:
    """Test two-phase interval selection logic."""

    @patch.object(TelemetryService, '_is_identity_established')
    def test_bootstrap_interval_when_not_established(
        self, mock_is_established, telemetry_service
    ):
        """Test that bootstrap interval is used when identity not established."""
        mock_is_established.return_value = False

        # The logic is in the loops, so we check the constant values
        assert telemetry_service.bootstrap_check_interval_seconds == 180
        assert telemetry_service.normal_check_interval_seconds == 3600

    @patch.object(TelemetryService, '_is_identity_established')
    def test_normal_interval_when_established(
        self, mock_is_established, telemetry_service
    ):
        """Test that normal interval is used when identity is established."""
        mock_is_established.return_value = True

        assert telemetry_service.normal_check_interval_seconds == 3600


class TestGetAdminEmail:
    """Test admin email determination logic."""

    @patch('server.telemetry.service.os.getenv')
    def test_admin_email_from_env(self, mock_getenv, telemetry_service, mock_session):
        """Test getting admin email from environment variable."""
        mock_getenv.return_value = 'admin@example.com'

        email = telemetry_service._get_admin_email(mock_session)

        assert email == 'admin@example.com'
        mock_getenv.assert_called_once_with('OPENHANDS_ADMIN_EMAIL')

    @patch('server.telemetry.service.os.getenv')
    def test_admin_email_from_first_user(
        self, mock_getenv, telemetry_service, mock_session
    ):
        """Test getting admin email from first user who accepted ToS."""
        mock_getenv.return_value = None

        mock_user = MagicMock()
        mock_user.email = 'first@example.com'

        mock_session.query.return_value.filter.return_value.filter.return_value.order_by.return_value.first.return_value = mock_user

        email = telemetry_service._get_admin_email(mock_session)

        assert email == 'first@example.com'

    @patch('server.telemetry.service.os.getenv')
    def test_admin_email_not_found(self, mock_getenv, telemetry_service, mock_session):
        """Test when no admin email is available."""
        mock_getenv.return_value = None

        mock_session.query.return_value.filter.return_value.filter.return_value.order_by.return_value.first.return_value = (
            None
        )

        email = telemetry_service._get_admin_email(mock_session)

        assert email is None


class TestGetOrCreateIdentity:
    """Test identity creation logic."""

    def test_create_new_identity(self, telemetry_service, mock_session):
        """Test creating a new identity record."""
        mock_session.query.return_value.filter.return_value.first.return_value = None

        with patch('server.telemetry.service.TelemetryIdentity') as mock_identity_class:
            mock_identity = MagicMock()
            mock_identity.customer_id = None
            mock_identity.instance_id = None
            mock_identity_class.return_value = mock_identity

            with patch('server.telemetry.service.ReplicatedClient') as mock_client:
                mock_customer = MagicMock()
                mock_customer.customer_id = 'cust-123'
                mock_instance = MagicMock()
                mock_instance.instance_id = 'inst-456'
                mock_customer.get_or_create_instance.return_value = mock_instance

                mock_client.return_value.customer.get_or_create.return_value = (
                    mock_customer
                )

                telemetry_service._get_or_create_identity(
                    mock_session, 'test@example.com'
                )

                mock_session.add.assert_called_once()
                mock_session.commit.assert_called()

    def test_update_existing_identity(self, telemetry_service, mock_session):
        """Test updating an existing identity record."""
        mock_identity = MagicMock()
        mock_identity.customer_id = 'existing@example.com'
        mock_identity.instance_id = 'existing-instance'

        mock_session.query.return_value.filter.return_value.first.return_value = (
            mock_identity
        )

        telemetry_service._get_or_create_identity(
            mock_session, 'test@example.com'
        )

        # Should not create a new identity record since both IDs already exist
        assert mock_session.add.call_count == 0
        mock_session.commit.assert_called()


class TestLicenseWarningStatus:
    """Test license warning status logic."""

    @patch('server.telemetry.service.session_maker')
    def test_no_uploads_yet(self, mock_session_maker, telemetry_service):
        """Test license warning status when no uploads have occurred."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value = (
            None
        )

        status = telemetry_service.get_license_warning_status()

        assert status['should_warn'] is False
        assert status['days_since_upload'] is None
        assert 'No uploads yet' in status['message']

    @patch('server.telemetry.service.session_maker')
    def test_recent_upload_no_warning(self, mock_session_maker, telemetry_service):
        """Test no warning when upload is recent."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(days=2)

        mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value = (
            mock_metric
        )

        status = telemetry_service.get_license_warning_status()

        assert status['should_warn'] is False
        assert status['days_since_upload'] == 2

    @patch('server.telemetry.service.session_maker')
    def test_old_upload_warning(self, mock_session_maker, telemetry_service):
        """Test warning when upload is old."""
        mock_session = MagicMock()
        mock_session.__enter__ = MagicMock(return_value=mock_session)
        mock_session.__exit__ = MagicMock(return_value=None)
        mock_session_maker.return_value = mock_session

        mock_metric = MagicMock()
        mock_metric.uploaded_at = datetime.now(timezone.utc) - timedelta(days=5)

        mock_session.query.return_value.filter.return_value.order_by.return_value.first.return_value = (
            mock_metric
        )

        status = telemetry_service.get_license_warning_status()

        assert status['should_warn'] is True
        assert status['days_since_upload'] == 5


class TestLifecycleManagement:
    """Test service lifecycle management."""

    @pytest.mark.asyncio
    async def test_start_service(self, telemetry_service):
        """Test starting the telemetry service."""
        with patch.object(
            telemetry_service, '_collection_loop', new_callable=AsyncMock
        ):
            with patch.object(
                telemetry_service, '_upload_loop', new_callable=AsyncMock
            ):
                with patch.object(
                    telemetry_service, '_initial_collection_check', new_callable=AsyncMock
                ):
                    await telemetry_service.start()

                    assert telemetry_service._collection_task is not None
                    assert telemetry_service._upload_task is not None

                    # Clean up
                    await telemetry_service.stop()

    @pytest.mark.asyncio
    async def test_start_service_already_started(self, telemetry_service):
        """Test starting an already started service."""
        with patch.object(
            telemetry_service, '_collection_loop', new_callable=AsyncMock
        ):
            with patch.object(
                telemetry_service, '_upload_loop', new_callable=AsyncMock
            ):
                with patch.object(
                    telemetry_service, '_initial_collection_check', new_callable=AsyncMock
                ):
                    await telemetry_service.start()
                    first_collection_task = telemetry_service._collection_task
                    first_upload_task = telemetry_service._upload_task

                    # Try to start again
                    await telemetry_service.start()

                    # Tasks should be the same
                    assert telemetry_service._collection_task is first_collection_task
                    assert telemetry_service._upload_task is first_upload_task

                    # Clean up
                    await telemetry_service.stop()

    @pytest.mark.asyncio
    async def test_stop_service(self, telemetry_service):
        """Test stopping the telemetry service."""
        with patch.object(
            telemetry_service, '_collection_loop', new_callable=AsyncMock
        ):
            with patch.object(
                telemetry_service, '_upload_loop', new_callable=AsyncMock
            ):
                with patch.object(
                    telemetry_service, '_initial_collection_check', new_callable=AsyncMock
                ):
                    await telemetry_service.start()
                    await telemetry_service.stop()

                    assert telemetry_service._shutdown_event.is_set()

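The tests above rely on `telemetry_service` and `mock_session` pytest fixtures that are defined earlier in `test_service.py` and do not appear in this hunk. A minimal sketch of what those fixtures are assumed to provide, inferred only from how the tests use them (the committed fixtures may differ):

# Sketch only: assumed fixture shapes, not the committed definitions.
import pytest
from unittest.mock import MagicMock

from server.telemetry.service import TelemetryService


@pytest.fixture
def telemetry_service():
    # Reset the singleton so every test starts from a fresh service instance.
    TelemetryService._instance = None
    TelemetryService._initialized = False
    yield TelemetryService()
    TelemetryService._instance = None
    TelemetryService._initialized = False


@pytest.fixture
def mock_session():
    # Stand-in for a SQLAlchemy session that also works as a context manager.
    session = MagicMock()
    session.__enter__ = MagicMock(return_value=session)
    session.__exit__ = MagicMock(return_value=None)
    return session
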
@@ -62,6 +62,15 @@ app_lifespan_ = get_app_lifespan_service()
if app_lifespan_:
    lifespans.append(app_lifespan_.lifespan)

# Add telemetry lifespan for enterprise mode
try:
    from enterprise.server.telemetry.lifecycle import telemetry_lifespan

    lifespans.append(telemetry_lifespan)
except ImportError:
    # Not running in enterprise mode, skip telemetry
    pass


app = FastAPI(
    title='OpenHands',
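
The `telemetry_lifespan` registered above comes from `enterprise/server/telemetry/lifecycle.py`, whose implementation is not shown in this diff. Based on the `start()`/`stop()` coroutines exercised by the lifecycle tests, a plausible minimal shape is the sketch below; the `TelemetryService` import path and the use of FastAPI's standard `asynccontextmanager` lifespan pattern are assumptions, not the committed code.

# Sketch of a possible lifecycle.py, not the committed implementation.
from contextlib import asynccontextmanager

from fastapi import FastAPI

from server.telemetry.service import TelemetryService


@asynccontextmanager
async def telemetry_lifespan(app: FastAPI):
    # Start the background collection/upload loops when the app starts up.
    service = TelemetryService()
    await service.start()
    try:
        yield
    finally:
        # Stop the loops cleanly on application shutdown.
        await service.stop()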