Multi-Cloud Setup and Usage Guide
Overview
AIOSx supports multi-cloud orchestration across AWS, Azure, GCP, CoreWeave, and private datacenters. The UAICP cloud layer provides a unified abstraction for managing workloads across these providers.
Configuration
1. Cloud Provider Credentials
Edit config/cloud_providers.yaml and set credentials for each provider:
```yaml
cloud_providers:
  aws:
    enabled: true
    credentials:
      access_key_id: "${AWS_ACCESS_KEY_ID}"
      secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
      region: "us-east-1"
  azure:
    enabled: true
    credentials:
      subscription_id: "${AZURE_SUBSCRIPTION_ID}"
      tenant_id: "${AZURE_TENANT_ID}"
      client_id: "${AZURE_CLIENT_ID}"
      client_secret: "${AZURE_CLIENT_SECRET}"
  gcp:
    enabled: true
    credentials:
      project_id: "${GCP_PROJECT_ID}"
      service_account_key: "${GCP_SERVICE_ACCOUNT_KEY}"
  coreweave:
    enabled: true
    credentials:
      api_key: "${COREWEAVE_API_KEY}"
      api_url: "${COREWEAVE_API_URL}"
  private:
    enabled: true
    credentials:
      endpoint: "${PRIVATE_DC_ENDPOINT}"
      api_key: "${PRIVATE_DC_API_KEY}"
```
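The `${...}` values in this file are placeholders for environment variables. This guide does not say how AIOSx resolves them, so the following is only a minimal sketch of one way to do it, assuming the YAML has already been parsed into nested dicts and lists. The function name `expand_placeholders` is hypothetical and not part of AIOSx.

```python
import os
import re

# Matches ${NAME} placeholders such as "${AWS_ACCESS_KEY_ID}".
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(node, environ=None):
    """Recursively replace ${VAR} in string values.

    Hypothetical helper, not an AIOSx API.
    Returns (expanded_node, sorted list of variable names that were unset).
    """
    environ = os.environ if environ is None else environ
    missing = set()

    def substitute(match):
        name = match.group(1)
        if name not in environ:
            missing.add(name)
            return match.group(0)  # leave the placeholder untouched
        return environ[name]

    def walk(value):
        if isinstance(value, dict):
            return {key: walk(item) for key, item in value.items()}
        if isinstance(value, list):
            return [walk(item) for item in value]
        if isinstance(value, str):
            return _PLACEHOLDER.sub(substitute, value)
        return value  # booleans, numbers, None pass through unchanged

    return walk(node), sorted(missing)
```

Returning the unset names alongside the expanded structure makes it possible to fail fast with a clear message instead of sending a literal `${...}` string to a provider.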
2. Environment Variables
Set the following environment variables:
```bash
export AWS_ACCESS_KEY_ID="your-aws-key"
export AWS_SECRET_ACCESS_KEY="your-aws-secret"
export AZURE_SUBSCRIPTION_ID="your-azure-subscription"
export AZURE_TENANT_ID="your-azure-tenant"
export AZURE_CLIENT_ID="your-azure-client-id"
export AZURE_CLIENT_SECRET="your-azure-client-secret"
export GCP_PROJECT_ID="your-gcp-project"
export GCP_SERVICE_ACCOUNT_KEY="path-to-service-account-key.json"
export COREWEAVE_API_KEY="your-coreweave-key"
export COREWEAVE_API_URL="https://api.coreweave.com"
export PRIVATE_DC_ENDPOINT="https://your-datacenter.com/api"
export PRIVATE_DC_API_KEY="your-datacenter-key"
```
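Before starting AIOSx it can help to confirm that none of these variables is unset or empty. The sketch below groups the variable names from the list above by provider and reports the gaps. The helper `find_missing_env_vars` and the `REQUIRED_ENV_VARS` table are illustrative names, not part of AIOSx.

```python
import os

# Variable names copied from the export list in this guide, grouped by provider.
REQUIRED_ENV_VARS = {
    "aws": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
    "azure": [
        "AZURE_SUBSCRIPTION_ID",
        "AZURE_TENANT_ID",
        "AZURE_CLIENT_ID",
        "AZURE_CLIENT_SECRET",
    ],
    "gcp": ["GCP_PROJECT_ID", "GCP_SERVICE_ACCOUNT_KEY"],
    "coreweave": ["COREWEAVE_API_KEY", "COREWEAVE_API_URL"],
    "private": ["PRIVATE_DC_ENDPOINT", "PRIVATE_DC_API_KEY"],
}


def find_missing_env_vars(environ=None):
    """Return {provider: [names]} for variables that are unset or empty.

    Hypothetical helper, not an AIOSx API.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for provider, names in REQUIRED_ENV_VARS.items():
        missing = [name for name in names if not environ.get(name)]
        if missing:
            report[provider] = missing
    return report
```

Calling `find_missing_env_vars()` with no arguments inspects the current process environment. An empty dict means every listed variable has a value.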
Usage
Listing Available Clouds
```text
# Via API
GET /uaicp/clouds

# Response
{
  "providers": ["aws", "azure", "gcp", "coreweave", "private"],
  "regions": [...],
  "total_providers": 5,
  "total_regions": 15
}
```
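The endpoint can be called from Python with the standard library alone. This is a rough sketch rather than an official client: the base URL (for example `http://localhost:8000`) is an assumption, since this guide does not state where the API is served, and the function names are hypothetical. Any authentication the deployment requires is not covered here.

```python
import json
import urllib.request


def build_list_clouds_request(base_url):
    """Build a GET request for the /uaicp/clouds endpoint.

    base_url is an assumption; point it at your AIOSx API host.
    """
    url = base_url.rstrip("/") + "/uaicp/clouds"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def list_clouds(base_url, timeout=10.0):
    """Send the request and return the decoded JSON body as a dict."""
    request = build_list_clouds_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Given the response shape shown above, `list_clouds(base_url)["providers"]` would hold the list of provider names.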
Routing Workloads
```text
# Route workload to optimal cloud
POST /uaicp/workloads/route
{
  "task_id": "task_123",
  "task_type": "llm_inference",
  "domain": "llm",
  "requirements": {
    "cloud_preference": "aws",  # Optional
    "preferred_region": "us-east-1",
    "cost_constraint": 5.0,
    "latency_slo_ms": 500
  }
}
```
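A routing request like the one above could be assembled in Python as follows. This is a sketch under assumptions: the base URL is not given in this guide, the helper names are hypothetical, and only `cloud_preference` is treated as optional because it is the only field marked that way in the example.

```python
import json
import urllib.request


def build_route_payload(task_id, task_type, domain, preferred_region,
                        cost_constraint, latency_slo_ms, cloud_preference=None):
    """Assemble the JSON body; cloud_preference is left out when not given."""
    requirements = {
        "preferred_region": preferred_region,
        "cost_constraint": cost_constraint,
        "latency_slo_ms": latency_slo_ms,
    }
    if cloud_preference is not None:
        requirements["cloud_preference"] = cloud_preference
    return {
        "task_id": task_id,
        "task_type": task_type,
        "domain": domain,
        "requirements": requirements,
    }


def build_route_request(base_url, payload):
    """Build a POST request for /uaicp/workloads/route (base_url is assumed)."""
    url = base_url.rstrip("/") + "/uaicp/workloads/route"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing the resulting request to `urllib.request.urlopen` sends it. The response format for this endpoint is not documented in this guide, so it is not parsed here.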
Workload Router
The workload router automatically selects the best cloud based on:
- Cost: Selects the lowest-cost cloud that stays within the cost constraint
- Latency: Prefers clouds with lower latency
- Availability: Considers only resources that are currently available
- Region: Honors the preferred region if one is specified
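To make the four criteria concrete, here is one way they might combine. This is an illustrative sketch, not the actual AIOSx router: the `Candidate` type, the `select_candidate` function, and the order in which the criteria are applied (filter on availability, cost, and latency, then prefer the requested region, then take the cheapest) are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one placement option."""
    provider: str
    region: str
    cost_per_hour: float
    latency_ms: float
    available: bool


def select_candidate(candidates: Sequence[Candidate], max_cost: float,
                     latency_slo_ms: float,
                     preferred_region: Optional[str] = None) -> Optional[Candidate]:
    """Pick a candidate, or return None if nothing satisfies the constraints."""
    # Availability, cost, and latency act as hard filters in this sketch.
    eligible = [
        c for c in candidates
        if c.available
        and c.cost_per_hour <= max_cost
        and c.latency_ms <= latency_slo_ms
    ]
    if not eligible:
        return None
    # Honor the preferred region when at least one eligible candidate is in it.
    in_region = [c for c in eligible if c.region == preferred_region]
    pool = in_region or eligible
    # Lowest cost wins; lower latency breaks ties.
    return min(pool, key=lambda c: (c.cost_per_hour, c.latency_ms))
```

Treating the region as a preference rather than a hard filter means a workload still gets placed when the preferred region has no eligible capacity. A real router may weigh these factors differently.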
Manual Cloud Selection
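```python
import asyncio

from aiosx.uaicp.cloud.cloud_registry import CloudRegistry
from aiosx.uaicp.cloud.aws_provider import AWSProvider


async def main():
    # Register the provider with a fresh registry
    cloud_registry = CloudRegistry()
    aws_provider = AWSProvider()
    cloud_registry.register_provider(aws_provider)

    # List GPU resources offered by AWS in us-east-1
    resources = await cloud_registry.list_all_resources(
        provider="aws",
        region="us-east-1",
        resource_type="gpu",
    )

    # Find the best GPU resource within the cost and region requirements
    best_resource = await cloud_registry.find_best_resource(
        resource_type="gpu",
        requirements={
            "max_cost_per_hour": 5.0,
            "preferred_region": "us-east-1",
        },
    )
    return resources, best_resource


# The registry calls are awaited, so they must run inside an event loop
resources, best_resource = asyncio.run(main())
```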
Cloud-Specific Features
AWS
- EC2 instances
- EKS clusters
- Lambda functions
- SageMaker endpoints
Azure
- AKS clusters
- Azure Functions
- Azure ML
GCP
- GKE clusters
- Cloud Functions
- Vertex AI
CoreWeave
- GPU-focused infrastructure
- Competitive GPU pricing
- High availability
Private Datacenters
- On-premises infrastructure
- Low latency
- Cost-effective for high-volume workloads
Best Practices
- Multi-Region: Deploy across multiple regions for high availability
- Cost Optimization: Use the workload router to automatically select cost-effective clouds
- Latency: Route latency-sensitive workloads to the nearest region
- Redundancy: Use the distributed execution strategy for critical workloads
- Monitoring: Monitor costs and performance across all clouds
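As an illustration of the redundancy idea, the sketch below submits the same task to several providers at once and keeps the first successful result. It is one possible reading of redundant execution, not a description of how the AIOSx distributed execution strategy works. The function `run_with_redundancy` and the shape of `runners` (provider name mapped to an async callable) are hypothetical.

```python
import asyncio


async def run_with_redundancy(task, runners):
    """Run task on every runner concurrently; return (name, result) of the
    first one that succeeds. Raises RuntimeError if every provider fails.

    runners: dict mapping a provider name to an async callable taking the task.
    """
    pending = {
        asyncio.create_task(runner(task)): name
        for name, runner in runners.items()
    }
    errors = {}
    try:
        while pending:
            done, _ = await asyncio.wait(
                set(pending), return_when=asyncio.FIRST_COMPLETED
            )
            for finished in done:
                name = pending.pop(finished)
                error = finished.exception()
                if error is None:
                    return name, finished.result()
                errors[name] = error  # remember the failure, keep waiting
        raise RuntimeError(f"all providers failed: {errors}")
    finally:
        # Cancel whatever is still running once a result (or failure) is final.
        for leftover in pending:
            leftover.cancel()
```

Running a task everywhere multiplies its cost, so a pattern like this is best reserved for the critical workloads the list above mentions.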
Troubleshooting
Provider Not Available
If a cloud provider is not available:
- Check credentials in config/cloud_providers.yaml
- Verify environment variables are set
- Check provider-specific API status
- Review logs for authentication errors
Resource Allocation Failures
If resource allocation fails:
- Check provider quotas
- Verify region availability
- Review cost constraints
- Check resource availability
High Costs
To reduce costs:
- Use private datacenters for high-volume workloads
- Leverage spot instances where available
- Optimize resource allocation with the ROI engine
- Use cost constraints in workload routing