# HuggingFace Space Deployment Instructions

## 1. Create Space on HuggingFace

1. Go to https://huggingface.co/new-space
2. Fill in details:
   - **Owner**: `appsmithery` (or your organization)
   - **Space name**: `code-chef-modelops-trainer`
   - **License**: `apache-2.0`
   - **SDK**: `Gradio`
   - **Hardware**: `t4-small` (upgrade to `a10g-large` for 3-7B models)
   - **Visibility**: `Private` (recommended) or `Public`

## 2. Configure Secrets

In Space Settings > Variables and secrets:

1. Add secret: `HF_TOKEN`
   - Value: Your HuggingFace write access token from https://huggingface.co/settings/tokens
   - Required permissions: `write` (for pushing trained models)

## 3. Upload Files

Upload these files to the Space repository:

```
code-chef-modelops-trainer/
├── app.py              # Main application
├── requirements.txt    # Python dependencies
└── README.md           # Space documentation
```

**Option A: Via Web UI**

- Drag and drop the files into the Space's Files tab

**Option B: Via Git**

```bash
# Clone the Space repo
git clone https://huggingface.co/spaces/appsmithery/code-chef-modelops-trainer
cd code-chef-modelops-trainer

# Copy files (adjust the source path so it points at your code-chef checkout)
cp deploy/huggingface-spaces/modelops-trainer/* .

# Commit and push
git add .
git commit -m "Initial ModelOps trainer deployment"
git push
```

## 4. Verify Deployment

1. Wait for the Space to build (2-3 minutes)
2. Check the logs for errors
3. Test the health endpoint:

```bash
curl https://appsmithery-code-chef-modelops-trainer.hf.space/health
```

Expected response:

```json
{
  "status": "healthy",
  "service": "code-chef-modelops-trainer",
  "autotrain_available": true,
  "hf_token_configured": true
}
```

## 5. Update code-chef Configuration

Add the Space URL to `config/env/.env`:

```bash
# ModelOps - HuggingFace Space
MODELOPS_SPACE_URL=https://appsmithery-code-chef-modelops-trainer.hf.space
MODELOPS_SPACE_TOKEN=your_hf_token_here
```

## 6. Test from code-chef

Use the client example:

```python
import os

from deploy.huggingface_spaces.modelops_trainer.client_example import ModelOpsTrainerClient

client = ModelOpsTrainerClient(
    space_url=os.environ["MODELOPS_SPACE_URL"],
    hf_token=os.environ["MODELOPS_SPACE_TOKEN"]
)

# Health check
health = client.health_check()
print(health)

# Submit demo job
result = client.submit_training_job(
    agent_name="feature_dev",
    base_model="Qwen/Qwen2.5-Coder-7B",
    dataset_csv_path="/tmp/demo.csv",
    demo_mode=True
)
print(f"Job ID: {result['job_id']}")
```

## 7. Hardware Upgrades

For larger models (3-7B), upgrade hardware:

1. Go to Space Settings
2. Change Hardware to `a10g-large`
3. Note: Cost increases from ~$0.75/hr to ~$2.20/hr

## 8. Monitoring

- **Logs**: Check Space logs for errors
- **TensorBoard**: Each job provides a TensorBoard URL
- **LangSmith**: Client example includes `@traceable` for observability

## 9. Production Considerations

- **Persistence**: Jobs are stored in `/tmp` and are lost on restart. Use persistent storage or an external DB for production
- **Queuing**: The current version runs jobs sequentially. Add a job queue (Celery/Redis) for concurrent training
- **Authentication**: Add API key auth for production use
- **Rate Limiting**: Add rate limits to prevent abuse
- **Monitoring**: Set up alerts for failed jobs

## 10. Cost Optimization

- **Auto-sleep**: Set the Space to sleep after a period of inactivity
- **Demo mode**: Always test with demo mode first (~$0.50 vs ~$15 for a full run)
- **Batch jobs**: Train multiple agents in sequence to maximize GPU utilization
- **Local development**: Test locally before deploying to the Space

## Troubleshooting

**Space won't build**:

- Check `requirements.txt` versions
- Verify Python version compatibility (3.9+ recommended)
- Check Space logs for build errors

**Training fails**:

- Verify `HF_TOKEN` has write permissions
- Check dataset format (must have `text` and `response` columns)
- Ensure the model repo exists on HuggingFace Hub

**Out of memory**:

- Enable demo mode to test with a smaller dataset
- Use quantization: `int4` or `int8`
- Upgrade to a larger GPU (`a10g-large`)
- Reduce `max_seq_length` in config

**Connection timeout**:

- The Space may be sleeping; the first request wakes it (about 30s delay)
- Increase the client timeout to 60s for the first request
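
The connection-timeout advice above can be sketched as a small polling helper that gives the first request a longer timeout and then retries. This is a minimal illustration using only the Python standard library, not part of the shipped client: the names `fetch_health` and `wait_until_healthy`, the retry count, and the pause between attempts are all assumptions made for the example. It relies only on the `/health` endpoint and the `"status": "healthy"` field shown in the verification step.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout):
    """GET <base_url>/health and return the parsed JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(base_url, first_timeout=60.0, later_timeout=10.0,
                       attempts=3, pause=5.0,
                       fetch=fetch_health, sleep=time.sleep):
    """Poll the health endpoint until it reports "healthy".

    The first attempt uses a longer timeout because a sleeping Space
    needs time to wake up. `fetch` and `sleep` are injectable (a
    hypothetical design choice) so the logic can be tested offline.
    """
    last_error = None
    for attempt in range(attempts):
        timeout = first_timeout if attempt == 0 else later_timeout
        try:
            health = fetch(base_url, timeout)
            if health.get("status") == "healthy":
                return health
            last_error = RuntimeError(f"unexpected health payload: {health}")
        except (OSError, ValueError) as exc:
            # OSError covers URLError and timeouts; ValueError covers bad JSON.
            last_error = exc
        if attempt < attempts - 1:
            sleep(pause)
    raise RuntimeError(
        f"Space not healthy after {attempts} attempts"
    ) from last_error
```

Calling `wait_until_healthy(os.environ["MODELOPS_SPACE_URL"])` before submitting a job would absorb the wake-up delay. If the Space is private, the request will likely also need an authorization header, which this sketch leaves out.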