[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"similar-exo-explore--exo":3,"tool-exo-explore--exo":64},[4,17,27,35,43,56],{"id":5,"name":6,"github_repo":7,"description_zh":8,"stars":9,"difficulty_score":10,"last_commit_at":11,"category_tags":12,"status":16},3808,"stable-diffusion-webui","AUTOMATIC1111\u002Fstable-diffusion-webui","stable-diffusion-webui 是一个基于 Gradio 构建的网页版操作界面，旨在让用户能够轻松地在本地运行和使用强大的 Stable Diffusion 图像生成模型。它解决了原始模型依赖命令行、操作门槛高且功能分散的痛点，将复杂的 AI 绘图流程整合进一个直观易用的图形化平台。\n\n无论是希望快速上手的普通创作者、需要精细控制画面细节的设计师，还是想要深入探索模型潜力的开发者与研究人员，都能从中获益。其核心亮点在于极高的功能丰富度：不仅支持文生图、图生图、局部重绘（Inpainting）和外绘（Outpainting）等基础模式，还独创了注意力机制调整、提示词矩阵、负向提示词以及“高清修复”等高级功能。此外，它内置了 GFPGAN 和 CodeFormer 等人脸修复工具，支持多种神经网络放大算法，并允许用户通过插件系统无限扩展能力。即使是显存有限的设备，stable-diffusion-webui 也提供了相应的优化选项，让高质量的 AI 艺术创作变得触手可及。",162132,3,"2026-04-05T11:01:52",[13,14,15],"开发框架","图像","Agent","ready",{"id":18,"name":19,"github_repo":20,"description_zh":21,"stars":22,"difficulty_score":23,"last_commit_at":24,"category_tags":25,"status":16},1381,"everything-claude-code","affaan-m\u002Feverything-claude-code","everything-claude-code 是一套专为 AI 编程助手（如 Claude Code、Codex、Cursor 等）打造的高性能优化系统。它不仅仅是一组配置文件，而是一个经过长期实战打磨的完整框架，旨在解决 AI 代理在实际开发中面临的效率低下、记忆丢失、安全隐患及缺乏持续学习能力等核心痛点。\n\n通过引入技能模块化、直觉增强、记忆持久化机制以及内置的安全扫描功能，everything-claude-code 能显著提升 AI 在复杂任务中的表现，帮助开发者构建更稳定、更智能的生产级 AI 代理。其独特的“研究优先”开发理念和针对 Token 消耗的优化策略，使得模型响应更快、成本更低，同时有效防御潜在的攻击向量。\n\n这套工具特别适合软件开发者、AI 研究人员以及希望深度定制 AI 工作流的技术团队使用。无论您是在构建大型代码库，还是需要 AI 协助进行安全审计与自动化测试，everything-claude-code 都能提供强大的底层支持。作为一个曾荣获 Anthropic 黑客大奖的开源项目，它融合了多语言支持与丰富的实战钩子（hooks），让 AI 真正成长为懂上",138956,2,"2026-04-05T11:33:21",[13,15,26],"语言模型",{"id":28,"name":29,"github_repo":30,"description_zh":31,"stars":32,"difficulty_score":23,"last_commit_at":33,"category_tags":34,"status":16},2271,"ComfyUI","Comfy-Org\u002FComfyUI","ComfyUI 是一款功能强大且高度模块化的视觉 AI 引擎，专为设计和执行复杂的 Stable Diffusion 图像生成流程而打造。它摒弃了传统的代码编写模式，采用直观的节点式流程图界面，让用户通过连接不同的功能模块即可构建个性化的生成管线。\n\n这一设计巧妙解决了高级 AI 绘图工作流配置复杂、灵活性不足的痛点。用户无需具备编程背景，也能自由组合模型、调整参数并实时预览效果，轻松实现从基础文生图到多步骤高清修复等各类复杂任务。ComfyUI 拥有极佳的兼容性，不仅支持 Windows、macOS 和 Linux 全平台，还广泛适配 NVIDIA、AMD、Intel 及苹果 Silicon 等多种硬件架构，并率先支持 SDXL、Flux、SD3 等前沿模型。\n\n无论是希望深入探索算法潜力的研究人员和开发者，还是追求极致创作自由度的设计师与资深 AI 绘画爱好者，ComfyUI 都能提供强大的支持。其独特的模块化架构允许社区不断扩展新功能，使其成为当前最灵活、生态最丰富的开源扩散模型工具之一，帮助用户将创意高效转化为现实。",107662,"2026-04-03T11:11:01",[13,14,15],{"id":36,"name":37,"github_repo":38,"description_zh":39,"stars":40,"difficulty_score":23,"last_commit_at":41,"category_tags":42,"status":16},3704,"NextChat","ChatGPTNextWeb\u002FNextChat","NextChat 是一款轻量且极速的 AI 助手，旨在为用户提供流畅、跨平台的大模型交互体验。它完美解决了用户在多设备间切换时难以保持对话连续性，以及面对众多 AI 模型不知如何统一管理的痛点。无论是日常办公、学习辅助还是创意激发，NextChat 都能让用户随时随地通过网页、iOS、Android、Windows、MacOS 或 Linux 端无缝接入智能服务。\n\n这款工具非常适合普通用户、学生、职场人士以及需要私有化部署的企业团队使用。对于开发者而言，它也提供了便捷的自托管方案，支持一键部署到 Vercel 或 Zeabur 等平台。\n\nNextChat 的核心亮点在于其广泛的模型兼容性，原生支持 Claude、DeepSeek、GPT-4 及 Gemini Pro 等主流大模型，让用户在一个界面即可自由切换不同 AI 能力。此外，它还率先支持 MCP（Model Context Protocol）协议，增强了上下文处理能力。针对企业用户，NextChat 提供专业版解决方案，具备品牌定制、细粒度权限控制、内部知识库整合及安全审计等功能，满足公司对数据隐私和个性化管理的高标准要求。",87618,"2026-04-05T07:20:52",[13,26],{"id":44,"name":45,"github_repo":46,"description_zh":47,"stars":48,"difficulty_score":23,"last_commit_at":49,"category_tags":50,"status":16},2268,"ML-For-Beginners","microsoft\u002FML-For-Beginners","ML-For-Beginners 是由微软推出的一套系统化机器学习入门课程，旨在帮助零基础用户轻松掌握经典机器学习知识。这套课程将学习路径规划为 12 周，包含 26 节精炼课程和 52 道配套测验，内容涵盖从基础概念到实际应用的完整流程，有效解决了初学者面对庞大知识体系时无从下手、缺乏结构化指导的痛点。\n\n无论是希望转型的开发者、需要补充算法背景的研究人员，还是对人工智能充满好奇的普通爱好者，都能从中受益。课程不仅提供了清晰的理论讲解，还强调动手实践，让用户在循序渐进中建立扎实的技能基础。其独特的亮点在于强大的多语言支持，通过自动化机制提供了包括简体中文在内的 50 
多种语言版本，极大地降低了全球不同背景用户的学习门槛。此外，项目采用开源协作模式，社区活跃且内容持续更新，确保学习者能获取前沿且准确的技术资讯。如果你正寻找一条清晰、友好且专业的机器学习入门之路，ML-For-Beginners 将是理想的起点。",84991,"2026-04-05T10:45:23",[14,51,52,53,15,54,26,13,55],"数据工具","视频","插件","其他","音频",{"id":57,"name":58,"github_repo":59,"description_zh":60,"stars":61,"difficulty_score":10,"last_commit_at":62,"category_tags":63,"status":16},3128,"ragflow","infiniflow\u002Fragflow","RAGFlow 是一款领先的开源检索增强生成（RAG）引擎，旨在为大语言模型构建更精准、可靠的上下文层。它巧妙地将前沿的 RAG 技术与智能体（Agent）能力相结合，不仅支持从各类文档中高效提取知识，还能让模型基于这些知识进行逻辑推理和任务执行。\n\n在大模型应用中，幻觉问题和知识滞后是常见痛点。RAGFlow 通过深度解析复杂文档结构（如表格、图表及混合排版），显著提升了信息检索的准确度，从而有效减少模型“胡编乱造”的现象，确保回答既有据可依又具备时效性。其内置的智能体机制更进一步，使系统不仅能回答问题，还能自主规划步骤解决复杂问题。\n\n这款工具特别适合开发者、企业技术团队以及 AI 研究人员使用。无论是希望快速搭建私有知识库问答系统，还是致力于探索大模型在垂直领域落地的创新者，都能从中受益。RAGFlow 提供了可视化的工作流编排界面和灵活的 API 接口，既降低了非算法背景用户的上手门槛，也满足了专业开发者对系统深度定制的需求。作为基于 Apache 2.0 协议开源的项目，它正成为连接通用大模型与行业专有知识之间的重要桥梁。",77062,"2026-04-04T04:44:48",[15,14,13,26,54],{"id":65,"github_repo":66,"name":67,"description_en":68,"description_zh":69,"ai_summary_zh":70,"readme_en":71,"readme_zh":72,"quickstart_zh":73,"use_case_zh":74,"hero_image_url":75,"owner_login":76,"owner_name":77,"owner_avatar_url":78,"owner_bio":79,"owner_company":80,"owner_location":80,"owner_email":81,"owner_twitter":82,"owner_website":80,"owner_url":83,"languages":84,"stars":124,"forks":125,"last_commit_at":126,"license":127,"difficulty_score":100,"env_os":128,"env_gpu":129,"env_ram":130,"env_deps":131,"category_tags":140,"github_topics":80,"view_count":141,"oss_zip_url":80,"oss_zip_packed_at":80,"status":16,"created_at":142,"updated_at":143,"faqs":144,"releases":184},973,"exo-explore\u002Fexo","exo","Run frontier AI locally.","exo 是一款让你能在本地运行前沿大模型的开源工具，由 exo labs 维护。它的核心思路很简单：把家里或办公室里的多台设备（Mac、PC 等）连成一个人工智能计算集群，让原本单台机器跑不动的大模型也能流畅运行。\n\n传统上，运行 671B 参数的 DeepSeek 或 235B 的 Qwen3 需要昂贵的专业 GPU 服务器。exo 解决了这个门槛问题——通过自动发现网络中的设备、智能分配计算任务，甚至利用 Thunderbolt 5 的 RDMA 技术将设备间延迟降低 99%，实现\"加设备就提速\"的效果。实测显示，两台设备可获得 1.8 倍加速，四台设备可达 3.2 倍。\n\nexo 内置可视化仪表盘管理集群，并兼容 OpenAI、Claude、Ollama 等主流 API 格式，现有工具可直接接入。支持从 HuggingFace 加载自定义模型，后端基于苹果 MLX 框架优化。\n\n这款工具特别适合 AI 研究者、开发者以及对数据隐私有要求的团队——无需将敏感数据上传云端，即可在本地体验顶尖大模型。如果你手头有多台 Mac 或混合设备，想榨干它们的联合算力，exo 是目前最成熟的解决方案之一。","exo 是一款让你能在本地运行前沿大模型的开源工具，由 exo labs 维护。它的核心思路很简单：把家里或办公室里的多台设备（Mac、PC 等）连成一个人工智能计算集群，让原本单台机器跑不动的大模型也能流畅运行。\n\n传统上，运行 671B 参数的 DeepSeek 或 235B 的 Qwen3 需要昂贵的专业 GPU 服务器。exo 解决了这个门槛问题——通过自动发现网络中的设备、智能分配计算任务，甚至利用 Thunderbolt 5 的 RDMA 技术将设备间延迟降低 99%，实现\"加设备就提速\"的效果。实测显示，两台设备可获得 1.8 倍加速，四台设备可达 3.2 倍。\n\nexo 内置可视化仪表盘管理集群，并兼容 OpenAI、Claude、Ollama 等主流 API 格式，现有工具可直接接入。支持从 HuggingFace 加载自定义模型，后端基于苹果 MLX 框架优化。\n\n这款工具特别适合 AI 研究者、开发者以及对数据隐私有要求的团队——无需将敏感数据上传云端，即可在本地体验顶尖大模型。如果你手头有多台 Mac 或混合设备，想榨干它们的联合算力，exo 是目前最成熟的解决方案之一。","\u003Cdiv align=\"center\">\n\n\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"\u002Fdocs\u002Fimgs\u002Fexo-logo-black-bg.jpg\">\n  \u003Cimg alt=\"exo logo\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_01b135907dc6.png\" width=\"50%\" height=\"50%\">\n\u003C\u002Fpicture>\n\nexo: Run frontier AI locally. 
Maintained by [exo labs](https:\u002F\u002Fx.com\u002Fexolabs).\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FTJ4P57arEm\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Join%20Server-5865F2?logo=discord&logoColor=white\" alt=\"Discord\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fx.com\u002Fexolabs\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fexolabs?style=social\" alt=\"X\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0.html\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache2.0-blue.svg\" alt=\"License: Apache-2.0\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n\n---\n\nexo connects all your devices into an AI cluster. Not only does exo enable running models larger than would fit on a single device, but with [day-0 support for RDMA over Thunderbolt](https:\u002F\u002Fx.com\u002Fexolabs\u002Fstatus\u002F2001817749744476256?s=20), it makes models run faster as you add more devices.\n\n## Features\n\n- **Automatic Device Discovery**: Devices running exo automatically discover each other - no manual configuration.\n- **RDMA over Thunderbolt**: exo ships with [day-0 support for RDMA over Thunderbolt 5](https:\u002F\u002Fx.com\u002Fexolabs\u002Fstatus\u002F2001817749744476256?s=20), enabling a 99% reduction in latency between devices.\n- **Topology-Aware Auto Parallel**: exo figures out the best way to split your model across all available devices based on a realtime view of your device topology. It takes into account device resources and network latency\u002Fbandwidth between each link.\n- **Tensor Parallelism**: exo supports sharding models, for up to 1.8x speedup on 2 devices and 3.2x speedup on 4 devices.\n- **MLX Support**: exo uses [MLX](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx) as an inference backend and [MLX distributed](https:\u002F\u002Fml-explore.github.io\u002Fmlx\u002Fbuild\u002Fhtml\u002Fusage\u002Fdistributed.html) for distributed communication.\n- **Multiple API Compatibility**: Compatible with OpenAI Chat Completions API, Claude Messages API, OpenAI Responses API, and Ollama API - use your existing tools and clients.\n- **Custom Model Support**: Load custom models from HuggingFace hub to expand the range of available models.\n\n## Dashboard\n\nexo includes a built-in dashboard for managing your cluster and chatting with models.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_d29a941f829f.png\" alt=\"exo dashboard - cluster view showing 4 x M3 Ultra Mac Studio with DeepSeek v3.1 and Kimi-K2-Thinking loaded\" width=\"80%\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Cem>4 × 512GB M3 Ultra Mac Studio running DeepSeek v3.1 (8-bit) and Kimi-K2-Thinking (4-bit)\u003C\u002Fem>\u003C\u002Fp>\n\n## Benchmarks\n\n\u003Cdetails>\n  \u003Csummary>Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_0ce4dd68da34.jpeg\" alt=\"Benchmark - Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>Source:\u003C\u002Fstrong> \u003Ca 
href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_d50de6d43c67.jpeg\" alt=\"Benchmark - DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>Source:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_a42c066b6629.jpeg\" alt=\"Benchmark - Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>Source:\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n---\n\n## Quick Start\n\nDevices running exo automatically discover each other, without needing any manual configuration. Each device provides an API and a dashboard for interacting with your cluster (runs at `http:\u002F\u002Flocalhost:52415`).\n\nThere are two ways to run exo:\n\n### Run from Source (macOS)\n\nIf you have [Nix](https:\u002F\u002Fnixos.org\u002F) installed, you can skip most of the steps below and run exo directly:\n\n```bash\nnix run .#exo\n```\n\n**Note:** To accept the Cachix binary cache (and avoid the Xcode Metal ToolChain), add to `\u002Fetc\u002Fnix\u002Fnix.conf`:\n```\ntrusted-users = root    (or your username)\nexperimental-features = nix-command flakes\n```\nThen restart the Nix daemon: `sudo launchctl kickstart -k system\u002Forg.nixos.nix-daemon`\n\n**Prerequisites:**\n- [Xcode](https:\u002F\u002Fdeveloper.apple.com\u002Fxcode\u002F) (provides the Metal ToolChain required for MLX compilation)\n- [brew](https:\u002F\u002Fgithub.com\u002FHomebrew\u002Fbrew) (for simple package management on macOS)\n\n  ```bash\n  \u002Fbin\u002Fbash -c \"$(curl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002FHomebrew\u002Finstall\u002FHEAD\u002Finstall.sh)\"\n  ```\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv) (for Python dependency management)\n- [node](https:\u002F\u002Fgithub.com\u002Fnodejs\u002Fnode) (for building the dashboard)\n\n  ```bash\n  brew install uv node\n  ```\n- [rust](https:\u002F\u002Fgithub.com\u002Frust-lang\u002Frustup) (to build Rust bindings, nightly for now)\n\n  ```bash\n  curl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\n  rustup toolchain install nightly\n  ```\n- [macmon](https:\u002F\u002Fgithub.com\u002Fvladkens\u002Fmacmon) (for hardware monitoring on Apple Silicon)\n\n  Install the pinned fork revision used by this repo instead of Homebrew `macmon`.\n  Homebrew `macmon 0.6.1` still crashes on Apple M5.\n\n  
```bash\n  cargo install --git https:\u002F\u002Fgithub.com\u002Fswiftraccoon\u002Fmacmon \\\n    --rev 9154d234f763fbeffdcb4135d0bbbaf80609699b \\\n    macmon \\\n    --force\n  ```\n\nClone the repo, build the dashboard, and run exo:\n\n```bash\n# Clone exo\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\n\n# Build dashboard\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# Run exo\nuv run exo\n```\n\nThis starts the exo dashboard and API at http:\u002F\u002Flocalhost:52415\u002F\n\n*Please see the “Enabling RDMA on macOS” section below to enable RDMA on macOS 26.2 or later.*\n\n### Run from Source (Linux)\n\n**Prerequisites:**\n\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv) (for Python dependency management)\n- [node](https:\u002F\u002Fgithub.com\u002Fnodejs\u002Fnode) (for building the dashboard) - version 18 or higher\n- [rust](https:\u002F\u002Fgithub.com\u002Frust-lang\u002Frustup) (to build Rust bindings, nightly for now)\n\n**Installation methods:**\n\n**Option 1: Using system package manager (Ubuntu\u002FDebian example):**\n```bash\n# Install Node.js and npm\nsudo apt update\nsudo apt install nodejs npm\n\n# Install uv\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n\n# Install Rust (using rustup)\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n```\n\n**Option 2: Using Homebrew on Linux (if preferred):**\n```bash\n# Install Homebrew on Linux\n\u002Fbin\u002Fbash -c \"$(curl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002FHomebrew\u002Finstall\u002FHEAD\u002Finstall.sh)\"\n\n# Install dependencies\nbrew install uv node\n\n# Install Rust (using rustup)\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n```\n\n**Note:** The `macmon` package is macOS-only and not required for Linux.\n\nClone the repo, build the dashboard, and run exo:\n\n```bash\n# Clone exo\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\n\n# Build dashboard\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# Run exo\nuv run exo\n```\n\nThis starts the exo dashboard and API at http:\u002F\u002Flocalhost:52415\u002F\n\n**Important note for Linux users:** Currently, exo runs on CPU on Linux. GPU support for Linux platforms is under development. If you'd like to see support for your specific Linux hardware, please [search for existing feature requests](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues) or create a new one.\n\n**Configuration Options:**\n\n- `--no-worker`: Run exo without the worker component. Useful for coordinator-only nodes that handle networking and orchestration but don't execute inference tasks. 
This is helpful for machines without sufficient GPU resources but with good network connectivity.\n\n  ```bash\n  uv run exo --no-worker\n  ```\n\n**File Locations (Linux):**\n\nexo follows the [XDG Base Directory Specification](https:\u002F\u002Fspecifications.freedesktop.org\u002Fbasedir-spec\u002Fbasedir-spec-latest.html) on Linux:\n\n- **Configuration files**: `~\u002F.config\u002Fexo\u002F` (or `$XDG_CONFIG_HOME\u002Fexo\u002F`)\n- **Data files**: `~\u002F.local\u002Fshare\u002Fexo\u002F` (or `$XDG_DATA_HOME\u002Fexo\u002F`)\n- **Cache files**: `~\u002F.cache\u002Fexo\u002F` (or `$XDG_CACHE_HOME\u002Fexo\u002F`)\n- **Log files**: `~\u002F.cache\u002Fexo\u002Fexo_log\u002F` (with automatic log rotation)\n- **Custom model cards**: `~\u002F.local\u002Fshare\u002Fexo\u002Fcustom_model_cards\u002F`\n\nYou can override these locations by setting the corresponding XDG environment variables.\n\n### macOS App\n\nexo ships a macOS app that runs in the background on your Mac.\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_ac9f42a812e2.png\" alt=\"exo macOS App - running on a MacBook\" width=\"35%\" \u002F>\n\nThe macOS app requires macOS Tahoe 26.2 or later.\n\nDownload the latest build here: [EXO-latest.dmg](https:\u002F\u002Fassets.exolabs.net\u002FEXO-latest.dmg).\n\nThe app will ask for permission to modify system settings and install a new Network profile. Improvements to this are being worked on.\n\n**Custom Namespace for Cluster Isolation:**\n\nThe macOS app includes a custom namespace feature that allows you to isolate your exo cluster from others on the same network. This is configured through the `EXO_LIBP2P_NAMESPACE` setting:\n\n- **Use cases**:\n  - Running multiple separate exo clusters on the same network\n  - Isolating development\u002Ftesting clusters from production clusters\n  - Preventing accidental cluster joining\n\n- **Configuration**: Access this setting in the app's Advanced settings (or set the `EXO_LIBP2P_NAMESPACE` environment variable when running from source)\n\nThe namespace is logged on startup for debugging purposes.\n\n#### Uninstalling the macOS App\n\nThe recommended way to uninstall is through the app itself: click the menu bar icon → Advanced → Uninstall. This cleanly removes all system components.\n\nIf you've already deleted the app, you can run the standalone uninstaller script:\n\n```bash\nsudo .\u002Fapp\u002FEXO\u002Funinstall-exo.sh\n```\n\nThis removes:\n- Network setup LaunchDaemon\n- Network configuration script\n- Log files\n- The \"exo\" network location\n\n**Note:** You'll need to manually remove EXO from Login Items in System Settings → General → Login Items.\n\n---\n\n### Enabling RDMA on macOS\n\nRDMA is a new capability added to macOS 26.2. It works on any Mac with Thunderbolt 5 (M4 Pro Mac Mini, M4 Max Mac Studio, M4 Max MacBook Pro, M3 Ultra Mac Studio).\n\nPlease refer to the caveats for immediate troubleshooting.\n\nTo enable RDMA on macOS, follow these steps:\n\n1. Shut down your Mac.\n2. Hold down the power button for 10 seconds until the boot menu appears.\n3. Select \"Options\" to enter Recovery mode.\n4. When the Recovery UI appears, open the Terminal from the Utilities menu.\n5. In the Terminal, type:\n   ```\n   rdma_ctl enable\n   ```\n   and press Enter.\n6. Reboot your Mac.\n\nAfter that, RDMA will be enabled in macOS and exo will take care of the rest.\n\n**Important Caveats**\n\n1. 
Devices that wish to be part of an RDMA cluster must be connected to all other devices in the cluster.\n2. The cables must support TB5.\n3. On a Mac Studio, you cannot use the Thunderbolt 5 port next to the Ethernet port.\n4. If running from source, please use the script found at `tmp\u002Fset_rdma_network_config.sh`, which will disable Thunderbolt Bridge and set dhcp on each RDMA port.\n5. RDMA ports may be unable to discover each other on different versions of MacOS. Please ensure that OS versions match exactly (even beta version numbers) on all devices.\n\n---\n\n## Environment Variables\n\nexo supports several environment variables for configuration:\n\n| Variable | Description | Default |\n|----------|-------------|---------|\n| `EXO_DEFAULT_MODELS_DIR` | Default directory for model downloads and caches. Always first in the writable dirs list. | `~\u002F.local\u002Fshare\u002Fexo\u002Fmodels` (Linux) or `~\u002F.exo\u002Fmodels` (macOS) |\n| `EXO_MODELS_DIRS` | Colon-separated additional writable directories for model downloads. Checked in order after the default; first with enough free space is used. | None |\n| `EXO_MODELS_READ_ONLY_DIRS` | Colon-separated read-only directories to search for pre-downloaded models (e.g., NFS mounts, shared storage). Models here cannot be deleted. | None |\n| `EXO_OFFLINE` | Run without internet connection (uses only local models) | `false` |\n| `EXO_ENABLE_IMAGE_MODELS` | Enable image model support | `false` |\n| `EXO_LIBP2P_NAMESPACE` | Custom namespace for cluster isolation | None |\n| `EXO_FAST_SYNCH` | Control MLX_METAL_FAST_SYNCH behavior (for JACCL backend) | Auto |\n| `EXO_TRACING_ENABLED` | Enable distributed tracing for performance analysis | `false` |\n\n**Example usage:**\n\n```bash\n# Use pre-downloaded models from NFS mount (read-only)\nEXO_MODELS_READ_ONLY_DIRS=\u002Fmnt\u002Fnfs\u002Fmodels:\u002Fopt\u002Fai-models uv run exo\n\n# Download models to an external SSD (falls back to default dir if full)\nEXO_MODELS_DIRS=\u002FVolumes\u002FExternalSSD\u002Fexo-models uv run exo\n\n# Run in offline mode\nEXO_OFFLINE=true uv run exo\n\n# Enable image models\nEXO_ENABLE_IMAGE_MODELS=true uv run exo\n\n# Use custom namespace for cluster isolation\nEXO_LIBP2P_NAMESPACE=my-dev-cluster uv run exo\n```\n\n---\n\n### Using the API\n\nexo provides multiple API-compatible interfaces for maximum compatibility with existing tools:\n\n- **OpenAI Chat Completions API** - Compatible with OpenAI clients\n- **Claude Messages API** - Compatible with Anthropic's Claude format\n- **OpenAI Responses API** - Compatible with OpenAI's Responses format\n- **Ollama API** - Compatible with Ollama and tools like OpenWebUI\n\nIf you prefer to interact with exo via the API, here is an example creating an instance of a small model (`mlx-community\u002FLlama-3.2-1B-Instruct-4bit`), sending a chat completions request and deleting the instance.\n\n---\n\n**1. 
Preview instance placements**\n\nThe `\u002Finstance\u002Fpreviews` endpoint will preview all valid placements for your model.\n\n```bash\ncurl \"http:\u002F\u002Flocalhost:52415\u002Finstance\u002Fpreviews?model_id=llama-3.2-1b\"\n```\n\nSample response:\n\n```json\n{\n  \"previews\": [\n    {\n      \"model_id\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n      \"sharding\": \"Pipeline\",\n      \"instance_meta\": \"MlxRing\",\n      \"instance\": {...},\n      \"memory_delta_by_node\": {\"local\": 729808896},\n      \"error\": null\n    }\n    \u002F\u002F ...possibly more placements...\n  ]\n}\n```\n\nThis will return all valid placements for this model. Pick a placement that you like.\nTo pick the first one, pipe into `jq`:\n\n```bash\ncurl \"http:\u002F\u002Flocalhost:52415\u002Finstance\u002Fpreviews?model_id=llama-3.2-1b\" | jq -c '.previews[] | select(.error == null) | .instance' | head -n1\n```\n\n---\n\n**2. Create a model instance**\n\nSend a POST to `\u002Finstance` with your desired placement in the `instance` field (the full payload must match types as in `CreateInstanceParams`), which you can copy from step 1:\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Finstance \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"instance\": {...}\n  }'\n```\n\n\nSample response:\n\n```json\n{\n  \"message\": \"Command received.\",\n  \"command_id\": \"e9d1a8ab-....\"\n}\n```\n\n---\n\n**3. Send a chat completion**\n\nNow, make a POST to `\u002Fv1\u002Fchat\u002Fcompletions` (the same format as OpenAI's API):\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"What is Llama 3.2 1B?\"}\n    ],\n    \"stream\": true\n  }'\n```\n\n---\n\n**4. 
Delete the instance**\n\nWhen you're done, delete the instance by its ID (find it via `\u002Fstate` or `\u002Finstance` endpoints):\n\n```bash\ncurl -X DELETE http:\u002F\u002Flocalhost:52415\u002Finstance\u002FYOUR_INSTANCE_ID\n```\n\n### Claude Messages API Compatibility\n\nUse the Claude Messages API format with the `\u002Fv1\u002Fmessages` endpoint:\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fmessages \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"max_tokens\": 1024,\n    \"stream\": true\n  }'\n```\n\n### OpenAI Responses API Compatibility\n\nUse the OpenAI Responses API format with the `\u002Fv1\u002Fresponses` endpoint:\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fresponses \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"stream\": true\n  }'\n```\n\n### Ollama API Compatibility\n\nexo supports Ollama API endpoints for compatibility with tools like OpenWebUI:\n\n```bash\n# Ollama chat\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Follama\u002Fapi\u002Fchat \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"stream\": false\n  }'\n\n# List models (Ollama format)\ncurl http:\u002F\u002Flocalhost:52415\u002Follama\u002Fapi\u002Ftags\n```\n\n### Custom Model Loading from HuggingFace\n\nYou can add custom models from the HuggingFace hub:\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Fmodels\u002Fadd \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model_id\": \"mlx-community\u002Fmy-custom-model\"\n  }'\n```\n\n**Security Note:**\n\nFor security, custom models whose configuration requires `trust_remote_code` must be explicitly enabled (the default is false). Only enable this if you trust the model's remote code execution. Models are fetched from HuggingFace and stored locally as custom model cards.\n\n**Other useful API endpoints:**\n\n- List all models: `curl http:\u002F\u002Flocalhost:52415\u002Fmodels`\n- List downloaded models only: `curl http:\u002F\u002Flocalhost:52415\u002Fmodels?status=downloaded`\n- Search HuggingFace: `curl \"http:\u002F\u002Flocalhost:52415\u002Fmodels\u002Fsearch?query=llama&limit=10\"`\n- Inspect instance IDs and deployment state: `curl http:\u002F\u002Flocalhost:52415\u002Fstate`\n\nFor further details, see:\n\n- API documentation in [docs\u002Fapi.md](docs\u002Fapi.md).\n- API types and endpoints in [src\u002Fexo\u002Fmaster\u002Fapi.py](src\u002Fexo\u002Fmaster\u002Fapi.py).\n\n
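For a programmatic version of the walkthrough above, the sketch below drives the same endpoints from Python. This is a minimal illustration rather than an official client: it assumes a node at `http:\u002F\u002Flocalhost:52415`, relies only on the endpoints and response fields shown above, and uses the standard library so nothing extra needs to be installed.\n\n```python\nimport json\nimport urllib.request\n\nBASE = \"http:\u002F\u002Flocalhost:52415\"\n\ndef call(method, path, payload=None):\n    # exo's API is plain JSON over HTTP, so the standard library is enough.\n    data = json.dumps(payload).encode() if payload is not None else None\n    req = urllib.request.Request(BASE + path, data=data, method=method,\n                                 headers={\"Content-Type\": \"application\u002Fjson\"})\n    with urllib.request.urlopen(req) as resp:\n        return json.loads(resp.read())\n\n# 1. Preview placements and pick the first valid one (as with the jq pipeline above).\npreviews = call(\"GET\", \"\u002Finstance\u002Fpreviews?model_id=llama-3.2-1b\")[\"previews\"]\nplacement = next(p[\"instance\"] for p in previews if p[\"error\"] is None)\n\n# 2. Create the instance. Creation is asynchronous (\"Command received.\"),\n#    so a real client would poll \u002Fstate until the instance is ready.\ncall(\"POST\", \"\u002Finstance\", {\"instance\": placement})\n\n# 3. Send a chat completion (non-streaming here for simplicity).\nreply = call(\"POST\", \"\u002Fv1\u002Fchat\u002Fcompletions\", {\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"What is Llama 3.2 1B?\"}],\n    \"stream\": False,\n})\n\n# 4. When done, look up the instance ID via \u002Fstate and DELETE \u002Finstance\u002F\u003Cid>.\n```\n\n---\n\n## Benchmarking\n\nThe `exo-bench` tool measures model prefill and token generation speed across different placement configurations. 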
This helps you optimize model performance and validate improvements.\n\n**Prerequisites:**\n- Nodes should be running with `uv run exo` before benchmarking\n- The tool uses the `\u002Fbench\u002Fchat\u002Fcompletions` endpoint\n\n**Basic usage:**\n\n```bash\nuv run bench\u002Fexo_bench.py \\\n  --model Llama-3.2-1B-Instruct-4bit \\\n  --pp 128,256,512 \\\n  --tg 128,256\n```\n\n**Key parameters:**\n\n- `--model`: Model to benchmark (short ID or HuggingFace ID)\n- `--pp`: Prompt size hints (comma-separated integers)\n- `--tg`: Generation lengths (comma-separated integers)\n- `--max-nodes`: Limit placements to N nodes (default: 4)\n- `--instance-meta`: Filter by `ring`, `jaccl`, or `both` (default: both)\n- `--sharding`: Filter by `pipeline`, `tensor`, or `both` (default: both)\n- `--repeat`: Number of repetitions per configuration (default: 1)\n- `--warmup`: Warmup runs per placement (default: 0)\n- `--json-out`: Output file for results (default: bench\u002Fresults.json)\n\n**Example with filters:**\n\n```bash\nuv run bench\u002Fexo_bench.py \\\n  --model Llama-3.2-1B-Instruct-4bit \\\n  --pp 128,512 \\\n  --tg 128 \\\n  --max-nodes 2 \\\n  --sharding tensor \\\n  --repeat 3 \\\n  --json-out my-results.json\n```\n\nThe tool outputs performance metrics including prompt tokens per second (prompt_tps), generation tokens per second (generation_tps), and peak memory usage for each configuration.\n\n---\n\n## Hardware Accelerator Support\n\nOn macOS, exo uses the GPU. On Linux, exo currently runs on CPU. We are working on extending hardware accelerator support. If you'd like support for a new hardware platform, please [search for an existing feature request](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues) and add a thumbs up so we know what hardware is important to the community.\n\n---\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to exo.\n","\u003Cdiv align=\"center\">\n\n\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"\u002Fdocs\u002Fimgs\u002Fexo-logo-black-bg.jpg\">\n  \u003Cimg alt=\"exo logo\" src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_01b135907dc6.png\" width=\"50%\" height=\"50%\">\n\u003C\u002Fpicture>\n\nexo：在本地运行前沿 AI。由 [exo labs](https:\u002F\u002Fx.com\u002Fexolabs) 维护。\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FTJ4P57arEm\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Join%20Server-5865F2?logo=discord&logoColor=white\" alt=\"Discord\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fx.com\u002Fexolabs\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fexolabs?style=social\" alt=\"X\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0.html\" target=\"_blank\" rel=\"noopener noreferrer\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache2.0-blue.svg\" alt=\"License: Apache-2.0\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n\n---\n\nexo 将你的所有设备连接成一个 AI 集群。exo 不仅能够运行超出单台设备容量的大型模型，而且通过 [对 Thunderbolt RDMA 的 day-0 支持](https:\u002F\u002Fx.com\u002Fexolabs\u002Fstatus\u002F2001817749744476256?s=20)，让你在添加更多设备时模型运行速度更快。\n\n## 功能特性\n\n- **自动设备发现**：运行 exo 的设备会自动发现彼此，无需手动配置。\n- **Thunderbolt RDMA**：exo 内置 [对 Thunderbolt 5 RDMA 的 day-0 
支持](https:\u002F\u002Fx.com\u002Fexolabs\u002Fstatus\u002F2001817749744476256?s=20)，可将设备间延迟降低 99%。\n- **拓扑感知自动并行**：exo 会根据设备拓扑的实时视图，自动找出在所有可用设备间拆分模型的最佳方式。它会考虑设备资源以及每条链路之间的网络延迟\u002F带宽。\n- **张量并行（Tensor Parallelism）**：exo 支持模型分片（sharding），在 2 台设备上可获得最高 1.8 倍加速，在 4 台设备上可获得最高 3.2 倍加速。\n- **MLX 支持**：exo 使用 [MLX](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx) 作为推理后端，并使用 [MLX distributed](https:\u002F\u002Fml-explore.github.io\u002Fmlx\u002Fbuild\u002Fhtml\u002Fusage\u002Fdistributed.html) 进行分布式通信。\n- **多 API 兼容**：兼容 OpenAI Chat Completions API、Claude Messages API、OpenAI Responses API 和 Ollama API——可直接使用你现有的工具和客户端。\n- **自定义模型支持**：从 HuggingFace Hub 加载自定义模型，扩展可用模型的范围。\n\n## 仪表盘\n\nexo 包含一个内置仪表盘，用于管理集群和与模型对话。\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_d29a941f829f.png\" alt=\"exo dashboard - cluster view showing 4 x M3 Ultra Mac Studio with DeepSeek v3.1 and Kimi-K2-Thinking loaded\" width=\"80%\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Cem>4 × 512GB M3 Ultra Mac Studio 运行 DeepSeek v3.1（8-bit）和 Kimi-K2-Thinking（4-bit）\u003C\u002Fem>\u003C\u002Fp>\n\n## 基准测试\n\n\u003Cdetails>\n  \u003Csummary>Qwen3-235B（8-bit）在 4 × M3 Ultra Mac Studio 上使用张量并行 RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_0ce4dd68da34.jpeg\" alt=\"Benchmark - Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>来源：\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling：Mac Studio 上的 15 TB VRAM —— Thunderbolt 5 RDMA\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>DeepSeek v3.1 671B（8-bit）在 4 × M3 Ultra Mac Studio 上使用张量并行 RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_d50de6d43c67.jpeg\" alt=\"Benchmark - DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>来源：\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling：Mac Studio 上的 15 TB VRAM —— Thunderbolt 5 RDMA\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>Kimi K2 Thinking（原生 4-bit）在 4 × M3 Ultra Mac Studio 上使用张量并行 RDMA\u003C\u002Fsummary>\n  \u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_a42c066b6629.jpeg\" alt=\"Benchmark - Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" \u002F>\n  \u003Cp>\n    \u003Cstrong>来源：\u003C\u002Fstrong> \u003Ca href=\"https:\u002F\u002Fwww.jeffgeerling.com\u002Fblog\u002F2025\u002F15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\">Jeff Geerling：Mac Studio 上的 15 TB VRAM —— Thunderbolt 5 RDMA\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdetails>\n\n---\n\n## 快速开始\n\n运行 exo 的设备会自动发现彼此，无需任何手动配置。每台设备都提供 API 和仪表盘用于与集群交互（运行在 `http:\u002F\u002Flocalhost:52415`）。\n\n有两种方式运行 exo：\n\n### 从源码运行（macOS）\n\n如果你已安装 [Nix](https:\u002F\u002Fnixos.org\u002F)，可以跳过以下大部分步骤直接运行 exo：\n\n```bash\nnix run .#exo\n```\n\n**注意：** 要接受 Cachix 二进制缓存（并避免使用 Xcode Metal ToolChain），请在 `\u002Fetc\u002Fnix\u002Fnix.conf` 中添加：\n```\ntrusted-users = root    （或你的用户名）\nexperimental-features = nix-command 
flakes\n```\n然后重启 Nix 守护进程：`sudo launchctl kickstart -k system\u002Forg.nixos.nix-daemon`\n\n**前置要求：**\n- [Xcode](https:\u002F\u002Fdeveloper.apple.com\u002Fxcode\u002F)（提供 MLX 编译所需的 Metal ToolChain）\n- [brew](https:\u002F\u002Fgithub.com\u002FHomebrew\u002Fbrew)（用于 macOS 的简单包管理）\n\n  ```bash\n  \u002Fbin\u002Fbash -c \"$(curl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002FHomebrew\u002Finstall\u002FHEAD\u002Finstall.sh)\"\n  ```\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv)（用于 Python 依赖管理）\n- [node](https:\u002F\u002Fgithub.com\u002Fnodejs\u002Fnode)（用于构建仪表盘）\n\n  ```bash\n  brew install uv node\n  ```\n- [rust](https:\u002F\u002Fgithub.com\u002Frust-lang\u002Frustup)（用于构建 Rust 绑定，目前需要 nightly 版本）\n\n  ```bash\n  curl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\n  rustup toolchain install nightly\n  ```\n- [macmon](https:\u002F\u002Fgithub.com\u002Fvladkens\u002Fmacmon)（用于 Apple Silicon 的硬件监控）\n\n  请安装本仓库使用的指定 fork 版本，而非 Homebrew 的 `macmon`。\n  Homebrew 的 `macmon 0.6.1` 在 Apple M5 上仍会崩溃。\n\n  ```bash\n  cargo install --git https:\u002F\u002Fgithub.com\u002Fswiftraccoon\u002Fmacmon \\\n    --rev 9154d234f763fbeffdcb4135d0bbbaf80609699b \\\n    macmon \\\n    --force\n  ```\n\n克隆仓库，构建仪表盘，然后运行 exo：\n\n```bash\n# 克隆 exo\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\n\n# 构建仪表盘\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# 运行 exo\nuv run exo\n```\n\n这将启动 exo 仪表盘和 API，地址为 http:\u002F\u002Flocalhost:52415\u002F\n\n\n*请查看 RDMA 相关章节，以在 macOS >=26.2 上启用此功能！*\n\n### 从源码运行（Linux）\n\n**前置条件：**\n\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv)（用于 Python 依赖管理）\n- [node](https:\u002F\u002Fgithub.com\u002Fnodejs\u002Fnode)（用于构建仪表盘）- 版本 18 或更高\n- [rust](https:\u002F\u002Fgithub.com\u002Frust-lang\u002Frustup)（用于构建 Rust 绑定，目前需要 nightly 版本）\n\n**安装方法：**\n\n**选项 1：使用系统包管理器（以 Ubuntu\u002FDebian 为例）：**\n```bash\n# 安装 Node.js 和 npm\nsudo apt update\nsudo apt install nodejs npm\n\n# 安装 uv\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n\n# 安装 Rust（使用 rustup）\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n```\n\n**选项 2：在 Linux 上使用 Homebrew（如偏好此方式）：**\n```bash\n# 在 Linux 上安装 Homebrew\n\u002Fbin\u002Fbash -c \"$(curl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002FHomebrew\u002Finstall\u002FHEAD\u002Finstall.sh)\"\n\n# 安装依赖\nbrew install uv node\n\n# 安装 Rust（使用 rustup）\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n```\n\n**注意：** `macmon` 包仅适用于 macOS，Linux 上不需要。\n\n克隆仓库，构建仪表盘，然后运行 exo：\n\n```bash\n# 克隆 exo\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\n\n# 构建仪表盘\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# 运行 exo\nuv run exo\n```\n\n这会在 http:\u002F\u002Flocalhost:52415\u002F 启动 exo 仪表盘和 API。\n\n**Linux 用户重要提示：** 目前，exo 在 Linux 上仅使用 CPU 运行。Linux 平台的 GPU 支持正在开发中。如果您希望看到对特定 Linux 硬件的支持，请[搜索现有的功能请求](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues)或创建新的功能请求。\n\n**配置选项：**\n\n- `--no-worker`：在不运行 worker 组件的情况下启动 exo。适用于仅作为协调器（coordinator）的节点，这类节点负责网络和编排，但不执行推理任务。对于 GPU 资源不足但网络连接良好的机器很有帮助。\n\n  ```bash\n  uv run exo --no-worker\n  ```\n\n**文件位置（Linux）：**\n\nexo 在 Linux 上遵循 [XDG 基础目录规范](https:\u002F\u002Fspecifications.freedesktop.org\u002Fbasedir-spec\u002Fbasedir-spec-latest.html)：\n\n- **配置文件**：`~\u002F.config\u002Fexo\u002F`（或 `$XDG_CONFIG_HOME\u002Fexo\u002F`）\n- 
**数据文件**：`~\u002F.local\u002Fshare\u002Fexo\u002F`（或 `$XDG_DATA_HOME\u002Fexo\u002F`）\n- **缓存文件**：`~\u002F.cache\u002Fexo\u002F`（或 `$XDG_CACHE_HOME\u002Fexo\u002F`）\n- **日志文件**：`~\u002F.cache\u002Fexo\u002Fexo_log\u002F`（自动日志轮转）\n- **自定义模型卡片**：`~\u002F.local\u002Fshare\u002Fexo\u002Fcustom_model_cards\u002F`\n\n您可以通过设置相应的 XDG 环境变量来覆盖这些位置。\n\n### macOS 应用\n\nexo 提供了一个 macOS 应用，可在您的 Mac 后台运行。\n\n\u003Cimg src=\"https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_readme_ac9f42a812e2.png\" alt=\"exo macOS 应用 - 在 MacBook 上运行\" width=\"35%\" \u002F>\n\nmacOS 应用需要 macOS Tahoe 26.2 或更高版本。\n\n在此下载最新版本：[EXO-latest.dmg](https:\u002F\u002Fassets.exolabs.net\u002FEXO-latest.dmg)。\n\n应用会请求修改系统设置和安装新网络配置文件的权限。相关改进正在进行中。\n\n**用于集群隔离的自定义命名空间：**\n\nmacOS 应用包含自定义命名空间功能，允许您将 exo 集群与同一网络上的其他集群隔离。此功能通过 `EXO_LIBP2P_NAMESPACE` 设置进行配置：\n\n- **使用场景**：\n  - 在同一网络上运行多个独立的 exo 集群\n  - 将开发\u002F测试集群与生产集群隔离\n  - 防止意外加入集群\n\n- **配置**：在应用的高级设置中访问此设置（或在从源码运行时设置 `EXO_LIBP2P_NAMESPACE` 环境变量）\n\n命名空间会在启动时记录，用于调试目的。\n\n#### 卸载 macOS 应用\n\n推荐的卸载方式是通过应用本身：点击菜单栏图标 → 高级 → 卸载。这会干净地移除所有系统组件。\n\n如果您已经删除了应用，可以运行独立的卸载脚本：\n\n```bash\nsudo .\u002Fapp\u002FEXO\u002Funinstall-exo.sh\n```\n\n这会移除：\n- 网络设置的 LaunchDaemon\n- 网络配置脚本\n- 日志文件\n- \"exo\" 网络位置\n\n**注意：** 您需要手动从系统设置 → 通用 → 登录项中移除 EXO。\n\n---\n\n### 在 macOS 上启用 RDMA\n\nRDMA（远程直接内存访问，Remote Direct Memory Access）是 macOS 26.2 新增的功能。它适用于任何配备 Thunderbolt 5 的 Mac（M4 Pro Mac Mini、M4 Max Mac Studio、M4 Max MacBook Pro、M3 Ultra Mac Studio）。\n\n请参阅注意事项以进行即时故障排除。\n\n要在 macOS 上启用 RDMA，请按以下步骤操作：\n\n1. 关闭 Mac。\n2. 按住电源按钮 10 秒，直到出现启动菜单。\n3. 选择\"选项\"进入恢复模式。\n4. 当恢复界面出现时，从实用工具菜单打开终端。\n5. 在终端中输入：\n   ```\n   rdma_ctl enable\n   ```\n   然后按回车键。\n6. 重新启动 Mac。\n\n之后，RDMA 将在 macOS 中启用，exo 会自动处理其余配置。\n\n**重要注意事项**\n\n1. 希望加入 RDMA 集群的设备必须与集群中的所有其他设备连接。\n2. 线缆必须支持 TB5。\n3. 在 Mac Studio 上，不能使用以太网端口旁边的 Thunderbolt 5 端口。\n4. 如果从源码运行，请使用 `tmp\u002Fset_rdma_network_config.sh` 脚本，该脚本会禁用 Thunderbolt Bridge 并在每个 RDMA 端口上设置 DHCP。\n5. RDMA 端口可能无法在不同版本的 macOS 上相互发现。请确保所有设备的操作系统版本完全匹配（包括 beta 版本号）。\n\n---\n\n## 环境变量\n\nexo 支持多个用于配置的环境变量：\n\n| 变量 | 描述 | 默认值 |\n|----------|-------------|---------|\n| `EXO_DEFAULT_MODELS_DIR` | 模型下载和缓存的默认目录。始终位于可写目录列表的首位。 | `~\u002F.local\u002Fshare\u002Fexo\u002Fmodels`（Linux）或 `~\u002F.exo\u002Fmodels`（macOS） |\n| `EXO_MODELS_DIRS` | 以冒号分隔的额外可写目录，用于模型下载。按顺序在默认目录之后检查；使用第一个有足够可用空间的目录。 | 无 |\n| `EXO_MODELS_READ_ONLY_DIRS` | 以冒号分隔的只读目录，用于搜索预下载的模型（例如 NFS 挂载、共享存储）。此处的模型无法删除。 | 无 |\n| `EXO_OFFLINE` | 在无网络连接的情况下运行（仅使用本地模型） | `false` |\n| `EXO_ENABLE_IMAGE_MODELS` | 启用图像模型支持 | `false` |\n| `EXO_LIBP2P_NAMESPACE` | 用于集群隔离的自定义命名空间 | 无 |\n| `EXO_FAST_SYNCH` | 控制 MLX_METAL_FAST_SYNCH 行为（用于 JACCL 后端） | 自动 |\n| `EXO_TRACING_ENABLED` | 启用分布式追踪以进行性能分析 | `false` |\n\n**使用示例：**\n\n```bash\n# 从 NFS 挂载使用预下载的模型（只读）\nEXO_MODELS_READ_ONLY_DIRS=\u002Fmnt\u002Fnfs\u002Fmodels:\u002Fopt\u002Fai-models uv run exo\n\n# 将模型下载到外部 SSD（如果已满则回退到默认目录）\nEXO_MODELS_DIRS=\u002FVolumes\u002FExternalSSD\u002Fexo-models uv run exo\n\n# 离线模式运行\nEXO_OFFLINE=true uv run exo\n\n# 启用图像模型\nEXO_ENABLE_IMAGE_MODELS=true uv run exo\n\n# 使用自定义命名空间进行集群隔离\nEXO_LIBP2P_NAMESPACE=my-dev-cluster uv run exo\n```\n\n---\n\n### 使用 API\n\nexo 提供多种 API 兼容接口，以最大程度兼容现有工具：\n\n- **OpenAI Chat Completions API** - 兼容 OpenAI 客户端\n- **Claude Messages API** - 兼容 Anthropic 的 Claude 格式\n- **OpenAI Responses API** - 兼容 OpenAI 的 Responses 格式\n- **Ollama API** - 兼容 Ollama 及 OpenWebUI 等工具\n\n
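下面的示例（仅为示意）展示了如何将现有的 OpenAI 客户端直接指向 exo：假设已安装 `openai` Python 包，本地节点运行在 `http:\u002F\u002Flocalhost:52415`；exo 文档并未要求 API key，这里的 `api_key` 只是占位值。\n\n```python\nfrom openai import OpenAI\n\n# 假设：本地 exo 节点提供 OpenAI 兼容端点；api_key 为任意占位值\nclient = OpenAI(base_url=\"http:\u002F\u002Flocalhost:52415\u002Fv1\", api_key=\"exo\")\n\nresp = client.chat.completions.create(\n    model=\"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n)\nprint(resp.choices[0].message.content)\n```\n\n如果你希望通过 API 与 exo 交互，以下示例展示了如何创建一个小型模型实例（`mlx-community\u002FLlama-3.2-1B-Instruct-4bit`）、发送聊天补全请求以及删除实例。\n\n---\n\n**1. 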
预览实例部署方案**\n\n`\u002Finstance\u002Fpreviews` 端点会预览模型的所有有效部署方案。\n\n```bash\ncurl \"http:\u002F\u002Flocalhost:52415\u002Finstance\u002Fpreviews?model_id=llama-3.2-1b\"\n```\n\n示例响应：\n\n```json\n{\n  \"previews\": [\n    {\n      \"model_id\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n      \"sharding\": \"Pipeline\",\n      \"instance_meta\": \"MlxRing\",\n      \"instance\": {...},\n      \"memory_delta_by_node\": {\"local\": 729808896},\n      \"error\": null\n    }\n    \u002F\u002F ...可能有更多部署方案...\n  ]\n}\n```\n\n这将返回该模型的所有有效部署方案。选择你喜欢的方案。\n要选择第一个，使用 `jq` 管道：\n\n```bash\ncurl \"http:\u002F\u002Flocalhost:52415\u002Finstance\u002Fpreviews?model_id=llama-3.2-1b\" | jq -c '.previews[] | select(.error == null) | .instance' | head -n1\n```\n\n---\n\n**2. 创建模型实例**\n\n向 `\u002Finstance` 发送 POST 请求，在 `instance` 字段中指定你想要的部署方案（完整负载必须与 `CreateInstanceParams` 中的类型匹配），可从第 1 步复制：\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Finstance \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"instance\": {...}\n  }'\n```\n\n示例响应：\n\n```json\n{\n  \"message\": \"Command received.\",\n  \"command_id\": \"e9d1a8ab-....\"\n}\n```\n\n---\n\n**3. 发送聊天补全请求**\n\n现在，向 `\u002Fv1\u002Fchat\u002Fcompletions` 发送 POST 请求（与 OpenAI API 格式相同）：\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"What is Llama 3.2 1B?\"}\n    ],\n    \"stream\": true\n  }'\n```\n\n---\n\n**4. 删除实例**\n\n完成后，通过实例 ID 删除实例（可通过 `\u002Fstate` 或 `\u002Finstance` 端点查找）：\n\n```bash\ncurl -X DELETE http:\u002F\u002Flocalhost:52415\u002Finstance\u002FYOUR_INSTANCE_ID\n```\n\n### Claude Messages API 兼容性\n\n使用 `\u002Fv1\u002Fmessages` 端点，采用 Claude Messages API 格式：\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fmessages \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"max_tokens\": 1024,\n    \"stream\": true\n  }'\n```\n\n### OpenAI Responses API 兼容性\n\n使用 `\u002Fv1\u002Fresponses` 端点，采用 OpenAI Responses API 格式：\n\n```bash\ncurl -N -X POST http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fresponses \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"stream\": true\n  }'\n```\n\n### Ollama API 兼容性\n\nexo 支持 Ollama API 端点，以兼容 OpenWebUI 等工具：\n\n```bash\n# Ollama 聊天\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Follama\u002Fapi\u002Fchat \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello\"}\n    ],\n    \"stream\": false\n  }'\n\n# 列出模型（Ollama 格式）\ncurl http:\u002F\u002Flocalhost:52415\u002Follama\u002Fapi\u002Ftags\n```\n\n### 从 HuggingFace 加载自定义模型\n\n你可以从 HuggingFace Hub 添加自定义模型：\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:52415\u002Fmodels\u002Fadd \\\n  -H 'Content-Type: application\u002Fjson' \\\n  -d '{\n    \"model_id\": \"mlx-community\u002Fmy-custom-model\"\n  }'\n```\n\n**安全提示：**\n\n配置中需要 `trust_remote_code` 的自定义模型必须显式启用（默认为 false）以确保安全。仅在信任模型远程代码执行的情况下启用此选项。模型从 HuggingFace 
获取并作为自定义模型卡片本地存储。\n\n**其他有用的 API 端点：**\n\n- 列出所有模型：`curl http:\u002F\u002Flocalhost:52415\u002Fmodels`\n- 仅列出已下载模型：`curl http:\u002F\u002Flocalhost:52415\u002Fmodels?status=downloaded`\n- 搜索 HuggingFace：`curl \"http:\u002F\u002Flocalhost:52415\u002Fmodels\u002Fsearch?query=llama&limit=10\"`\n- 查看实例 ID 和部署状态：`curl http:\u002F\u002Flocalhost:52415\u002Fstate`\n\n更多详情，请参阅：\n\n- [docs\u002Fapi.md](docs\u002Fapi.md) 中的 API 文档\n- [src\u002Fexo\u002Fmaster\u002Fapi.py](src\u002Fexo\u002Fmaster\u002Fapi.py) 中的 API 类型和端点\n\n---\n\n## 基准测试\n\n`exo-bench` 工具用于测量不同部署配置下的模型预填充（prefill）和 Token 生成速度。这有助于优化模型性能并验证改进效果。\n\n**前置条件：**\n- 基准测试前节点应已运行 `uv run exo`\n- 该工具使用 `\u002Fbench\u002Fchat\u002Fcompletions` 端点\n\n**基本用法：**\n\n```bash\nuv run bench\u002Fexo_bench.py \\\n  --model Llama-3.2-1B-Instruct-4bit \\\n  --pp 128,256,512 \\\n  --tg 128,256\n```\n\n**关键参数：**\n\n- `--model`：要基准测试的模型（短 ID 或 HuggingFace ID）\n- `--pp`：提示词大小提示（逗号分隔的整数）\n- `--tg`：生成长度（逗号分隔的整数）\n- `--max-nodes`：将部署方案限制为 N 个节点（默认：4）\n- `--instance-meta`：按 `ring`、`jaccl` 或 `both` 过滤（默认：both）\n- `--sharding`：按 `pipeline`、`tensor` 或 `both` 过滤（默认：both）\n- `--repeat`：每种配置的重复次数（默认：1）\n- `--warmup`：每种部署方案的预热运行次数（默认：0）\n- `--json-out`：结果输出文件（默认：bench\u002Fresults.json）\n\n**带过滤的示例：**\n\n```bash\nuv run bench\u002Fexo_bench.py \\\n  --model Llama-3.2-1B-Instruct-4bit \\\n  --pp 128,512 \\\n  --tg 128 \\\n  --max-nodes 2 \\\n  --sharding tensor \\\n  --repeat 3 \\\n  --json-out my-results.json\n```\n\n该工具输出性能指标，包括每种配置的每秒提示词 Token 数（prompt_tps）、每秒生成 Token 数（generation_tps）和峰值内存使用量。\n\n---\n\n## 硬件加速器支持\n\n在 macOS 上，exo 使用 GPU。在 Linux 上，exo 目前运行在 CPU 上。我们正在努力扩展硬件加速器支持。如果你希望支持新的硬件平台，请[搜索现有功能请求](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues)并点赞，以便我们了解社区重视的硬件。\n\n---\n\n## 贡献\n\n请参阅 [CONTRIBUTING.md](CONTRIBUTING.md) 了解如何为 exo 做出贡献的指南。","# exo 快速上手指南\n\n## 环境准备\n\n### 系统要求\n\n| 平台 | 要求 |\n|:---|:---|\n| **macOS** | macOS Tahoe 26.2+（推荐），支持 Apple Silicon |\n| **Linux** | 主流发行版，目前仅支持 CPU 运行 |\n| **硬件** | Thunderbolt 5 设备可启用 RDMA 加速（M4 Pro\u002FMax、M3 Ultra 等）|\n\n### 前置依赖\n\n**macOS 必需：**\n- [Xcode](https:\u002F\u002Fdeveloper.apple.com\u002Fxcode\u002F)（提供 Metal 工具链）\n- [Homebrew](https:\u002F\u002Fbrew.sh\u002F)\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv)（Python 包管理）\n- [Node.js](https:\u002F\u002Fnodejs.org\u002F)（构建仪表盘）\n- [Rust](https:\u002F\u002Frustup.rs\u002F)（nightly 工具链）\n- [macmon](https:\u002F\u002Fgithub.com\u002Fvladkens\u002Fmacmon)（硬件监控，需指定版本）\n\n**Linux 必需：**\n- [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv)\n- [Node.js](https:\u002F\u002Fnodejs.org\u002F) 18+\n- [Rust](https:\u002F\u002Frustup.rs\u002F) nightly\n\n---\n\n## 安装步骤\n\n### macOS 安装\n\n**方式一：使用 Nix（推荐，可跳过大部分步骤）**\n\n```bash\n# 添加 Cachix 缓存配置到 \u002Fetc\u002Fnix\u002Fnix.conf\n# trusted-users = root\n# experimental-features = nix-command flakes\n\n# 重启 Nix 守护进程后运行\nsudo launchctl kickstart -k system\u002Forg.nixos.nix-daemon\nnix run .#exo\n```\n\n**方式二：手动安装**\n\n```bash\n# 1. 安装 Homebrew\n\u002Fbin\u002Fbash -c \"$(curl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002FHomebrew\u002Finstall\u002FHEAD\u002Finstall.sh)\"\n\n# 2. 安装依赖\nbrew install uv node\n\n# 3. 安装 Rust nightly\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n\n# 4. 安装指定版本 macmon（Homebrew 版本在 M5 上不稳定）\ncargo install --git https:\u002F\u002Fgithub.com\u002Fswiftraccoon\u002Fmacmon \\\n  --rev 9154d234f763fbeffdcb4135d0bbbaf80609699b \\\n  macmon --force\n\n# 5. 
克隆并构建\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# 6. 运行\nuv run exo\n```\n\n**方式三：macOS App（最简单）**\n\n下载：[EXO-latest.dmg](https:\u002F\u002Fassets.exolabs.net\u002FEXO-latest.dmg)\n\n> 需要 macOS 26.2+，首次运行需授权修改系统设置和网络配置。\n\n---\n\n### Linux 安装\n\n```bash\n# Ubuntu\u002FDebian 示例\nsudo apt update\nsudo apt install nodejs npm\n\n# 安装 uv\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n\n# 安装 Rust nightly\ncurl --proto '=https' --tlsv1.2 -sSf https:\u002F\u002Fsh.rustup.rs | sh\nrustup toolchain install nightly\n\n# 克隆并构建\ngit clone https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\ncd exo\u002Fdashboard && npm install && npm run build && cd ..\n\n# 运行\nuv run exo\n```\n\n> **注意：** Linux 目前仅支持 CPU 推理，GPU 支持正在开发中。\n\n---\n\n## 基本使用\n\n### 启动服务\n\n```bash\n# 默认启动（自动发现集群设备）\nuv run exo\n\n# 仅作为协调节点（不执行推理）\nuv run exo --no-worker\n```\n\n服务启动后：\n- **仪表盘**：http:\u002F\u002Flocalhost:52415\n- **API 端点**：http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fchat\u002Fcompletions\n\n### 使用 OpenAI 兼容 API\n\n```bash\ncurl http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fchat\u002Fcompletions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"llama-3.1-8b\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]\n  }'\n```\n\n### 启用 RDMA 加速（macOS 26.2+）\n\n1. 关机后按住电源键 10 秒进入恢复模式\n2. 选择「Options」→「Utilities」→「Terminal」\n3. 执行：`rdma_ctl enable`\n4. 重启\n\n> 需满足：所有设备 Thunderbolt 5 直连、TB5 线缆、Mac Studio 避开以太网口旁的 TB 口。\n\n---\n\n## 配置文件位置\n\n| 类型 | 路径 |\n|:---|:---|\n| 配置 | `~\u002F.config\u002Fexo\u002F` |\n| 数据 | `~\u002F.local\u002Fshare\u002Fexo\u002F` |\n| 缓存 | `~\u002F.cache\u002Fexo\u002F` |\n| 日志 | `~\u002F.cache\u002Fexo\u002Fexo_log\u002F` |\n| 自定义模型 | `~\u002F.local\u002Fshare\u002Fexo\u002Fcustom_model_cards\u002F` |","某 AI 创业公司的小团队（3 名算法工程师）正在开发一款法律文档分析产品，需要在本地测试 DeepSeek-R1 671B 大模型的推理效果，以评估是否值得采购云端 API。\n\n### 没有 exo 时\n\n- **硬件成本高昂**：单台能跑 671B 模型的服务器（8×A100 80GB）售价超 20 万元，远超初创公司预算，团队只能租用云端实例，每小时成本约 50 元\n- **测试流程繁琐**：每次调试都需上传敏感法律合同到第三方云服务商，合规审批流程长达 2-3 天，严重拖慢迭代速度\n- **资源严重闲置**：工程师各自配有 M3 Max MacBook Pro（128GB 统一内存），但只能单机运行 70B 小模型，无法验证大模型在真实业务场景的表现\n- **协作效率低下**：三人各自独立测试，结果无法复现，模型性能数据分散在个人电脑中，团队难以形成统一评估结论\n\n### 使用 exo 后\n\n- **零额外硬件投入**：通过 exo 将 3 台 MacBook Pro 自动组建成 AI 集群，利用 RDMA over Thunderbolt 技术实现设备间高速通信，总显存达 384GB，足以本地运行 671B 4-bit 量化模型\n- **数据完全本地化**：敏感法律文档始终留在内网环境，工程师随时启动测试，迭代周期从数天缩短至数小时\n- **算力灵活调度**：exo 的拓扑感知自动并行根据实时负载动态分配任务，单台设备外出办公时，其余两台自动接管，集群持续可用\n- **无缝接入现有工作流**：通过 OpenAI 兼容 API，直接对接团队已搭建的 LangChain 测试框架，无需修改代码即可对比不同模型的输出质量\n\n**核心价值**：exo 让分散的消费级设备变身企业级 AI 算力集群，以零云成本实现大模型的本地化、隐私化、协作化开发。","https:\u002F\u002Foss.gittoolsai.com\u002Fimages\u002Fexo-explore_exo_d29a941f.png","exo-explore","EXO","https:\u002F\u002Foss.gittoolsai.com\u002Favatars\u002Fexo-explore_a5367c66.jpg","Edge 
ML",null,"hello@exolabs.net","exolabs","https:\u002F\u002Fgithub.com\u002Fexo-explore",[85,89,93,97,101,105,109,113,117,121],{"name":86,"color":87,"percentage":88},"Python","#3572A5",64.4,{"name":90,"color":91,"percentage":92},"Svelte","#ff3e00",21.5,{"name":94,"color":95,"percentage":96},"Swift","#F05138",6.5,{"name":98,"color":99,"percentage":100},"TypeScript","#3178c6",4,{"name":102,"color":103,"percentage":104},"Rust","#dea584",1.7,{"name":106,"color":107,"percentage":108},"Nix","#7e7eff",1,{"name":110,"color":111,"percentage":112},"Shell","#89e051",0.7,{"name":114,"color":115,"percentage":116},"CSS","#663399",0.3,{"name":118,"color":119,"percentage":120},"Just","#384d54",0,{"name":122,"color":123,"percentage":120},"JavaScript","#f1e05a",43315,2982,"2026-04-05T11:21:34","Apache-2.0","macOS, Linux","macOS: Apple Silicon (M3 Ultra\u002FM4 Pro\u002FM4 Max等) 推荐，支持RDMA over Thunderbolt 5；Linux: 目前仅CPU运行，GPU支持开发中","未说明",{"notes":132,"python":130,"dependencies":133},"macOS需要Xcode提供Metal ToolChain；支持Nix一键运行；macOS App需要macOS Tahoe 26.2+；RDMA功能需要macOS 26.2+和Thunderbolt 5硬件，需在恢复模式启用；设备自动发现无需手动配置；支持从HuggingFace加载自定义模型；提供OpenAI\u002FClaude\u002FOllama兼容API",[134,135,136,137,138,139],"uv","node>=18","rust (nightly)","mlx","mlx-distributed","macmon (macOS only)",[26,13],37,"2026-03-27T02:49:30.150509","2026-04-06T02:32:42.542067",[145,150,155,160,165,170,175,179],{"id":146,"question_zh":147,"answer_zh":148,"source_url":149},4310,"Exo 项目是否还在维护？","项目仍在积极开发中。1.0 版本已在 main 分支上发布，团队正在清理仓库中的问题，并将尽快发布更正式的版本。欢迎社区贡献。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F819",{"id":151,"question_zh":152,"answer_zh":153,"source_url":154},4311,"Ubuntu 24.04 和 Python 3.12 上遇到 clang 编译错误（-march=native 不支持）怎么办？","这是一个已知的 CPU 兼容性问题，特别是在 Raspberry Pi 5 等设备上。解决方案是应用 tinygrad 的补丁。社区成员已提交 PR #542 修复此问题。临时解决方案：修改 tinygrad 代码以移除或替换 -march=native 编译选项。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F458",{"id":156,"question_zh":157,"answer_zh":158,"source_url":159},4312,"Exo 是否支持 NVIDIA Jetson 设备？","Jetson 支持目前存在问题。主要障碍是 device_capabilities 无法被正确识别。有用户成功在 Jetson AGX Orin 上运行，但性能远低于预期（仅 5 tokens\u002F秒，而硬件应支持更高）。作为替代方案，可以尝试使用 llama.cpp 的 RPC 示例来实现 Jetson 集群，但网络速度会成为瓶颈。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F285",{"id":161,"question_zh":162,"answer_zh":163,"source_url":164},4313,"下载模型时出现 401 错误，提示 HuggingFace 仓库不存在怎么办？","特定模型（如 TriAiExperiments\u002FSFR-Iterative-DPO-LLaMA-3-70B-R）的链接可能已失效或变为私有。此问题在 Exo 1.0 版本中已修复。建议升级到最新版本，或检查模型是否仍可在 HuggingFace 上公开访问。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F728",{"id":166,"question_zh":167,"answer_zh":168,"source_url":169},4314,"Mac Studio M3 Ultra 使用雷雳 5 连接但无法使用 RDMA 怎么办？","这是一个已知问题。M3 Ultra 的雷雳 5 连接理论上支持 80 Gbps，但 Exo 目前无法利用 RDMA。临时解决方案：尝试使用 --disable-tunnel 标志禁用隧道功能，或等待官方修复。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F1050",{"id":171,"question_zh":172,"answer_zh":173,"source_url":174},4315,"Exo 是否支持 AMD GPU（ROCm）或旧矿卡？","项目团队表示更倾向于支持用户已有的设备，而非鼓励购买折扣服务器或矿卡。不过，正在开发的 Vulkan\u002FROCm 支持应该能够兼容这些硬件（如果用户已有）。BC-250 等设备的统一内存架构确实很有吸引力。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F893",{"id":176,"question_zh":177,"answer_zh":178,"source_url":154},4316,"Exo 是否支持纯 CPU 运行？","支持，但性能有限且存在一些技术问题。CPU 推理速度较慢，有用户报告在第一个提示后会出现无限挂起的问题。需要应用特定补丁来解决 clang 编译器兼容性问题。建议仅在无 GPU 可用时使用，且不要对性能有过高期望。",{"id":180,"question_zh":181,"answer_zh":182,"source_url":183},4317,"Exo 
的悬赏计划（Bounty）是否还在进行？","之前的悬赏计划已被关闭，因为大多数悬赏都已过时。团队表示未来可能会重新引入悬赏机制，但需要先建立与开源社区更好的协作关系。目前建议关注官方仓库的活跃开发动态。","https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fissues\u002F238",[185,190,195,200,205,210,215,220,225,230,235,240,245,250],{"id":186,"version":187,"summary_zh":188,"released_at":189},103764,"v1.0.69","# EXO v1.0.69 Release Notes\r\n\r\nThis release ships with continuous batching, Qwen3.5 support and support for M5 Pro\u002FMax chips, as well as a host of quality of life improvements and bug fixes.\r\n\r\n**Continuous batching** is on by default, enabling you to run multiple requests in parallel for significantly higher throughput. EXO will automatically batch together inference requests, on single node and multi-node instances including RDMA instances. This is particularly useful for agentic workflows where multiple agents can run in parallel.\r\n\r\n## Models\r\n\r\n- Add support for Qwen3.5 ([#1644](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1644))\r\n- Add support for Nemotron sharding ([#1693](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1693))\r\n- Add default model cards for DeepSeek v3.2 ([#1769](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1769))\r\n\r\n## API\r\n\r\n- Add POST \u002Fv1\u002Fcancel\u002F{command_id} endpoint for cancelling ongoing text generations ([#1579](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1579))\r\n- Add reasoning params to chat completions and responses APIs ([#1654](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1654))\r\n- Add `repetition_penalty` and `repetition_context_size` to chat completions ([#1665](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1665))\r\n\r\n## Performance\r\n\r\n- Continuous batching ([#1642](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1579), [#1632](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1632), [#1777](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1777))\r\n- Better pipeline parallel prefill that splits the prompt into chunks and overlaps computation and communication. 
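This makes pipeline parallel prefill up to 1.98x faster on 2 nodes ([#1587](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1587), [#1629](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1629))\r\n\r\nAs a rough sketch of what continuous batching enables (this is an illustration, assuming a local node at `http:\u002F\u002Flocalhost:52415` with an instance of the model below already created), concurrent requests can simply be fired in parallel and exo batches them on the same instance:\r\n\r\n```python\r\nimport json\r\nimport urllib.request\r\nfrom concurrent.futures import ThreadPoolExecutor\r\n\r\nURL = \"http:\u002F\u002Flocalhost:52415\u002Fv1\u002Fchat\u002Fcompletions\"\r\n\r\ndef ask(prompt):\r\n    # One OpenAI-style, non-streaming request against the local exo node.\r\n    body = json.dumps({\r\n        \"model\": \"mlx-community\u002FLlama-3.2-1B-Instruct-4bit\",\r\n        \"messages\": [{\"role\": \"user\", \"content\": prompt}],\r\n        \"stream\": False,\r\n    }).encode()\r\n    req = urllib.request.Request(URL, data=body, headers={\"Content-Type\": \"application\u002Fjson\"})\r\n    with urllib.request.urlopen(req) as resp:\r\n        return json.loads(resp.read())\r\n\r\n# With continuous batching on by default, these requests are batched server-side.\r\nwith ThreadPoolExecutor(max_workers=4) as pool:\r\n    results = list(pool.map(ask, [\"Prompt A\", \"Prompt B\", \"Prompt C\", \"Prompt D\"]))\r\n```\r\n\r\n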
This makes pipeline parallel prefill up to 1.98x faster on 2 nodes ([#1587](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1587), [#1629](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1629))\r\n\r\n## Quality of Life\r\n\r\n- Support trace deletion in dashboard ([#1628](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1628))\r\n- Make mini topology sidebar navigate to home on click ([#1616](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1616))\r\n- Show feedback that the model was successfully added when adding custom models from Huggingface ([#1661](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1661))\r\n- Enable global model search from Huggingface, not just `mlx-community` ([#1661](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1661))\r\n- Mobile-friendly UI ([#1677](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1677))\r\n- Include power usage in exo-bench responses, enabling benchmarks to capture energy usage ([#1693](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1693))\r\n- Prefer nodes with more of the model downloaded when placing an instance ([#1767](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1767), [#1795](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1795))\r\n- Sync custom model cards across nodes ([#1768](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1768))\r\n- Add `--bootstrap-peers` and `--libp2p-port` for static peer discovery, bypassing mDNS, which is useful in environments where mDNS is unavailable ([#1690](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1690))\r\n\r\n## Image Generation (Experimental)\r\n\r\n- Update mflux to 0.16.9 ([#1751](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1751))\r\n- Support image generation cancellation ([#1774](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1774))\r\n\r\n## Bug Fixes\r\n\r\n- Upgrade macmon to fix macmon errors on M5 Pro\u002FMax.
This fixes an issue where M5 Pro\u002FMax would not report memory or GPU usage stats and therefore could not participate in clusters ([#1747](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1747), [#1797](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1797))\r\n- Use tmpdir for MLX distributed coordination file, preventing local network access permission issues ([#1624](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1624))\r\n- Fix BrokenResourceError crash when immediately loading a model on start ([#1637](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1637))\r\n- Emit error chunks when a runner crashes in the middle of a request, preventing streams hanging forever when runners crash ([#1645](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1645))\r\n- Fix copy code button not working in dashboard ([#1659](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1659))\r\n- Fix re-downloads so that models can be downloaded again after being deleted via the dashboard ([#1658](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1658))\r\n- Reset download status to `DownloadPending` when a download is cancelled so that the API and dashboard reflect the correct model download status ([#1674](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1674))\r\n- Increase gossipsub message limit to 8MB, fixing requests with very large prompts ([#1671](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1671))\r\n- Clean up stale `state.runners` state when runners shut down ([#1684](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1684))\r\n- Fix emoji rendering in chat responses ([#1691](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1691))\r\n- Fix placement validation for tensor sharding and show an error message with the constraints when no valid placement is found ([#1669](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1669))\r\n- Normalise responses API tool ca","2026-03-27T01:17:37",{"id":191,"version":192,"summary_zh":193,"released_at":194},103765,"v1.0.68","# EXO v1.0.68 Release Notes\r\n\r\nThis is the biggest EXO release to date. We wanted to make sure we address the stability issues users were running into on previous versions and we think we've achieved that with this release. This release also comes with a whole load of new features and UX improvements, full list below. 
Thank you to everyone who submitted bug reports over the past few weeks - it helps us to improve EXO much faster.\r\n\r\n## Models\r\n\r\n- Add support for custom models from Huggingface ([#1368](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1368))\r\n- Add support for Qwen3-Coder-Next ([#1367](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1367))\r\n- Add support for Step 3.5 Flash ([#1460](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1460))\r\n- Add support for GLM 5 ([#1526](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1526)), ([#1529](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1529))\r\n- Add support for MiniMax M2.5 ([#1514](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1514))\r\n\r\n## API\r\n\r\n- Add support for Claude Messages API, enabling tools like Claude Code ([#1167](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1167)) (hedged client sketches for the Claude and Ollama APIs appear after the release list at the end of this document)\r\n- Add support for OpenAI Responses API ([#1167](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1167))\r\n- Add usage and generation stats to API, enabling clients like OpenCode to consume stats including prompt tokens, completion tokens and total tokens ([#1333](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1333)), ([#1461](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1461))\r\n- Cancel text generation when API request is closed ([#1276](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1276))\r\n- Add support for Ollama API ([#1560](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1560))\r\n\r\n## Web Dashboard\r\n\r\n- Add redesigned model picker modal ([#1369](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1369)), ([#1377](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1377)), ([#1440](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1440)), ([#1470](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1470))\r\n- Display alternative tokens \u002F logprobs visualizer in chat responses ([#1180](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1180))\r\n- Redesign downloads page as model x node table ([#1465](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1465)), ([#1589](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1589)), ([#1581](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1581))\r\n- Add prefill progress bar for long prompts ([#1181](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1181)), ([#1557](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1557))\r\n- A new onboarding flow when running EXO for the first time ([#1533](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1533))\r\n- Automatic model selection \u002F model recommendations in web dashboard ([#1590](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1590))\r\n\r\n## Quality of Life\r\n\r\n- Show a more informative message in macOS app when installing network location ([#1309](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1309))\r\n- Migrate model cards to .toml files ([#1354](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1354))\r\n- Clean up exo gracefully on shutdown, so that memory is released
on exit ([#1388](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1388))\r\n- Make topology updates more responsive by yielding from reachability checks instead of waiting for all checks ([#1427](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1427))\r\n- Allow typing in chat input while response is generating ([#1433](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1433))\r\n- Add log rotation, now exo logs get written to `~\u002F.exo\u002Fexo_logs` ([#1438](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1438)), ([#1439](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1439)), ([#1442](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1442))\r\n- Distinguish between model fits in available memory and fits in total memory in model picker ([#1441](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1441) h\u002Ft [@Hmbown](https:\u002F\u002Fgithub.com\u002FHmbown))\r\n- Add `enable_thinking` toggle for models that support thinking\u002Fnon-thinking ([#1457](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1457))\r\n- Show a warning in the web dashboard when macOS versions of nodes in a cluster are incompatible ([#1436](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1436))\r\n- Show macOS version in debug mode on web dashboard\r\n- Add cancellation button and cancel during prefill ([#1540](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1540)), ([#1575](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1575))\r\n- Strip Claude headers to improve prefix cache hit rates ([#1552](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1552))\r\n- Prioritise Thunderbolt for Ring (TCP\u002FIP) instances ([#1556](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1556))\r\n- Show paused downloads with completion % in web dashboard ([#1564](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1564))\r\n- Add support for loading models from arbitrary paths with `EXO_MODELS_PATH` environment variable ([#1574](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1574))\r\n\r\n## Image Generation (Experimental)\r\n\r\n- Add support for non-streaming image generation ([#1328](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1328))\r\n- Add support for parallel classifier-free guidance ","2026-02-25T20:38:50",{"id":196,"version":197,"summary_zh":198,"released_at":199},103766,"v1.0.67","# EXO v1.0.67 Release Notes\r\n\r\nThis release adds support for Kimi-K2.5 and tensor sharding support for MiniMax M2.1.\r\n\r\n## Models\r\n\r\n- Add support for Kimi K2.5 ([#1302](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1302))\r\n- Add tensor sharding support for MiniMax M2.1 ([#1299](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1299))\r\n\r\n**Full Changelog:** https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.66...v1.0.67","2026-01-28T07:09:36",{"id":201,"version":202,"summary_zh":203,"released_at":204},103767,"v1.0.66","# EXO v1.0.66 Release Notes\r\n\r\nThis is a stability release that fixes a regression with RDMA \u002F Tensor Parallelism where models were getting stuck in `LOADING` state. 
It also fixes a download edge case with GLM 4.7 Flash and nodes getting stuck in `UNKNOWN` state \u002F zombie states after periods of inactivity.\r\n\r\nAll models have been confirmed working with RDMA \u002F Tensor Parallel on various configurations (including Mac Minis, MacBooks and Mac Studios). Thank you to users who reported bugs to help us resolve these issues - it helps a lot!\r\n\r\n## Bug Fixes\r\n\r\n- Use EXO shard instead of upstream shard for all models, loading models layer-by-layer, fixing models getting stuck in `LOADING` state e.g. `GLM-4.7-Flash-4bit`, `gpt-oss-120b-MXFP4-Q8` and `Qwen3-Coder-480B-A35B-Instruct-8bit`. Also fixes memory not being released when an instance is deleted. ([#1291](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1291))\r\n- Fix downloads getting stuck when model files change in Huggingface repo e.g. `GLM-4.7-Flash-4bit` which was updated upstream on Jan 25 ([#1290](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1290))\r\n- Always publish info gatherer events, preventing nodes getting stuck in `UNKNOWN` state \u002F zombie states after a period of inactivity ([#1283](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1283))\r\n- Fix tool calls with empty text content ([#1292](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1292))\r\n\r\n## MLX\r\n\r\n- Upgrade `mlx-lm` to `0.30.5`\r\n\r\n**Full Changelog:** https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.65...v1.0.66","2026-01-26T22:41:31",{"id":206,"version":207,"summary_zh":208,"released_at":209},103768,"v1.0.65","# EXO v1.0.65 Release Notes\r\n\r\nThis release ships with stability fixes for RDMA and long-running clusters (fixing the _FAILED_ -> _PREPARING_ loops for some clusters), as well as QoL features for managing downloads and a fix for GPT-OSS tool calling.\r\n\r\n## UX\r\n\r\n- Add download and delete buttons to downloads UI ([#1236](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1236), [#1237](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1237))\r\n\r\n## Bug Fixes\r\n\r\n- Fix parsing logic for GPT-OSS tool calling ([#1271](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1271))\r\n- Fix Thunderbolt bridge cycle detection to include 2-node cycles ([#1261](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1261))\r\n- Fix placement filter to use subset matching instead of exact match ([#1265](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1265))\r\n- Fix instance port assignment, improving stability for RDMA clusters ([#1268](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1268))\r\n- Deprioritise uncertain ethernet devices, improving stability for RDMA clusters ([#1267](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1267))\r\n- Restore Thunderbolt Bridge LaunchDaemon, since TB bridge gets enabled by macOS on reboot automatically ([#1270](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1270))\r\n- Add back the EXO network profile creation to the LaunchDaemon ([#1277](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1277))\r\n\r\n**Full Changelog:** https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.64...v1.0.65","2026-01-24T15:17:08",{"id":211,"version":212,"summary_zh":213,"released_at":214},103769,"v1.0.64","# EXO v1.0.64 Release Notes\r\n\r\nThis
release comes with [support for GLM-4.7-Flash](https:\u002F\u002Fx.com\u002Fexolabs\u002Fstatus\u002F2013458583023698014), IP-less RDMA discovery (removing the need for custom network locations) and OpenAI-compatible tool calling via the API. It also includes bug fixes for auto parallelism, fixing various models including Qwen, GPT-OSS and MiniMax that were getting stuck in LOADING \u002F WARMING UP, as well as better error messages when things go wrong.\r\n\r\n## Model Support\r\n\r\n- Added support for GLM-4.7-Flash ([#1214](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1214))\r\n\r\n## API\r\n\r\n- Added tool calling support to the OpenAI-compatible chat completions API ([#1233](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1233))\r\n\r\n## UX\r\n\r\n- Add proxy and custom SSL certificate support for corporate networks ([#1189](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1189))\r\n- More responsive node info by splitting information sources and state ([#928](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F928), [#1209](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1209))\r\n- Less spammy logs ([#1218](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1218), [#1225](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1225))\r\n- Better error messages when things go wrong ([#1198](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1198), [#1173](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1173), [#1177](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1177))\r\n- Detect Thunderbolt Bridge cycles and show a native prompt to disable Thunderbolt Bridge ([#1222](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1222))\r\n- Add the ability to filter nodes for placement in the dashboard by clicking them ([#1248](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1248))\r\n\r\n## Bug Fixes\r\n\r\n- Fix placement edge cases for heterogeneous devices with lopsided memory availabilities ([#1200](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1200))\r\n- Fix various issues with auto parallel for loading and sharding models ([#1202](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1202), [#1201](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1201), [#1206](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1206), [#1211](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1211), [#1229](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1229), [#1223](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1223))\r\n- Prepend think tags for certain models that include a think tag in their chat template e.g. 
GLM 4.7 ([#1186](https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1186))\r\n\r\n**Full Changelog:** https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.63...v1.0.64","2026-01-23T02:14:59",{"id":216,"version":217,"summary_zh":218,"released_at":219},103770,"v1.0.63","# EXO v1.0.63 Release Notes\r\n\r\nThis release comes with stability improvements for long-running clusters, and support for GLM 4.7 and MiniMax M2.1.\r\n\r\n## Model Support\r\n\r\n- Added support for GLM 4.7 (#1147)\r\n- Added support for MiniMax M2.1 (#1147)\r\n- Tensor sharding for GPT-OSS (#1144)\r\n\r\n## UI\r\n\r\n- Uninstall button in macOS app (#1077)\r\n\r\n## Bug Fixes\r\n\r\n- Fix issues with nodes incorrectly dropping out of topology (#1164, #1170)\r\n- Fix `exo-bench` for transformers 5.x (#1168)\r\n- Fix gibberish responses for GPT-OSS (#1165)\r\n- Increase rlimit to avoid hitting resource limit errors when launching more than 10 instances (#1148)\r\n- Reduce false-positives on local network access warning in macOS app (#1136)\r\n- Use user-provided `seed` for sampling in generation, enabling deterministic responses (#1094)\r\n- Change status shown in Dashboard\u002FmacOS app from `UNKNOWN` to `PREPARING` (#1112)\r\n- Cleaner scrolling in instances box on Dashboard (#1113)\r\n\r\n**Full Changelog:** https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.62...v1.0.63","2026-01-16T17:42:27",{"id":221,"version":222,"summary_zh":223,"released_at":224},103771,"v1.0.62","## What's Changed\r\n* ci: avoid uploading alpha appcasts by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1110\r\n* ci: compute CURRENT_PROJECT_VERSION from semver by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1111\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.61...v1.0.62","2026-01-08T18:53:15",{"id":226,"version":227,"summary_zh":228,"released_at":229},103772,"v1.0.61","This release has everything packaged up from the original release of v1 Exo, and we're really excited for you to use it! There are several small fixes shipping here, and some nice to haves like being able to set `EXO_LIBP2P_NAMESPACE` from the Mac UI.\r\n\r\nThis time around we have a rudimentary changelog, as there's no clear git history between v1.0.60 -> v1.0.62. In future we'll try to be much better about this.\r\n\r\nca680185 Display RDMA debug info in macOS app. (#1072)\r\n383309e2 fmt: add typescript formatting\r\n55463a98 fmt: add swift formatting\r\n56af61fa add a server for distributed testing in \u002Ftests until we work out a stable solution. (#1098)\r\nf76d543d We shouldn't fail on an HTTPException in the tier-2 discovery system. (#1104)\r\nea841aca local network check (#1103)\r\n077b1bc7 exo-bench (Benchmark model pp & tg speed) (#1099)\r\n4963c331 Fix Discord link in README.md. 
Fixes #1096 (#1097)\r\n4f6fcd9e feat(macos-app): add custom namespace UI for cluster isolation\r\n839b67f3 [feat] Add an option to disable the worker (#1091)\r\n47b8e0ce feat: remember last launch settings (model, sharding, instance type) (#1028)\r\n17f9b583  Task Deduplication (#1062)\r\n844bcc7c fix: prevent form submission during IME composition (#1069)\r\nc1be5184 Fix tests broken by 283c (#1063)\r\n1ec550df Emit download progress on start, and change downloads to be keyed by model_id (#1044)\r\n283c0e39  Placement filters for tensor parallel supports_tensor, tensor dimension and pipeline parallel deepseek v3.1 (#1058)\r\n35be4c55 prioritise mlx jaccl coordinator ip (en0 -> en1 -> non-TB5 -> other)\r\n31d4cd84 set KV_CACHE_BITS to None to disable quantized kv cache\r\n8a6da584 remove mx.set_cache_limit\r\n16e2bfd3 log EXO_LIBP2P_NAMESPACE on start\r\nade3ee7e fix warmup order. should be rank!=0 then rank=0\r\nfea42473 Place local node at the top of the dashboard. (#1033)\r\nca7adcc2 Update README.md with instructions to enable RDMA. (#1031)\r\n9d9e24f9 some dashboard updates (#1017)\r\nb5d424b6 placement: generate per-node host lists for MLX ring backend\r\nb4651340 Fix Kimi K2 Thinking download by adding tiktoken.model to download patterns (#1024)\r\neabdcab9 Fix linux docs (#1022)\r\n8e9332d6 Separate out the Runner's behaviour into a \"connect\" phase and a \"load\" phase (#1006)\r\n4b65d5f8 Fix race condition in mlx_distributed_init with concurrent instances (#1012)\r\n1c1792f5 mlx: update to 0.30.1 and align coordinator naming with MLX conventions\r\n9afc1043 exo: handle -c flag for multiprocessing helpers in frozen apps\r\n70c423f5 feat: conform to XDG Base Directory Specification on Linux (#988)\r\na24bdf76 exo: enable multiprocessing support in PyInstaller bundles\r\ne8855959 build-app: add branch trigger from named branch\r\n0a7fe5d9 ci: migrate build-app to github hosted runners\r\n51a5191f format readme (#978)\r\n1efbd263 add architecture.md, move images to docs\u002Fimgs (#968)\r\n02c915a8 pyproject: drop pathlib dependency\r\nfc41bfa1 Add all prerequisites to README (#975)\r\ndd0638b7 pyproject: add pyinstaller to dev-dependencies\r\ne06830ce fix: update macOS app to use correct API port (52415)\r\n1df5079b ci: avoid pushing alpha build as latest\r\n1e75aeb2 Add Prerequisites to Readme (#936)\r\nc582bdd6 bugfix: Handle MacMon errors gracefully\r\n1bae8ebb ci: add build-app workflow\r\nabaeb032 Update README.md. (#956)\r\n7d15fbda readme tweaks5 (#954)\r\n4a6e0fe1 Update README.md. (#949)\r\nf4792dce fix(downloads): use certifi for robust SSL certificate verification (#941)\r\na1b14a27 Extend eos_token_id fix for other models (#938)\r\nf8483cfc Update README.md. 
(#932)\r\n8bafd6fe Update README.md (#925)\r\nf16afd72 nix: get rust build working on linux\r\n4da00432 Update README.md (#917)\r\n9e2bdeef LICENSE: Fix company name\u002Fyear\r\n379744fe exo: open source mac app and build process\r\n74bae3ba Update README.md\r\n9815283a 8000 -> 52415 (#915)\r\n5bd39e84 Merge pull request #914 from exo-explore\u002Fremove-old-cli-flag\r\n658cf5cc remove tb_only from master\r\n170d2dcb Add Windows as a potential planned platform\r\nba66f142 Merge pull request #912 from exo-explore\u002Fupdate-dashboard-error-message\r\n274e35f9 update readme\r\n3fe7bd25 update error message\r\n004fea69 clarify platform support\r\n5c2d254f add platform support information\r\n19ca48c4 more readme fixups\r\n57d38136 re-add LICENSE\r\n7cd1527c update CONTRIBUTING\r\n423c066e Merge pull request #906 from exo-explore\u002Fjj\u002Fsluxkvlmwons\r\nebf0e18c re-add logos\r\n28a6151b remove discord link from README\r\n2c16e00b github docs","2026-01-08T17:22:57",{"id":231,"version":232,"summary_zh":233,"released_at":234},103773,"v1.0.61-alpha.2","## What's Changed\r\n* local network check by @samiamjidkhan in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1103\r\n* We shouldn't fail on an HTTPException in the tier-2 discovery system. by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1104\r\n* add a server for distributed testing in \u002Ftests until we work out a stable solution. by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1098\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.61-alpha.1...v1.0.61-alpha.2","2026-01-08T12:51:00",{"id":236,"version":237,"summary_zh":238,"released_at":239},103774,"v1.0.61-alpha.1","## What's Changed\r\n* Fix Discord link in README.md. 
Fixes #1096 by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1097\r\n* exo-bench (Benchmark model pp & tg speed) by @rltakashige in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1099\r\n\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.61-alpha.0...v1.0.61-alpha.1","2026-01-06T17:43:13",{"id":241,"version":242,"summary_zh":243,"released_at":244},103775,"v1.0.61-alpha.0","## What's Changed\r\n* pyproject: add pyinstaller to dev-dependencies by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F972\r\n* Add all prerequisites to README by @rltakashige in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F975\r\n* pyproject: drop pathlib dependency by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F973\r\n* add architecture.md, move images to docs\u002Fimgs by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F968\r\n* format readme by @rltakashige in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F978\r\n* ci: migrate build-app to github hosted runners by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F981\r\n* build-app: add branch trigger from named branch by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F990\r\n* exo: enable multiprocessing support in PyInstaller bundles by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F989\r\n* feat: conform to XDG Base Directory Specification on Linux by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F988\r\n* exo: handle -c flag for multiprocessing helpers in frozen apps by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F995\r\n* mlx: update to 0.30.1 and align coordinator naming with MLX conventions by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F997\r\n* Fix race condition in mlx_distributed_init with concurrent instances by @heathdutton in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1012\r\n* Separate out the Runner's behaviour into a \"connect\" phase and a \"load\" phase by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1006\r\n* Fix linux docs by @MatiwosKebede in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1022\r\n* Fix Kimi K2 Thinking download by adding tiktoken.model to download patterns by @Drifter4242 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1024\r\n* placement: generate per-node host lists for MLX ring backend by @JakeHillion in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1000\r\n* some dashboard updates by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1017\r\n* Update README.md with instructions to enable RDMA. by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1031\r\n* Place local node at the top of the dashboard. by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1033\r\n* fix warmup order. 
should be rank!=0 then rank=0 by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1046\r\n* log EXO_LIBP2P_NAMESPACE on start by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1045\r\n* Remove mx.set_cache_limit by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1057\r\n* set KV_CACHE_BITS to None to disable quantized kv cache by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1056\r\n* Prioritise mlx jaccl coordinator ip (en0 -> en1 -> non-TB5 -> other) by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1059\r\n* Placement filters for tensor parallel supports_tensor, tensor dimension and pipeline parallel deepseek v3.1 8-bit by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1058\r\n* Emit download progress on start, and change downloads to be keyed by model_id by @AlexCheema in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1044\r\n* Fix tests broken by 283c by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1063\r\n* fix: prevent form submission during IME composition by @rickychen-infinirc in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1069\r\n* Task Deduplication by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1062\r\n* feat: remember last launch settings (model, sharding, instance type) by @Drifter4242 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1028\r\n* [feat] Add an option to disable the worker by @Evanev7 in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1091\r\n* feat(macos-app): add custom namespace UI for cluster isolation by @madanlalit in https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fpull\u002F1003\r\n\r\n**Full Changelog**: https:\u002F\u002Fgithub.com\u002Fexo-explore\u002Fexo\u002Fcompare\u002Fv1.0.60-alpha.1...v1.0.61-alpha.0","2026-01-05T17:54:41",{"id":246,"version":247,"summary_zh":248,"released_at":249},103776,"v1.0.60-alpha0","First release testing the full publish workflow from the OSS repo.","2025-12-22T14:17:24",{"id":251,"version":252,"summary_zh":253,"released_at":254},103777,"v1.0.60-alpha.1","Iterating on release process.","2025-12-22T15:20:29"]
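Hedged usage sketches for the API features referenced in the release notes above follow. First, the `repetition_penalty` and `repetition_context_size` parameters that v1.0.69 adds to chat completions (#1665): a minimal sketch against exo's OpenAI-compatible endpoint, assuming a local node on the default API port 52415 (the port referenced in the v1.0.61 changelog), the standard `/v1/chat/completions` route, and a placeholder model id.

```python
# Minimal sketch: exo's OpenAI-compatible chat completions with the
# repetition-penalty parameters added in v1.0.69 (#1665).
# Assumptions: a local exo node on the default API port 52415, the standard
# /v1/chat/completions route, and a placeholder model id.
import requests

BASE_URL = "http://localhost:52415"  # assumed local node

payload = {
    "model": "llama-3.2-3b",          # placeholder; use a model you have downloaded
    "messages": [{"role": "user", "content": "Summarise exo in one sentence."}],
    "repetition_penalty": 1.1,        # values > 1.0 discourage repeated tokens
    "repetition_context_size": 256,   # how many recent tokens the penalty considers
    "stream": False,
}

resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```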
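v1.0.69 also adds `POST /v1/cancel/{command_id}` for cancelling ongoing text generations (#1579). A sketch under the same assumptions; the notes above do not say how a client obtains the command id, so the value below is purely illustrative.

```python
# Hedged sketch: cancelling an in-flight generation via the endpoint added in
# v1.0.69 (#1579). The command id is a hypothetical placeholder; obtain a real
# one from wherever your client surfaces it (the release notes do not specify).
import requests

BASE_URL = "http://localhost:52415"   # assumed local node
command_id = "example-command-id"     # illustrative only

resp = requests.post(f"{BASE_URL}/v1/cancel/{command_id}", timeout=10)
print(resp.status_code)  # expect a success status if the generation was still running
```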
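The v1.0.68 notes add Claude Messages API support (#1167), which is what lets tools like Claude Code point at exo. A sketch using the `anthropic` Python SDK; it presumes exo serves the Messages API at the standard path under the same port, and the dummy key and model id are assumptions.

```python
# Hedged sketch: the anthropic SDK against exo's Claude-compatible Messages API
# (added in v1.0.68, #1167). Assumes the standard /v1/messages path under the
# default port; the key is a dummy and the model id is a placeholder.
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:52415",  # assumed local exo node
    api_key="exo",                      # dummy value; a real Anthropic key should not be needed
)

msg = client.messages.create(
    model="llama-3.2-3b",               # placeholder model id
    max_tokens=256,
    messages=[{"role": "user", "content": "What hardware am I pooling with exo?"}],
)
print(msg.content[0].text)
```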
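v1.0.68 also adds Ollama API support (#1560). The notes do not spell out the routes, so this assumes exo mirrors Ollama's usual `/api/chat` endpoint and response shape; a minimal non-streaming call:

```python
# Hedged sketch: exo's Ollama-compatible API (added in v1.0.68, #1560).
# Assumes exo mirrors Ollama's standard /api/chat route and response shape;
# the model id is a placeholder.
import requests

resp = requests.post(
    "http://localhost:52415/api/chat",  # assumed route on the default port
    json={
        "model": "llama-3.2-3b",        # placeholder
        "messages": [{"role": "user", "content": "ping"}],
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])  # Ollama-style response shape (assumed)
```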
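Finally, v1.0.64 adds tool calling to the OpenAI-compatible chat completions API (#1233). A sketch using the standard OpenAI `tools` schema with a hypothetical `get_weather` function; the same base-URL and model-id assumptions apply.

```python
# Hedged sketch: OpenAI-style tool calling, added to exo's chat completions API
# in v1.0.64 (#1233). The get_weather tool is hypothetical; endpoint, port and
# model id are assumptions as in the sketches above.
import json
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = requests.post(
    "http://localhost:52415/v1/chat/completions",
    json={
        "model": "llama-3.2-3b",  # placeholder
        "messages": [{"role": "user", "content": "What's the weather in London?"}],
        "tools": tools,
    },
    timeout=300,
)
resp.raise_for_status()
for call in resp.json()["choices"][0]["message"].get("tool_calls") or []:
    # in the OpenAI schema, arguments arrive as a JSON-encoded string
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```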