Running OpenAI Codex locally for free through Ollama
Yesterday the news spread quickly through the AI community: with Ollama v0.24 and later, you can run the full OpenAI Codex (both the desktop app and the CLI) entirely locally, for free, with no API keys, no rate limits, and without sending code to OpenAI's servers.
This is a real breakthrough for developers who want privacy, unlimited use, and a top-tier coding-agent experience on their own machines.
What is Codex and why is it important
Codex from OpenAI has long been considered one of the most powerful coding agents: it handled code review, browser-based editing, review mode, and long-running tasks well. But all of it was tied to a paid API with usage limits.
Now the situation has changed. Ollama added native support for Codex App and Codex CLI, allowing them to run with open models (Gemma 4, Qwen 3.6, DeepSeek V4 and others), which in many respects rival or even surpass the original experience.
Key advantages of the local version:
- Privacy: the code never leaves your machine.
- Zero cost after initial setup.
- Unlimited use - no rate limits.
- Multimodality and tool-calling at the frontier model level.
- Full support for the built-in browser and review mode in the desktop application.
Step-by-step setup
1. Codex App (Desktop) - the most convenient option
- Update Ollama to 0.24 or later.
- Run the appropriate command for integration (the exact command is given in the original post).
- Select a model on first launch:
  - gemma4:31b: a great all-rounder with strong reasoning.
  - qwen3.6: the top performer for coding.
  - deepseek-v4: for complex tasks.
  - kimi-k2.6:cloud: if you need strong vision support.
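Before going further, it can help to confirm that the installed Ollama actually meets the 0.24 requirement from the first step. The snippet below is a minimal sketch, not an official check: it assumes the `ollama` binary is on your PATH and that `ollama --version` prints a dotted version number somewhere in its output. The helper names `parse_version` and `meets_minimum` are made up for this example.

```python
import re
import subprocess

MIN_VERSION = (0, 24)  # minimum Ollama version named in the post


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum=MIN_VERSION):
    """True if the version found in `text` is at least `minimum`."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    try:
        # Assumption: `ollama` is installed and available on PATH.
        result = subprocess.run(
            ["ollama", "--version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        print("Ollama does not seem to be installed (binary not found on PATH).")
    else:
        output = result.stdout + result.stderr
        if meets_minimum(output):
            print("Ollama is new enough for the Codex integration.")
        else:
            print("Ollama is older than 0.24; update it first.")
```

Versions are compared as tuples of integers, so 0.24 correctly ranks above 0.9, which a plain string comparison would get wrong.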
Functions that work locally:
- Built-in browser (visual editing of sites through annotations directly on the page).
- Review mode (leave comments, iteratively improve the code).
- Full chat interface within the application.
2. Codex CLI – for terminal lovers
# Launch with the open-source flag
codex --oss
# With a specific model
codex --oss -m gemma4
codex --oss -m qwen3.6
codex --oss -m deepseek-v4
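If you switch between machines with different models installed, a small launcher can pick whichever recommended model is already available and start the CLI with it. This is a rough sketch under several assumptions: Ollama is serving on its default local address, the model tags are the ones named in this post (not verified against the Ollama library), the preference order is my own reading of the recommendations, and passing a full tag such as gemma4:31b to `-m` works the same way as the short names above. The helper names are hypothetical.

```python
import json
import subprocess
import urllib.request

# Model names as written in the post; the ordering is an assumption.
PREFERRED = ["qwen3.6", "gemma4", "deepseek-v4"]
TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local endpoint


def installed_models(url=TAGS_URL):
    """Return the names of locally installed models reported by Ollama."""
    with urllib.request.urlopen(url, timeout=5) as response:
        payload = json.load(response)
    return [entry["name"] for entry in payload.get("models", [])]


def pick_model(installed, preferred=PREFERRED):
    """Return the first preferred model that is installed, else None.

    A preference such as "gemma4" matches installed names like
    "gemma4:31b" or "gemma4:latest".
    """
    for wanted in preferred:
        for name in installed:
            if name == wanted or name.split(":", 1)[0] == wanted:
                return name
    return None


def codex_command(model):
    """Build the CLI invocation shown in the post for a given model."""
    return ["codex", "--oss", "-m", model]


if __name__ == "__main__":
    model = pick_model(installed_models())
    if model is None:
        print("None of the preferred models are installed; pull one with `ollama pull` first.")
    else:
        # Assumption: the `codex` binary is installed and on PATH.
        subprocess.run(codex_command(model))
```

Keeping the selection logic in plain functions means you can change the preference list without touching the part that talks to Ollama.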
Recommendations on models and hardware
Before downloading large models, check compatibility at whatmodelscanirun.com; the service shows which models your graphics card and VRAM can handle.
Current top recommendations (May 2026):
- Gemma 4 31B: a balance of quality and speed.
- Qwen 3.6: the best choice for most coding tasks.
- DeepSeek V4: when you need maximum capability on complex projects.
Practical uses
The built-in Codex App browser lets you open local servers and live sites, annotate elements directly on the page, and ask the model to make changes. This greatly simplifies frontend development and prototyping.
Review mode lets you inspect proposed code changes, leave comments, and refine iteratively without leaving the application, much like a senior developer's review workflow.
Comparison with cloud alternatives (Claude Opus 4.8, GPT-5.5, etc.)
While Anthropic is releasing Opus 4.8 with a focus on honesty, dynamic workflows and agentic reliability, local Codex offers another advantage: complete independence.
When to choose local Codex:
- Sensitive projects (corporate code, personal development).
- Constant work without limits.
- Testing hypotheses at no cost.
- Machines with a powerful GPU (RTX 4090 and above are recommended for larger models).
When the cloud is better (Claude Opus 4.8 / GPT-5.5):
- The most complex multi-agent tasks.
- Large context windows and up-to-date knowledge.
- When local hardware falls short.
Many developers now use a hybrid approach: Claude Opus 4.8 or GPT-5.5 for high-level planning and research, and local Codex for private implementation and iterations.
Edge cases and nuances
- Performance depends heavily on hardware. Models of 31B parameters and up need roughly 24–40+ GB of VRAM to run comfortably.
- The quality of the open model may be slightly inferior to the original Codex on the most difficult reasoning tasks, but the difference is rapidly shrinking.
- Ollama updates come out often; keep an eye on new releases.
- Support differs across Windows, macOS, and Linux; check the documentation.
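To get a feel for where a VRAM figure like the one in the first point comes from, here is a back-of-the-envelope estimate: weight memory is roughly parameter count times bits per weight divided by 8, plus some headroom for the KV cache and runtime buffers. This is a simplified sketch; the 20% overhead is an illustrative assumption rather than a measured value, and real usage depends on context length, quantization format, and the runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead=0.2):
    """Rough VRAM estimate in GB: weight memory plus a fractional overhead.

    `overhead` stands in for KV cache and runtime buffers; 0.2 is an
    illustrative assumption, not a measured value.
    """
    # Billions of parameters times bytes per parameter gives gigabytes.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * (1 + overhead)


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(f"31B at {bits}-bit: ~{estimate_vram_gb(31, bits):.1f} GB")
```

Under these assumptions a 31B model comes to about 18.6 GB at 4-bit and about 37.2 GB at 8-bit, which is roughly consistent with the 24–40+ GB range mentioned above.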
Conclusion: A New Era of Local AI Development
Ollama v0.24+ has effectively democratized access to powerful coding agents. What used to cost tens to hundreds of dollars a month through the API now runs for free and privately on your own PC.
This is especially timely against the backdrop of the release of Claude Opus 4.8 – developers get a choice: the power and reliability of cloud frontier models or the freedom and privacy of local solutions.
We recommend trying this today. For many, it will become the main working tool in 2026.
Watch the full setup video guide: YouTube
Source: post by @intheworldofai, official Ollama documentation, community reviews from May 2026.