Oasis is the local AI engine built into webAI. It runs large language models directly on your hardware using WebGPU, MLX, or llama.cpp, so your conversations, data, and context never leave your machine.

Why Oasis?

  • Private by design. Your conversations and data never leave your machine. This isn’t a policy promise; it’s an architectural guarantee. There’s nothing to breach because there’s nothing sent.
  • Works without the internet. Oasis runs AI models locally using your device’s GPU (via WebGPU), so it works offline, on a plane, or anywhere else.
  • Nothing to install. Oasis is browser-native. If your browser supports WebGPU, you’re ready. No additional downloads or dependencies.
  • Adapts to you. Switch between personas — research, creative, builder — without reloading. One runtime, many contexts.
  • Can take action. Oasis isn’t just a chatbot. Its agent system can plan multi-step tasks, use tools, automate browser actions, and execute workflows — all locally.
The big idea: intelligence should be a property of your device and a resource of your network — not a service rented from the cloud. Oasis makes that real.
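The agent behavior described above can be pictured as a plan/act/observe loop. The actual Oasis agent API is not shown in this overview, so the `Tool` shape and `runAgent` function below are illustrative assumptions, not the real implementation:

```typescript
// Minimal sketch of an agent executing a multi-step plan with tools.
// Every tool runs locally, matching the "all locally" guarantee above.
interface Tool {
  name: string;
  run(input: string): string;
}

function runAgent(
  steps: { tool: string; input: string }[],
  tools: Tool[]
): string[] {
  const byName = new Map(tools.map((t) => [t.name, t] as [string, Tool]));
  const observations: string[] = [];
  for (const step of steps) {
    // Execute the plan step by step; each observation can inform the next.
    const tool = byName.get(step.tool);
    if (!tool) throw new Error(`No such tool: ${step.tool}`);
    observations.push(tool.run(step.input));
  }
  return observations;
}
```

A real agent would re-plan between steps based on observations; this sketch only shows the tool-dispatch skeleton.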

What you can do with Oasis

Have a private conversation

Chat with an AI that runs entirely on your hardware. Stream responses in real time, see reasoning traces, and adjust how the model thinks (temperature, sampling) without any data leaving your machine.
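For intuition on what the temperature control does, here is a standalone sketch of temperature-scaled softmax. This is just the underlying math, not Oasis's actual sampler:

```typescript
// Convert raw model logits to probabilities, scaled by temperature.
// Lower temperature (< 1) sharpens the distribution toward the top
// token; higher temperature (> 1) flattens it, increasing variety.
function softmaxWithTemperature(
  logits: number[],
  temperature: number // must be > 0
): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```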

Create and switch personas

Define custom AI behaviors — a research assistant, a writing partner, a math tutor — each with its own system prompt, tool permissions, and fine-tuned adapters. Switch between them mid-conversation. See Personas for details.
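A persona can be thought of as a small record bundling prompt, permissions, and adapters. The shapes below are illustrative assumptions, not the actual Oasis persona format:

```typescript
// Hypothetical persona record: what "custom AI behavior" bundles together.
interface Persona {
  name: string;
  systemPrompt: string;
  toolPermissions: string[];
  adapter?: string; // optional fine-tuned adapter id
}

const personas: Record<string, Persona> = {
  research: {
    name: "Research Assistant",
    systemPrompt: "Answer with sources and explicit caveats.",
    toolPermissions: ["web-search", "file-read"],
  },
  creative: {
    name: "Writing Partner",
    systemPrompt: "Brainstorm freely and refine prose.",
    toolPermissions: [],
  },
};

// Switching mid-conversation swaps the active prompt and permissions
// while the conversation history itself is retained.
function switchPersona(active: { persona: Persona }, key: string): void {
  const next = personas[key];
  if (!next) throw new Error(`Unknown persona: ${key}`);
  active.persona = next;
}
```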

Get rich, structured output

Oasis doesn’t just return text. It can render charts, comparison tables, code blocks, data cards, stat grids, and more — directly in the conversation.
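One way to model such structured blocks is as a tagged union. The `StructuredOutput` type below is an assumed shape for illustration, not the real Oasis render schema:

```typescript
// Each response block carries a "kind" tag telling the UI how to render it.
type StructuredOutput =
  | { kind: "text"; content: string }
  | { kind: "code"; language: string; content: string }
  | { kind: "table"; headers: string[]; rows: string[][] }
  | { kind: "chart"; series: number[] };

// Toy renderer: a real UI would draw components, not strings.
function render(block: StructuredOutput): string {
  switch (block.kind) {
    case "text":
      return block.content;
    case "code":
      return `[${block.language}] ${block.content}`;
    case "table":
      return [
        block.headers.join(" | "),
        ...block.rows.map((r) => r.join(" | ")),
      ].join("\n");
    case "chart":
      return `chart(${block.series.length} points)`;
  }
}
```

The discriminated union makes the renderer exhaustive: adding a new block kind forces every consumer to handle it.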

Choose your model and backend

Load models from HuggingFace (0.5B to 32B parameters), pick your inference backend (WebGPU, MLX, or llama.cpp), and attach adapters — all from a settings panel, no command line required.
The WebGPU backend offers browser-native GPU inference and works on any platform with a supported browser (Chrome 113+, Edge 113+). It supports models up to 32B parameters depending on available GPU memory.
The system automatically selects the best backend for your hardware, or you can choose manually. See On-Device AI for the full breakdown of supported models, tiers, and hardware routing.
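The automatic selection can be pictured as a simple routing function. The heuristic below is an assumption about how such routing might work, not Oasis's actual logic:

```typescript
type Backend = "webgpu" | "mlx" | "llama.cpp";

interface HardwareInfo {
  hasWebGPU: boolean; // in a browser, roughly navigator.gpu !== undefined
  isAppleSilicon: boolean;
}

// Assumed preference order: browser-native GPU first, then the
// Apple-optimized MLX path, with llama.cpp as the portable fallback.
function pickBackend(hw: HardwareInfo): Backend {
  if (hw.hasWebGPU) return "webgpu";
  if (hw.isAppleSilicon) return "mlx";
  return "llama.cpp";
}
```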

Supported models

Oasis supports models ranging from 0.5B to 32B parameters across all backends. A few highlights:
Model        Size     Best for
Qwen3 1.7B   Small    Quick answers, everyday use on most devices
Qwen3 4B     Medium   Balanced performance and capability
Qwen3 8B     Medium   Research, writing, and code
Qwen3 14B    Large    Detailed reasoning and analysis
Qwen3 32B    Large    Complex reasoning, multi-step tasks

Choosing a model

Not sure which model to pick? See the practical guide to choosing the right model for your hardware.
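As a rough rule of thumb for matching model size to available GPU memory, weight memory is roughly parameters times bytes per parameter. The 20% overhead factor below (for KV cache and activations) is an assumption, not an Oasis-published figure:

```typescript
// Estimate GPU memory needed to run a model, in GB.
// paramsBillions: model size (e.g. 8 for an 8B model)
// bytesPerParam: precision (2 for fp16, ~0.5 for 4-bit quantization)
function estimateVramGB(paramsBillions: number, bytesPerParam: number): number {
  const weightsGB = paramsBillions * bytesPerParam; // 1B params × 1 byte ≈ 1 GB
  return weightsGB * 1.2; // assumed ~20% overhead for KV cache / activations
}
```

For example, an 8B model at 4-bit quantization needs roughly `estimateVramGB(8, 0.5)` ≈ 4.8 GB, while 32B at the same precision needs around 19 GB, which is why larger models require more GPU memory headroom.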