Native integration — autoyou_lib

Your local AI models. From your phone.

AutoYou integrates natively with Ollama, the local AI model runner. Install Ollama on your computer, pull any model, and start AutoYou. Your phone's chat routes to your local models through the encrypted peer-to-peer connection — no API key, no cloud, no cost per token. Every word stays on your hardware.

No API key · No cloud · LLaMA · Mistral · Gemma · Qwen · Auto-detection · Complete privacy · Free

Why this matters

The AI on your phone should run on hardware you already own.

Every time you send a message to a cloud AI, your words travel to a third-party server, get processed there, and a response comes back. Ollama changes that: the model runs locally on your machine, and AutoYou carries your messages to it from your phone over an encrypted peer-to-peer connection that never touches a third-party server.

Privacy

Your prompts never leave your hardware.

When you chat with an Ollama model through AutoYou, your message travels from your phone over the encrypted WebRTC DataChannel to your computer. The model processes it locally. The response comes back the same way. No message ever reaches a cloud AI provider. There's nothing to log, nothing to sell, and nothing to breach.

Cost

Zero per-token cost. Forever.

Ollama models run on your CPU or GPU — the hardware you already paid for. AutoYou is free. There are no API keys to purchase, no subscription tiers that meter your usage, and no bill at the end of the month based on how much you talked to your AI. Run as many messages as you like.

Choice

Hundreds of models. All yours.

Ollama's model library includes hundreds of open-source models from leading research labs — Meta, Mistral AI, Google DeepMind, Alibaba, Microsoft, and many more. Pull any of them with a single command. AutoYou detects what's installed and makes it available for chat immediately.

Setup

Three steps to local AI on your phone.

No configuration files. No API keys to hunt down. Just install Ollama, pull a model, and start AutoYou.

1. Install Ollama and pull a model

Download Ollama from ollama.com and install it on your Mac, Windows, or Linux computer. Then pull any model you want:

ollama pull llama3.2
# or
ollama pull mistral
# or any model from ollama.com/library

Ollama starts a local server on port 11434 and makes your pulled models available immediately.

2. Install and start AutoYou

pip install autoyou-lib
python -m autoyou_lib

AutoYou detects your running Ollama instance automatically on startup. No environment variables needed for a standard setup. Your installed models are discovered and made available for routing.

3. Pair your phone and start chatting

Pair the AutoYou mobile app (or desktop client) to your computer using the standard OTP or AutoPair flow. Open the Chat tab. Your messages route to your local Ollama model and responses stream back in real time.

Not sure how to pair? See the full setup guide.

What you can run

Compatible models. A few highlights.

Any model available in the Ollama library works with AutoYou. Here are some popular choices — the full list is at ollama.com/library.

Meta

LLaMA 3 family

Meta's open-weight LLaMA models are among the most capable locally runnable models available. LLaMA 3.1 and 3.2 come in sizes from 1B to 70B+ parameters — pick the size that fits your hardware.

ollama pull llama3.2
ollama pull llama3.1:70b

Mistral AI

Mistral & Mixtral

Mistral's models punch above their weight for their size. Mistral 7B and the Mixtral mixture-of-experts models offer strong instruction-following and reasoning on modest hardware.

ollama pull mistral
ollama pull mixtral

Google

Gemma 2

Google's Gemma 2 models are compact and efficient, well-suited for machines with limited GPU memory. The 2B and 9B variants are especially popular for fast, responsive local chat.

ollama pull gemma2
ollama pull gemma2:2b

Alibaba

Qwen 2.5

Alibaba's Qwen series offers strong multilingual capabilities alongside solid code and general reasoning. Available in sizes from 0.5B to 72B, with specialist coding and math variants.

ollama pull qwen2.5
ollama pull qwen2.5-coder

DeepSeek

DeepSeek R1

DeepSeek's R1 model offers chain-of-thought reasoning comparable to much larger models. Smaller distilled variants make it practical to run on consumer hardware.

ollama pull deepseek-r1
ollama pull deepseek-r1:7b

Microsoft

Phi-4

Microsoft's Phi series is engineered for efficiency. Phi-4 delivers strong performance for its size, making it an excellent choice for machines where memory is tight but you still want capable reasoning.

ollama pull phi4
ollama pull phi4-mini

Any model in the Ollama library works. The list above is a small sample. AutoYou detects all installed models automatically — there's nothing to configure when you add a new one.

How to integrate

Two ways to use Ollama with AutoYou.

Ollama works with AutoYou whether you're running the lightweight autoyou_lib server on its own or as part of a full OpenClaw AI gateway setup.

Direct — autoyou_lib

pip install autoyou-lib. That's it.

autoyou_lib is the lightweight AutoYou Python server. When it starts, it checks for a running Ollama instance on the default port and automatically routes chat to your installed models. Nothing extra to configure — install the library, start it, pair your phone, and your local AI is ready.

This is the simplest path. You get Ollama chat from your phone with two commands and no accounts.

pip install autoyou-lib
python -m autoyou_lib
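Under the hood, routing a chat turn to Ollama amounts to streaming from its local REST API. This is a hedged sketch of that step against Ollama's documented `/api/chat` endpoint — not autoyou_lib's actual internals, whose code is not shown here (function names are illustrative):

```python
import json
import urllib.request

def assemble_stream(lines) -> str:
    """Join streamed /api/chat chunks into one reply string.

    Each line is a JSON object like {"message": {"content": "..."}, "done": false};
    the final chunk has "done": true.
    """
    parts = []
    for raw in lines:
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

def chat(model: str, prompt: str, base_url: str = "http://localhost:11434") -> str:
    """Send one prompt to a local Ollama model and return the full reply."""
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return assemble_stream(resp)  # Ollama streams newline-delimited JSON
```

Everything in this exchange happens on localhost: the only network hop is the encrypted peer-to-peer link back to your phone.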

Via OpenClaw

Ollama as one option among many.

If you're already running OpenClaw, you can add Ollama as one of its model providers — alongside Claude, GPT-4, Gemini, and others. The AutoYou plugin routes your phone's messages to whichever model OpenClaw currently has selected. Switch between a local Ollama model and a cloud model in OpenClaw without changing anything in the phone app.

This is the more powerful path: the full OpenClaw AI pipeline, with Ollama available as a local, private, zero-cost option whenever you want it.

Complete end-to-end privacy. Your message leaves your phone over the encrypted WebRTC DataChannel and arrives at your computer. Ollama processes it locally. The response returns the same way. AutoYou's servers are not in this path at any point. Neither is any AI provider's cloud. The only machines involved are the ones you own.

Works with the desktop client too.

The AutoYou desktop client for macOS and Windows connects to your AutoYou server the same way the mobile app does. If your server is routing to an Ollama model, the desktop client chats with it just as seamlessly. The experience is identical: real-time streaming, typing indicators, conversation history, and the same controls you get on mobile.

Running the desktop client on the same machine as your Ollama server means your chat never leaves your computer at all — the connection is localhost to localhost, tunneled through the local WebRTC stack. Fully offline capable once paired.

Get started

Local AI. From your phone. Right now.

Install Ollama and pull a model, install autoyou-lib, and pair your phone. Three steps, and your local AI is available from anywhere: privately, securely, and for free.