Atlas Guide · Browser local AI

How browser-based local LLMs work

Browser AI is powerful, but it needs careful UX: manual model loading, visible progress states, fallback messages, and claims that are worded as device-dependent.

The simple explanation

A browser-local LLM downloads model files, caches them in the browser, and runs inference on the user's device. WebGPU can accelerate the computation when both the browser and the hardware support it.
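
As a minimal sketch of the support check, assuming the standard `navigator.gpu.requestAdapter()` WebGPU entry point, the function below reports whether acceleration looks available. The `NavigatorLike` and `GpuLike` types and the `detectWebGpu` name are illustrative and not part of any library.

```typescript
// Minimal structural types so the sketch compiles without extra WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "unsupported" | "no-adapter" | "available";

// Hypothetical helper: classifies WebGPU availability for display in the UI.
async function detectWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  // Browsers without WebGPU do not expose `navigator.gpu` at all.
  if (!nav.gpu) return "unsupported";
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "available" : "no-adapter";
  } catch {
    // Treat an unexpected rejection the same as having no usable adapter.
    return "no-adapter";
  }
}
```

In a page, this would be called with the global `navigator`. A `no-adapter` result usually means the browser exposes the API but no usable GPU was found on that device.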

What users must know

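The first run can be slow because the model has to be downloaded and initialized. A model download can be large, and some mobile devices may fail to load or run a model at all. During a local run the prompt is not sent to Bluesky Labs servers, but the model files themselves may come from external model hosts.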

UX requirements

Never auto-download a model on page load. Show the WebGPU support status before offering a download. Provide progress, error, and reset states, plus a no-model fallback such as a rule-based prompt checklist.
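
These requirements can be sketched as a small state machine in which a download can only start from an explicit user action. This is an illustration, not the Local Prompt Tester's actual code: the state and event names are hypothetical, and it assumes the chosen model path requires WebGPU, so a failed support check leads straight to the checklist fallback.

```typescript
type LoaderState =
  | { kind: "idle" }
  | { kind: "unsupported"; fallback: "checklist" }
  | { kind: "ready-to-load" }
  | { kind: "loading"; progress: number }
  | { kind: "ready" }
  | { kind: "error"; message: string };

type LoaderEvent =
  | { type: "support-checked"; webgpu: boolean }
  | { type: "user-clicked-load" }
  | { type: "progress"; fraction: number }
  | { type: "loaded" }
  | { type: "failed"; message: string }
  | { type: "reset" };

// Pure transition function: events that are invalid for the current
// state leave the state unchanged.
function nextState(state: LoaderState, event: LoaderEvent): LoaderState {
  switch (event.type) {
    case "support-checked":
      if (state.kind !== "idle") return state;
      return event.webgpu
        ? { kind: "ready-to-load" }
        : { kind: "unsupported", fallback: "checklist" };
    case "user-clicked-load":
      // The only way into "loading" is a click after the support check.
      return state.kind === "ready-to-load"
        ? { kind: "loading", progress: 0 }
        : state;
    case "progress":
      // Clamp to [0, 1] so the progress bar never over- or underflows.
      return state.kind === "loading"
        ? { kind: "loading", progress: Math.min(1, Math.max(0, event.fraction)) }
        : state;
    case "loaded":
      return state.kind === "loading" ? { kind: "ready" } : state;
    case "failed":
      return state.kind === "loading"
        ? { kind: "error", message: event.message }
        : state;
    case "reset":
      return { kind: "idle" };
  }
}
```

Because no event other than `user-clicked-load` leads into `loading`, a page-load handler cannot trigger a download by accident, and `reset` is available from every state.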

Safe public language

Use terms such as "local mode", "browser cache", "device-dependent", "estimated", and "not sent to our server". Avoid claims such as "100% private", "guaranteed output quality", "offline forever", or "runs every model".
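
One way to enforce the avoid list is a simple check over UI copy before it ships. The sketch below is a hypothetical helper, not an existing tool. It only does case-insensitive substring matching on the four phrases listed above, so paraphrased claims would still need human review.

```typescript
// Phrases taken from the avoid list above.
const AVOIDED_PHRASES: readonly string[] = [
  "100% private",
  "guaranteed output quality",
  "offline forever",
  "runs every model",
];

// Hypothetical helper: returns every avoided phrase found in the copy.
function findRiskyClaims(copy: string): string[] {
  const lower = copy.toLowerCase();
  return AVOIDED_PHRASES.filter((phrase) => lower.includes(phrase));
}
```

For example, `findRiskyClaims("Our tool is 100% Private and runs every model.")` flags two phrases, while copy that sticks to "local mode" and "not sent to our server" returns an empty list.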

Bluesky implementation

The Local Prompt Tester starts as a manual WebLLM MVP with one small recommended model path, streaming output, local metrics and share-card export.

Model storage & deletion

Model files are cached inside your browser's site storage (Cache Storage, IndexedDB, or OPFS) for bluesky-labs.com. They are not saved to your normal Downloads folder. To delete them, clear site data for bluesky-labs.com in your browser settings. In Chromium-based browsers you can also use DevTools → Application → Storage → Clear site data.
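
An in-page reset button could clear cached model files without sending users to browser settings. The sketch below covers Cache Storage only, using the standard `keys()` and `delete()` methods of the browser's `caches` object. IndexedDB and OPFS would need separate handling. The `CacheStorageLike` type, the `clearModelCaches` name, and the `model-` cache-name prefix in the usage note are assumptions for illustration.

```typescript
// Structural subset of the browser's CacheStorage interface.
interface CacheStorageLike {
  keys(): Promise<string[]>;
  delete(name: string): Promise<boolean>;
}

// Hypothetical helper: deletes every cache whose name matches the
// predicate and returns the names that were actually removed.
async function clearModelCaches(
  store: CacheStorageLike,
  shouldDelete: (name: string) => boolean,
): Promise<string[]> {
  const deleted: string[] = [];
  for (const name of await store.keys()) {
    // delete() resolves to true only if the cache existed and was removed.
    if (shouldDelete(name) && (await store.delete(name))) {
      deleted.push(name);
    }
  }
  return deleted;
}
```

In a page this might be called as `clearModelCaches(caches, (name) => name.startsWith("model-"))`, where the prefix is whatever naming scheme the model loader actually uses. Check the real cache names in DevTools before relying on a filter like this.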

Editorial note

This guide is an implementation-oriented overview, not a benchmark guarantee. Browser-local AI behavior varies with the browser, GPU, memory, model, cache state, and network conditions. Keep public claims conservative and test on real devices before launch.