Firstly, why would you want your own AI server?
It's Exciting
People who call AI a dangerous new species we don't understand seem to forget that we grew up on a cartoon about a 10-year-old kid wandering around with an electric hamster whose entire vocabulary was just its own name. Some folks buy pit bulls because they look intimidating; don't mean to brag, but my new "pet" is supposedly destined to replace us all.
Financial Considerations
Asking ChatGPT-4o "How many free tokens do I get a day?" followed by "How many tokens did I use on this conversation?" already eats up about 7.5% of the 5-hour window allowance, just for two short answers. On my local setup, I can ask as many questions as I want without watching a meter tick down. That kind of freedom on ChatGPT costs $200/month.
Additionally, I convinced my wife that buying a new GPU was a "career investment," so now I can play Cyberpunk on Ultra settings. My career as a futuristic mercenary is indeed flourishing.
Privacy & Data Control
Running AI locally keeps all your prompts and data on your own hardware. That means less risk of leaks, breaches, or third-party snooping, which is critical for personal projects, businesses handling confidential data, or compliance with laws like GDPR. Even Sam Altman, CEO of OpenAI, admitted ChatGPT prompts can be used against you in court. Let's just say… virtual snitches don't get stitches.
Customization & Flexibility
A local AI can be shaped exactly to your needs: fine-tuned to specific tasks, datasets, or workflows. You can adjust parameters, integrate niche tools, or build features cloud AIs won't allow. A few tweaks and, voilà, my kids can't use AI in all the ways I would have at their age. This post might be partially written by AI, but my child's essays? Old-fashioned sweat, tears, and terrible grammar.
Offline Access
Local AI doesn't need the internet. Perfect for remote work, secure facilities, or unreliable networks. Personally, I rarely leave my house, but if you're a traveler, you could run it on a laptop at 30,000 feet.
Avoiding Vendor Lock-In
Cloud providers can change prices, restrict models, or limit usage. Your own setup means independence: no surprise policy changes, no disappearing features. Just an occasional answer in Chinese to broaden your horizons.
Democratization of AI
It's important that such a powerful technology isn't used and developed only by a chosen few. Power tends to corrupt, and absolute power corrupts absolutely; self-hosting is a tiny step towards the decentralization of AI.
Learning & Experimentation
This is the most convincing reason for me. Running a local AI isn't just useful; it's a hands-on way to explore the fundamentals of AI itself.

Fancy a franchise?
When I realized I could set up an AI server at home, I knew I had to do it; the use case could come later. Ollama's first version dropped back in January '24, so yeah, I might be a little late to the party. But I did my fair share of partying in my 20s, and I know that showing up late is actually the best move: you skip the awkward introductions, everybody's already having fun, and you just need to get on their level.
As for requirements, all you really need is a computer. I started playing with AI around the same time I was buying a new PC, but technically you don't even need a GPU to run some of the smaller models. That said, the better your GPU, the more fun you'll have.
And just to set expectations: I never planned for my home setup to replace GPT or Grok. Those are the Michelin-star restaurants at the top of the culinary food chain, and you'd be foolish not to visit them when you want the very best.
My AI server is an all-you-can-eat diner, allowing you to stuff your face when no one is looking. You're not going to cry when some of it falls to the floor. Screw it, throw something and see what happens.
Meet our crew:
Ollama is the kitchen staff. They don't need to be versed in customer service, and there's no need for them to look presentable. They just cook. The food comes out hot and ready, but you probably don't want to eat it straight out of the pot.
Open WebUI is the waiter. Sleek, polished, and attentive. He takes what the kitchen is churning out and makes it an experience. Special requests? No problem. Unsure what to order? He'll recommend the chef's special, maybe even something you didn't know you wanted.
Docker is the staffing agency. Do you care about the waiter's work visa, contract type, or how his paycheck gets handled? Of course not. The agency makes sure he shows up, ready to serve, and that he stays happy enough to keep doing the job.
My AI server might not replace the five-star cloud giants, but it's an endless buffet where I'm both the chef and the customer.
Now grab a plate; we're cooking.

🪟 Installation on Windows
Many tutorials I came across during my research suggest that Ollama (primarily designed for Linux) performs better when installed within a WSL2 VM. However, given the rapid pace of development in the local AI space (older Windows versions of Ollama were previously labeled as BETA), I suspected this guidance might be outdated. It's possible that even newer tutorials are just echoing old information.
To test this, I experimented with both setups: Ollama installed in a WSL2 VM and a native Windows installation. Interestingly, the performance difference was relatively minor, and it was actually the native Windows setup that slightly outperformed the WSL2 configuration (97 vs. 93 average tokens/s with Llama3.1:8b), suggesting that the conventional wisdom about WSL2 superiority may no longer hold.
If you want to test this on your own system, follow the Linux installation steps below inside a WSL2 VM.
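If you'd like to reproduce a rough benchmark on your own hardware, Ollama can print timing statistics after each response when started with the --verbose flag. The output line below is illustrative, not my actual result; your numbers will differ:
ollama run llama3.1:8b --verbose
# after each reply Ollama prints stats, including a line like:
# eval rate:            93.41 tokens/s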
Ollama
Go to ollama.com/download/windows and press Download to get the installer. Once it's installed you can open the Ollama app, choose one of the models from the dropdown menu, and write anything to kick off the model download. Right, Bob's your uncle, thank you for tuning in.
On a more serious note, we still want some of the bells and whistles offered by self-hosting a local AI, but I genuinely think it's impressive how simple Ollama has made free access to a basic local LLM, UI included, over the past few months. It even supports some multimodal input, like PDFs and images.
If the model you want isn't in the dropdown menu, go to ollama.com/search, click the model and version that interests you, copy the command (for example ollama run llama3.1:8b), and paste it into PowerShell.
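While you're in PowerShell, a few other Ollama subcommands are handy for housekeeping; a quick sketch of the ones I reach for most:
ollama pull llama3.1:8b    # download a model without starting a chat
ollama list                # show every model installed locally
ollama rm llama3.1:8b      # delete a model and reclaim disk space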
Docker
Go to docker.com, select the AMD64 version, then download and install it. ARM is mainly for phones and tablets.
Docker recommends using WSL as its backend, and you can install WSL with this command in PowerShell:
wsl --install
You'll be prompted to create an admin account for the newly created Ubuntu instance in WSL.
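Once the install finishes (a reboot may be required), you can confirm the distro is actually running under WSL 2; the output below is roughly what you should see:
wsl -l -v
#   NAME      STATE           VERSION
# * Ubuntu    Running         2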
Open WebUI
A good UI makes the experience comparable to cloud-based AI services and unlocks features like secure role-based access, seamless local and web-enhanced Retrieval-Augmented Generation (RAG), integrated web browsing, dynamic image generation, and the ability to collaborate across multiple models for richer, more personalized chats.
Download and run the container by pasting one of the PowerShell commands below.
To run Open WebUI with Nvidia GPU support:
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
If you don't have an Nvidia GPU:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
After installation, you can access Open WebUI at localhost:3000. You will be asked to create an account; the first account becomes the Admin.
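A quick note on those flags: -d runs the container in the background, -p 3000:8080 exposes the UI on port 3000, -v open-webui:/app/backend/data keeps your chats in a named volume so they survive updates, and --restart always brings the container back after reboots. If the page doesn't load, standard Docker commands will tell you what's going on:
docker ps --filter "name=open-webui"   # is the container up?
docker logs open-webui                 # read its startup logs
docker restart open-webui              # kick it after config changes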

🐧 Installation for Linux
Ollama
Paste this command in Bash:
curl -fsSL https://ollama.com/install.sh | sh
Go to ollama.com/search, select your model, and copy the command into Bash. For example:
ollama run llama3.1
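On Linux the install script registers Ollama as a systemd service (on systemd-based distros, at least), so you can check that the server is up and see what you've pulled so far:
systemctl status ollama   # should report active (running)
ollama list               # models downloaded so far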
Docker
Set up Docker's apt repository:
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
Then install the Docker packages:
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
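To confirm the installation worked, run the hello-world image, the same smoke test Docker's own docs use:
sudo docker run hello-world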
Open WebUI
To run Open WebUI with Nvidia GPU support:
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
If you don't have an Nvidia GPU:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
After installation, you can access Open WebUI at localhost:3000. You will be asked to create an account; the first account becomes the Admin.
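One last housekeeping tip: because your chats live in the open-webui volume, updating Open WebUI later just means swapping the container. A minimal sketch of the usual flow:
docker pull ghcr.io/open-webui/open-webui:main   # grab the newest image
docker stop open-webui
docker rm open-webui
# then re-run the same docker run command from above; the data volume is reattached automatically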