Step-by-Step Guide to Install DeepSeek V3.1 Locally


Hardware Requirements

Component | Minimum | Recommended for Full Model
CPU       | Intel i7 / AMD equivalent | High-end multi-core CPU
GPU       | NVIDIA GPU with 16 GB VRAM (e.g., RTX 3080) | Multi-GPU, high-memory setups (H100/H800 or several Ada/Hopper GPUs)
RAM       | 32 GB | 128 GB or more
Storage   | 50 GB free | Hundreds of GB of fast NVMe storage
OS        | Linux preferred (macOS/Windows via compatibility layers or VMs) | Same
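
To check whether your machine meets these requirements before starting, you can query the installed GPU and its VRAM, plus system RAM and free disk space (this quick check assumes the NVIDIA driver and nvidia-smi are installed):

# List each GPU with its total memory; a single card should have roughly 16 GB+ of VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv
# Check available system RAM and free disk space in the current directory
free -h
df -h .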

1. Clone DeepSeek V3 Repository and Prepare Environment (Linux example)

git clone https://github.com/deepseek-ai/DeepSeek-V3.git
cd DeepSeek-V3/inference
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

The repo's requirements.txt pins compatible versions of PyTorch, Triton, and Transformers[1][7].
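
Before moving on, it is worth a quick sanity check that the pinned PyTorch build can actually see your GPU (a minimal check, not part of the repo's instructions):

# Verify the installed PyTorch version and confirm CUDA is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"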


2. Download Model Weights

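The official checkpoints are published on Hugging Face. A minimal sketch using the Hugging Face CLI (the repository ID and local directory here are assumptions; pick the exact checkpoint you want and point the later commands at the same path):

pip install -U "huggingface_hub[cli]"
# Download the full checkpoint (several hundred GB) into a local directory
huggingface-cli download deepseek-ai/DeepSeek-V3.1 --local-dir /path/to/DeepSeek-V3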

3. (Optional) Convert Weights for Demo or Custom Setup

python convert.py --hf-ckpt-path /path/to/DeepSeek-V3 --save-path /path/to/DeepSeek-V3-Demo --n-experts 256 --model-parallel 16

This step is provided in the repo for multi-node or demo setups[1].


4. Run DeepSeek V3.1 Locally

Single-node interactive demo (example command):

torchrun --nnodes 1 --nproc-per-node 8 generate.py --ckpt-path /path/to/DeepSeek-V3-Demo --config configs/config_671B.json --interactive --temperature 0.7 --max-new-tokens 200

For multi-node setups, follow the cluster example on GitHub[1][7].
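
For reference, a hedged two-node sketch using torchrun's standard rendezvous flags (the node count, ranks, and master address are placeholders; the repo README documents the exact invocation for your cluster[1]):

# Run on the first node (rank 0); run the same command on the second node with --node-rank 1
torchrun --nnodes 2 --node-rank 0 --nproc-per-node 8 \
  --master-addr <head-node-ip> --master-port 29500 \
  generate.py --ckpt-path /path/to/DeepSeek-V3-Demo \
  --config configs/config_671B.json --interactive --temperature 0.7 --max-new-tokens 200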


5. Using Ollama on Linux/macOS/Windows (Simplified Method)

  • Install Ollama (a local AI runtime):
curl -fsSL https://ollama.com/install.sh | sh
  • Run DeepSeek V3.1 through Ollama:
ollama run deepseek-v3.1:671b
  • The Linux install script typically registers Ollama as a systemd service; if the server is not already running, start it manually:
ollama serve
  • Test the interaction directly in the terminal or via the local API (use the same tag you pulled):
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-v3.1:671b",
  "messages": [{"role": "user", "content": "Hello, DeepSeek!"}],
  "stream": false
}'

This method downloads the model automatically and requires less manual setup[2][3][4].
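
To confirm what Ollama has downloaded and loaded, its built-in listing commands are enough:

# Show models pulled to local storage
ollama list
# Show models currently loaded in memory and whether they are running on GPU or CPU
ollama ps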


6. WebUI Setup

  • Use Open WebUI for a simple web interface to DeepSeek:
    • Install Open WebUI by following their docs (a Docker-based sketch is shown after this section).
    • Run ollama serve to power the backend.
    • Access the WebUI in your browser at the configured local address.

Open WebUI's documentation includes a tutorial for running DeepSeek V3.1, replacing older model versions with updated quantized files[3].
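
A minimal Docker-based sketch, assuming Docker is installed and Ollama is serving on the same host; the image name, port mapping, and host-gateway flag follow Open WebUI's published quick-start, but verify against their current docs:

# Start Open WebUI; the container reaches the host's Ollama on port 11434
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# Then browse to http://localhost:3000 and select the DeepSeek model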


7. Testing and Benchmarking

  • Basic test: chat with the model via the terminal or WebUI to confirm it responds.
  • Benchmark:
    • Monitor system resources while generating (e.g., nvidia-smi for GPU, htop for CPU/RAM).
    • Measure tokens generated per second (TPS) and response latency; a sketch using the Ollama API follows this list.
    • For GPU stress tests, glmark2 exercises the GPU generally, but sustained generation runs are a more representative deep learning load.
    • For CPU/memory stress, use stress-ng as needed.
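
A hedged sketch for measuring TPS against the Ollama API, assuming the server is running locally and jq is installed; eval_count and eval_duration (in nanoseconds) come from Ollama's non-streaming response metadata, and the model tag should be whichever one you pulled:

# Send one non-streaming request and compute tokens generated per second
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-v3.1:671b",
  "prompt": "Explain mixture-of-experts models in two sentences.",
  "stream": false
}' | jq '{tokens: .eval_count, tps: (.eval_count / (.eval_duration / 1e9))}'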

Summary of Top 3 Benefits of Installing DeepSeek V3.1 Locally

  1. Full Data Privacy and Security: No external server involved, keeping sensitive inputs safe.
  2. Better Performance Control: Optimize model speed and response time with local hardware.
  3. Customizability and Offline Availability: Run on your own infrastructure without internet dependency, customize freely[2][3][4].

Sources

  • n8n Blog: How to Run a Local LLM: Complete Guide to Setup & Best Models (2025). Learn how to run LLMs locally, explore top tools like Ollama & GPT4All, and integrate them with n8n for private, cost-effective AI workflows.
  • LM Studio: Local AI on your computer. Run local AI models like gpt-oss, Llama, Gemma, Qwen, and DeepSeek privately on your computer.
  • Ollama: Get up and running with large language models.
