Step-by-Step Guide to Install DeepSeek V3.1 Locally


Hardware Requirements

Component | Minimum | Recommended for Full Model
CPU       | Intel i7 / AMD equivalent | High-end multi-core CPU
GPU       | NVIDIA GPU with 16 GB VRAM (e.g., RTX 3080) | Multi-GPU, high-memory setups (H100/H800 or several Ada/Hopper GPUs)
RAM       | 32 GB | 128 GB or more
Storage   | 50 GB free | Hundreds of GB of fast NVMe storage
OS        | Linux preferred (macOS/Windows via compatibility layers or VMs) | Same
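
To check whether your machine meets these requirements before starting, you can query the installed GPU and its VRAM, plus system RAM and free disk space (this quick check assumes the NVIDIA driver and nvidia-smi are installed):

# List each GPU with its total memory; a single card should have roughly 16 GB+ of VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv
# Check available system RAM and free disk space in the current directory
free -h
df -h .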

1. Clone DeepSeek V3 Repository and Prepare Environment (Linux example)

git clone https://github.com/deepseek-ai/DeepSeek-V3.git
cd DeepSeek-V3/inference
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

The repo's requirements.txt pins compatible versions of PyTorch, Triton, and Transformers[1][7].
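
Before moving on, it is worth a quick sanity check that the pinned PyTorch build can actually see your GPU (a minimal check, not part of the repo's instructions):

# Verify the installed PyTorch version and confirm CUDA is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"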


2. Download Model Weights

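The official checkpoints are published on Hugging Face. A minimal sketch using the Hugging Face CLI (the repository ID and local directory here are assumptions; pick the exact checkpoint you want and point the later commands at the same path):

pip install -U "huggingface_hub[cli]"
# Download the full checkpoint (several hundred GB) into a local directory
huggingface-cli download deepseek-ai/DeepSeek-V3.1 --local-dir /path/to/DeepSeek-V3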

3. (Optional) Convert Weights for Demo or Custom Setup

python convert.py --hf-ckpt-path /path/to/DeepSeek-V3 --save-path /path/to/DeepSeek-V3-Demo --n-experts 256 --model-parallel 16

This step is provided in the repo for multi-node or demo setups[1].


4. Run DeepSeek V3.1 Locally

Single-node interactive demo (example command):

torchrun --nnodes 1 --nproc-per-node 8 generate.py --ckpt-path /path/to/DeepSeek-V3-Demo --config configs/config_671B.json --interactive --temperature 0.7 --max-new-tokens 200

For multi-node setups, follow the cluster example on GitHub[1][7].
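
For reference, a hedged two-node sketch using torchrun's standard rendezvous flags (the node count, ranks, and master address are placeholders; the repo README documents the exact invocation for your cluster[1]):

# Run on the first node (rank 0); run the same command on the second node with --node-rank 1
torchrun --nnodes 2 --node-rank 0 --nproc-per-node 8 \
  --master-addr <head-node-ip> --master-port 29500 \
  generate.py --ckpt-path /path/to/DeepSeek-V3-Demo \
  --config configs/config_671B.json --interactive --temperature 0.7 --max-new-tokens 200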


5. Using Ollama on Linux/macOS/Windows (Simplified Method)

  • Install Ollama (a local AI runtime):
curl -fsSL https://ollama.com/install.sh | sh
  • Run DeepSeek V3.1 through Ollama:
ollama run deepseek-v3.1:671b
  • The Linux install script typically registers Ollama as a systemd service; if the server is not already running, start it manually:
ollama serve
  • Test the interaction directly in the terminal or via the local API (use the same tag you pulled):
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-v3.1:671b",
  "messages": [{"role": "user", "content": "Hello, DeepSeek!"}],
  "stream": false
}'

This method downloads the model automatically and requires less manual setup[2][3][4].
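
To confirm what Ollama has downloaded and loaded, its built-in listing commands are enough:

# Show models pulled to local storage
ollama list
# Show models currently loaded in memory and whether they are running on GPU or CPU
ollama ps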


6. WebUI Setup

  • Use Open WebUI for a simple web interface to DeepSeek:
    • Install Open WebUI by following their docs (a Docker-based sketch is shown after this section).
    • Run ollama serve to power the backend.
    • Access the WebUI in your browser at the configured local address.

Open WebUI's documentation includes a tutorial for running DeepSeek V3.1, replacing older model versions with updated quantized files[3].
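
A minimal Docker-based sketch, assuming Docker is installed and Ollama is serving on the same host; the image name, port mapping, and host-gateway flag follow Open WebUI's published quick-start, but verify against their current docs:

# Start Open WebUI; the container reaches the host's Ollama on port 11434
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# Then browse to http://localhost:3000 and select the DeepSeek model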


7. Testing and Benchmarking

  • Basic test: chat with the model via the terminal or WebUI to confirm it responds.
  • Benchmark:
    • Monitor system resources while generating (e.g., nvidia-smi for GPU, htop for CPU/RAM).
    • Measure tokens generated per second (TPS) and response latency; a sketch using the Ollama API follows this list.
    • For GPU stress tests, glmark2 exercises the GPU generally, but sustained generation runs are a more representative deep learning load.
    • For CPU/memory stress, use stress-ng as needed.
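
A hedged sketch for measuring TPS against the Ollama API, assuming the server is running locally and jq is installed; eval_count and eval_duration (in nanoseconds) come from Ollama's non-streaming response metadata, and the model tag should be whichever one you pulled:

# Send one non-streaming request and compute tokens generated per second
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-v3.1:671b",
  "prompt": "Explain mixture-of-experts models in two sentences.",
  "stream": false
}' | jq '{tokens: .eval_count, tps: (.eval_count / (.eval_duration / 1e9))}'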

Summary of Top 3 Benefits of Installing DeepSeek V3.1 Locally

  1. Full Data Privacy and Security: No external server involved, keeping sensitive inputs safe.
  2. Better Performance Control: Optimize model speed and response time with local hardware.
  3. Customizability and Offline Availability: Run on your own infrastructure without internet dependency, customize freely[2][3][4].

Sources

  • n8n Blog: How to Run a Local LLM: Complete Guide to Setup & Best Models (2025). Learn how to run LLMs locally, explore top tools like Ollama & GPT4All, and integrate them with n8n for private, cost-effective AI workflows.
  • LM Studio: Local AI on your computer. Run local AI models like gpt-oss, Llama, Gemma, Qwen, and DeepSeek privately on your computer.
  • Ollama: Get up and running with large language models.
