Deploying DeepSeek-R1 on Nebula Block Instance

This tutorial will guide you through the process of deploying DeepSeek-R1 using an instance on Nebula Block. By following these steps, you will have a fully operational DeepSeek-R1-Distill-Qwen-1.5B model running on a GPU-powered instance, accessible via a web interface.

1. Setting Up a Serverless Instance on Nebula Block
Nebula Block provides a serverless environment that allows you to deploy and manage instances without worrying about the underlying infrastructure. Follow these steps to set up your instance:
1.1 Sign Up & Create an Instance
Create an Account: Visit Nebula Block and sign up for a new account if you don’t already have one. New users who sign up with a referral code receive $10 in initial credits to explore the platform (contact us if you don’t have a code).
Upgrade Account: Deposit $10 to upgrade to Engineer Tier 3 and unlock GPU instances. For more details, see the Nebula Block Tier Overview.
Deploy an Instance: Navigate to the Instances section in the left panel.
Click on the “Deploy” button to create a new instance.
1.2 Configure the Instance
GPU Selection:
Choose a GPU instance type based on your model’s requirements. For the DeepSeek-R1-Distill-Qwen-1.5B model, we recommend an RTX 4090: a 1.5B-parameter model needs only about 3 GB of VRAM in FP16, so the 4090’s 24 GB leaves ample headroom for the KV cache.
Operating System:
Select Ubuntu Server 22.04 LTS R550 CUDA 12.4 with Docker as the operating system. This ensures compatibility with GPU-accelerated workloads.
SSH Public Key:
Add your SSH public key for secure access to the instance. If you don’t have one, generate a key pair with ssh-keygen (see the example after this list), then use the “+” button to save the public key.
Instance Name:
Assign a name to your instance for easy identification.
Deploy:
Once all fields are filled, click “Deploy”. Ensure your account has sufficient credits to proceed.
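If you need a fresh key pair, here is a minimal sketch (the key type, comment, and file path are just examples; any modern key type works):
ssh-keygen -t ed25519 -C "nebula-block" -f ~/.ssh/nebula_block
# Print the public key so you can paste it into the SSH Public Key field
cat ~/.ssh/nebula_block.pub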
2. Accessing the Instance
After deployment, you can view the instance details in the Instances section.
Use the provided Public IP and your SSH key to log into the instance via a terminal:
ssh -i /path/to/your/private_key.pem username@PUBLIC_IP
Here’s what each part means:
- -i /path/to/your/private_key.pem: Specifies the path to your SSH private key (replace with the actual path to your key).
- username: Replace with the appropriate username for your instance (e.g., ubuntu for Ubuntu instances or ec2-user for Amazon EC2 instances).
- PUBLIC_IP: Replace with the Public IP address of your instance.
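For example, assuming the default Ubuntu image and the key pair generated earlier (the IP below is a placeholder):
ssh -i ~/.ssh/nebula_block ubuntu@203.0.113.10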
3. Checking GPU Drivers and Docker Service
Once logged in, verify that the GPU drivers and Docker service are running correctly:
3.1 Check GPU Status
nvidia-smi
If the output displays GPU details, the drivers are correctly installed.
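Optionally, nvidia-smi can print just the fields of interest; this query flag is standard, and the output confirms the driver version and visible VRAM:
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv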
3.2 Check Docker Status
systemctl status docker
Ensure that the Docker service is active and running.
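As an extra sanity check before deploying, you can verify that containers can actually reach the GPU. A minimal sketch, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag is only an example; pick one matching your driver):
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
If this prints the same GPU table as on the host, Docker’s GPU passthrough is working.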
4. Deploying the DeepSeek-R1 Model with vLLM API
We’ll use vLLM, a high-throughput, memory-efficient inference and serving engine for LLMs, to deploy the DeepSeek-R1-Distill-Qwen-1.5B model. The service will automatically pull and load the model.
Run the following Docker command to start the service:
docker run -dit --name vllm --restart=always --gpus all -p 38000:8000 swanchain254/vllm-deepseek
Environment Variables: You can specify additional environment variables for the container:
- MODEL_NAME: The model to run (default is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
- HUGGING_FACE_HUB_TOKEN: Your Hugging Face token (required for some models).
Example:
docker run -dit --name vllm --restart=always --gpus all -p 38000:8000 \
  -e MODEL_NAME=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
  -e HUGGING_FACE_HUB_TOKEN=your_token_here \
  swanchain254/vllm-deepseek
Monitor Logs: To check if the service is running properly, view the logs:
docker logs -f vllm
You should see logs indicating that the model has been successfully loaded and the API is ready to accept requests.
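Assuming the image exposes vLLM’s standard OpenAI-compatible API (the image name suggests it does; the prompt and token limit below are just examples), you can smoke-test the endpoint from any machine:
curl http://PUBLIC_IP:38000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64
  }'
A JSON response containing a choices array confirms the model is serving requests.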
5. Setting Up OpenWebUI for Interactive Use
To interact with the model via a web interface, deploy OpenWebUI and connect it to the vLLM API.
5.1 Deploy OpenWebUI
Run the following Docker command to start OpenWebUI:
docker run -d -p 43000:8080 \
  -e OPENAI_API_BASE_URL=http://PUBLIC_IP:38000/v1 \
  -v /opt/webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
Replace PUBLIC_IP with the IP address of your Nebula Block instance.
Note: If the image is not found locally, it will be automatically pulled from the remote registry.
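The first start can take a minute while the image downloads. To confirm the container is up, tail its logs or probe the port (both commands assume the name and port used above):
docker logs -f open-webui
curl -I http://PUBLIC_IP:43000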
5.2 Access the Web Interface
Open your browser and navigate to:
http://PUBLIC_IP:43000
You can now interact with the DeepSeek-R1-Distill-Qwen-1.5B model through the OpenWebUI interface.
Conclusion
You have successfully deployed DeepSeek-R1-Distill-Qwen-1.5B on a Nebula Block instance. The model is now accessible via a web interface, making it easy to interact with and integrate into your applications. Happy coding!
Follow Us for the latest updates via our official channels:
- Website: nebulablock.com
- Twitter: @nebulablockdata
- Discord: Join the Community
- Blog: https://www.nebulablock.com/blog
- Medium: https://nebulablock.medium.com
- LinkedIn: https://www.linkedin.com/company/nebula-block
- YouTube: https://youtube.com/channel/UCkiFox7uP-vKn-ZSpFomz2A