Newsroom

Introducing L3-8B Stheno v3.2 on Nebula Block — Free Inference for All

Hayden Nguyen

30 Jul 2025 • 3 min read

Meet L3-8B Stheno v3.2 — Fast, Smart, and Free on Nebula Block

Built by the community and optimized for performance, L3-8B Stheno v3.2, developed by Sao10K, is the newest member of the instruction-tuned LLM family — now available for free via Nebula Block’s high-speed serverless inference. Whether you're building an AI assistant, tutoring app, or intelligent agent, Stheno is ready for real-time, multi-turn deployments with strong reasoning capability.

Model Overview:

Model	Architecture	Context Length	Quantized Size	Hosted On
Stheno v3.2	8B Dense	8K tokens	5 GB (GGUF)	Hugging Face

Built on LLaMA-3-8B base.
Tuned on open assistant-like data, multi-turn conversations, and reasoning tasks.
Available in various GGUF quantization levels for faster edge deployment.
Inference-ready on RTX GPUs using Nebula Block.

Why Stheno v3.2?

Stheno excels at:

Thoughtful, multi-step reasoning.
Instruction-following across common use cases.
Maintaining dialogue coherence over longer contexts (up to 8K tokens).
Being fast and memory-efficient — ideal for small batch or real-time workloads.

Compared to other 7B or 13B open models, Stheno stands out in handling multi-step instructions and chat coherence without sacrificing latency. It’s particularly effective in scenarios where you need fast but smart LLMs — like API agents, customer chat, or educational tutors.

Who should use Stheno v3.2?

Indie developers building fast, low-cost LLM tools.
Startups experimenting with reasoning agents or chatbots.
Researchers needing a smaller but high-context model for language or logic tasks.
Students and educators who want to explore LLMs with a gentle resource footprint.

How to Access & Use Stheno v3.2 on Nebula Block

Step-by-Step Guide

Signup then go to: Serverless Models
→ Select L3 8B Stheno v32.
Interact via Web UI

You can manage the Output in the right box

(Optional) Use in Your App:
Nebula provides simple curl, Python, or JavaScript snippets to embed the model into your tool or frontend.

Copy:

API Endpoint: https://inference.nebulablock.com/v1/chat/completions
API Key
Model name: Sao10K/L3-8B-Stheno-v3.2

Then paste it to your app, you can use Stheno v3.2 through your app with Nebula Block's API.

Why Run Stheno v3.2 on Nebula Block?

Nebula Block provides a wide range of pre-deployed inference endpoints—including DeepSeek V3 and R1 completely free (in limited time), enabling instant access to state-of-the-art large language models.

Scalability: Effortlessly scales resources based on demand, ensuring optimal performance.
Cost Efficiency: Pay-as-you-go model, allowing users to only pay for what they use.
User-Friendly Interface: Intuitive platform simplifies management and monitoring of AI models.
Comprehensive Documentation: Extensive guides help users navigate API integrations and best practices.
Robust Security: Strong security measures for data protection, including API and SSH key management.
Diverse Offerings: Provides AI models, GPU instances, and object storage in one ecosystem.
Active Support and Community: Responsive customer service and a supportive community enhance user experience.

Nebula Block is the first Canadian sovereign AI cloud, designed for performance, control, and compliance. Backed by infrastructure across Canada and globally, Nebula Block supports low-latency access for users worldwide.

TL;DR

For a quick summary of what Stheno v3.2 brings, here’s a breakdown:

Feature	Value
Architecture	LLaMA 3 - 8B
Context Length	8192 tokens
Strengths	Reasoning, multi-turn chat, fast inference
Quant Size	5GB (GGUF)
Hosted on	Nebula Block (free)
Developer	Sao10K

What’s Next?

Stay tuned as we roll out even more models, and keep experimenting with free-tier serverless inference.

Sign up and explore now.

🔍 Learn more: Visit our blog and documents for more insights or schedule a demo to optimize your search solutions.

📬 Get in touch: Join our Discord community for help or Contact Us.

🔗 Try Nebula Block now

Stay Connected

💻 Website: nebulablock.com
📖 Docs: docs.nebulablock.com
🐦 Twitter: @nebulablockdata
🐙 GitHub: Nebula-Block-Data
🎮 Discord: Join our Discord
✍️ Blog: Read our Blog
📚 Medium: Follow on Medium
🔗 LinkedIn: Connect on LinkedIn
▶️ YouTube: Subscribe on YouTube