Key Concepts in inferia
inferia is a platform for deploying and managing generative AI models. This guide covers the core concepts you need to use inferia effectively for building, deploying, and scaling AI-powered applications.
Core Resources
Account
Your inferia account serves as the top-level resource under which all other components are managed, including quotas and billing.
- Developer Accounts: Automatically assigned an account ID based on the sign-up email.
- Enterprise Accounts: Can customize a unique account ID for better identification and branding.
User
A user is an email address linked to an account, with full access to manage resources, including creating, modifying, and deleting deployments, models, and datasets.
Enterprise Feature: Role-Based Access Control (RBAC) allows administrators to assign specific permissions to users.
Model
A model consists of trained weights and metadata, but it must be deployed before it can be used for inference.
- Base Models: Pre-trained models designed for general AI tasks.
- LoRA Add-ons: Fine-tuned adapters that specialize a base model for specific use cases.
📌 Refer to the Models Overview for more details.
Deployment
A deployment is a set of model servers that host a base model and, optionally, one or more LoRA add-ons.
- Serverless Deployments: Automatically scale to match workload demands.
- Custom Deployments: Tailor performance, scalability, and cost for specific use cases.
Deployed Model
A deployed model is a base model or LoRA add-on that has been loaded into a deployment, making it ready for inference requests.
Dataset
A dataset is a structured, versioned collection of training examples used to fine-tune models.
Fine-Tuning Job
A fine-tuning job is the process of training a model using a dataset to create a LoRA add-on, enabling better performance for specific tasks.
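Training examples for fine-tuning are often stored as one JSON object per line (JSONL). The snippet below is a hypothetical sketch only: the exact record schema inferia expects is not specified here, and the chat-style `messages` layout is an assumption.

```python
import json

# Hypothetical chat-style training records; the real schema may differ --
# consult the fine-tuning documentation for the required fields.
examples = [
    {"messages": [
        {"role": "user", "content": "What is a LoRA add-on?"},
        {"role": "assistant", "content": "A fine-tuned adapter for a base model."},
    ]},
]

# Write one JSON object per line (JSONL).
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

Versioning the resulting file alongside your code makes fine-tuning runs reproducible.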
Resource Naming and Identification
Each resource in inferia has a unique identifier. The full resource name follows this structure:
```bash
accounts/{account_id}/models/{model_id}
```

Where:

- `{account_id}` is your account identifier.
- `{model_id}` is the unique model identifier.
Resource ID Rules
- Must be 1-63 characters long.
- Can contain lowercase letters (a-z), numbers (0-9), and hyphens (-).
- Cannot start or end with a hyphen (-).
Some APIs require the full resource name, while others accept only the resource ID when context is clear.
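The ID rules above translate directly into a regular expression. The helper below is a sketch derived only from the rules listed here, not an official inferia validator:

```python
import re

# 1-63 chars; lowercase letters, digits, hyphens; no leading/trailing hyphen.
# Derived from the rules listed above -- not an official inferia validator.
RESOURCE_ID_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

def is_valid_resource_id(resource_id: str) -> bool:
    return RESOURCE_ID_RE.fullmatch(resource_id) is not None

def full_resource_name(account_id: str, model_id: str) -> str:
    # Validate both parts before joining them into a full resource name.
    for part in (account_id, model_id):
        if not is_valid_resource_id(part):
            raise ValueError(f"invalid resource ID: {part!r}")
    return f"accounts/{account_id}/models/{model_id}"
```

For example, `full_resource_name("my-account", "my-model")` yields `accounts/my-account/models/my-model`, while an ID such as `-bad-` is rejected.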
Control Plane vs. Data Plane
The inferia API is divided into two primary components:
Control Plane
Manages the lifecycle of resources such as accounts, models, deployments, and datasets. Typical operations include:

- Creating deployments
- Uploading datasets
- Initiating fine-tuning jobs
Data Plane
Handles real-time inference requests, including:

- Processing inputs
- Generating predictions
- Returning outputs
This separation ensures efficient resource management and high-performance inference.
Interfaces to Access inferia
inferia provides multiple ways to interact with the platform, ensuring flexibility for different workflows:
Command-Line Interface (CLI)
The inferia CLI (`dashctl`) enables resource management directly from the terminal.

Example:

```bash
dashctl create deployment --model-id my-model --type serverless
```
Python SDK
A developer-friendly way to integrate inferia into applications. Compatible with the OpenAI API.
Installation:
```bash
pip install dashflow-sdk
```
Example Usage:
```python
from dashflow.client import DashFlow

client = DashFlow(api_key="<DASHFLOW_API_KEY>")

response = client.chat.completions.create(
    model="accounts/my-account/models/my-model",
    messages=[{
        "role": "user",
        "content": "Explain how DashFlow works.",
    }],
)

print(response.choices[0].message.content)
```
Web Interface
The DashFlow Web Console offers a user-friendly graphical interface for managing resources, monitoring usage, and experimenting with models in the Model Playground.
REST API
For advanced users, DashFlow provides a comprehensive REST API covering resource management, fine-tuning, and inference.
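Since the SDK is OpenAI API compatible, inference requests over REST can follow the familiar chat-completions shape. The sketch below only builds the request; the base URL and path are placeholders, not documented inferia endpoints, so check the API reference for the real values.

```python
import json

# Placeholder endpoint -- NOT a documented inferia URL; consult the API reference.
BASE_URL = "https://api.example.com/v1"

def build_chat_request(api_key: str, model: str, prompt: str):
    """Assemble an OpenAI-style chat completion request (sketch)."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

# Send with any HTTP client, e.g.:
#   import urllib.request
#   req = urllib.request.Request(url, data=body.encode(), headers=headers)
#   urllib.request.urlopen(req)
```

Note that the `model` field takes the full resource name, e.g. `accounts/my-account/models/my-model`.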
Additional Features
Scalability & Performance
inferia is built to scale, from small experiments to large-scale applications. Serverless deployments automatically adjust to demand, optimizing cost and performance.
Security & Compliance
inferia ensures robust security and compliance, including:

- 🔒 End-to-end encryption (data in transit and at rest)
- 🔑 Role-Based Access Control (RBAC) for enterprise users
- 📜 Compliance with industry standards (SOC 2, HIPAA, GDPR)
Monitoring & Analytics
Gain insights with detailed metrics, track model performance, and optimize costs.
Community & Support
🚀 Join the inferia Community to collaborate, share knowledge, and stay updated.

📞 Enterprise users get priority support and dedicated account managers.
Get Started with inferia Today!
inferia is designed for efficiency, scalability, and ease of use—whether you're developing AI prototypes or deploying enterprise-grade solutions.
📚 Explore the full documentation to unlock advanced tools and workflows!