Key Concepts in inferia

inferia is a platform for deploying and managing generative AI models. This guide covers the core concepts you need to build, deploy, and scale AI-powered applications on inferia.


Core Resources

Account

Your inferia account serves as the top-level resource under which all other components are managed, including quotas and billing.

  • Developer Accounts: Automatically assigned an account ID based on the sign-up email.

  • Enterprise Accounts: Can customize a unique account ID for better identification and branding.

User

A user is an email-based identity linked to an account, with full access to manage resources, including creating, modifying, and deleting deployments, models, and datasets.

  • Enterprise Feature: Role-Based Access Control (RBAC) allows administrators to assign specific permissions to users.

Model

A model consists of trained weights and metadata, but it must be deployed before it can be used for inference.

  • Base Models: Pre-trained models designed for general AI tasks.

  • LoRA Add-ons: Low-rank adapter weights, produced by fine-tuning, that specialize a base model for a particular use case.

📌 Refer to the Models Overview for more details.

Deployment

A deployment is a set of model servers that host a base model and, optionally, one or more LoRA add-ons.

  • Serverless Deployments: Automatically scale to match workload demands.

  • Custom Deployments: Give you control over performance, scalability, and cost for specific use cases.

Deployed Model

A deployed model is a base model or LoRA add-on that has been loaded into a deployment, making it ready for inference requests.

Dataset

A dataset is a structured, versioned collection of training examples used to fine-tune models.
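
The exact schema a dataset must follow is not specified here; purely as an illustrative sketch, assuming a chat-style format with one JSON training example per line (JSONL), building such a file might look like this:

```python
import json

# Hypothetical chat-style training examples; the exact schema inferia
# expects may differ from this layout.
examples = [
    {"messages": [
        {"role": "user", "content": "What is a LoRA add-on?"},
        {"role": "assistant", "content": "A low-rank adapter that specializes a base model."},
    ]},
]

# Write one JSON object per line (JSONL), a common format for
# structured, versioned collections of training examples.
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```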

Fine-Tuning Job

A fine-tuning job is the process of training a model using a dataset to create a LoRA add-on, enabling better performance for specific tasks.
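
Because the Python SDK is OpenAI-compatible (see Interfaces below), starting a fine-tuning job plausibly follows the OpenAI-style surface. This is a sketch under that assumption only; the method names, parameters, and identifiers below are illustrative, not confirmed inferia API:

```python
from dashflow.client import DashFlow

client = DashFlow(api_key="<DASHFLOW_API_KEY>")

# Assumption: an OpenAI-style fine-tuning interface. The dataset and
# base-model identifiers are placeholders for illustration.
job = client.fine_tuning.jobs.create(
    model="accounts/my-account/models/my-base-model",
    training_file="accounts/my-account/datasets/my-dataset",
)

# Poll the job until it completes, then deploy the resulting LoRA add-on.
print(job.id)
```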


Resource Naming and Identification

Each resource in inferia has a unique identifier. The full resource name follows this structure:

```
accounts/{account_id}/models/{model_id}
```

Where:

  • {account_id} is your account identifier.

  • {model_id} is the unique model identifier.

Resource ID Rules

  • Must be 1-63 characters long.

  • Can contain lowercase letters (a-z), numbers (0-9), and hyphens (-).

  • Cannot start or end with a hyphen (-).

Some APIs require the full resource name, while others accept only the resource ID when context is clear.
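
The ID rules above translate directly into a validation check. Here is a minimal sketch in Python (the helper functions are ours for illustration, not part of any inferia SDK):

```python
import re

# 1-63 chars; lowercase letters, digits, hyphens; no leading/trailing hyphen.
_RESOURCE_ID = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

def is_valid_resource_id(resource_id: str) -> bool:
    """Check a candidate ID against inferia's resource ID rules."""
    return bool(_RESOURCE_ID.match(resource_id))

def full_model_name(account_id: str, model_id: str) -> str:
    """Compose the full resource name for a model."""
    for part in (account_id, model_id):
        if not is_valid_resource_id(part):
            raise ValueError(f"invalid resource ID: {part!r}")
    return f"accounts/{account_id}/models/{model_id}"

print(full_model_name("my-account", "my-model"))
# -> accounts/my-account/models/my-model
```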


Control Plane vs. Data Plane

The inferia API is divided into two primary components:

Control Plane

The control plane manages the lifecycle of resources such as accounts, models, deployments, and datasets. Typical operations include:

  • Creating deployments

  • Uploading datasets

  • Initiating fine-tuning jobs

Data Plane

The data plane handles real-time inference requests, including:

  • Processing inputs

  • Generating predictions

  • Returning outputs

This separation ensures efficient resource management and high-performance inference.
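
To make the split concrete: a chat completion against a deployed model is a data-plane request, while creating the deployment itself is a control-plane operation. A minimal sketch, reusing the SDK and CLI calls shown under Interfaces below:

```python
from dashflow.client import DashFlow

client = DashFlow(api_key="<DASHFLOW_API_KEY>")

# Data plane: a real-time inference request -- input is processed,
# a prediction is generated, and the output is returned.
response = client.chat.completions.create(
    model="accounts/my-account/models/my-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Control plane: lifecycle operations such as creating the deployment
# that serves the model above, e.g. via the CLI:
#   dashctl create deployment --model-id my-model --type serverless
```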


Interfaces to Access inferia

inferia provides multiple ways to interact with the platform, ensuring flexibility for different workflows:

Command-Line Interface (CLI)

The inferia CLI (dashctl) lets you manage resources directly from the terminal.

Example:

```bash
dashctl create deployment --model-id my-model --type serverless
```

Python SDK

The Python SDK offers a developer-friendly way to integrate inferia into applications and is compatible with the OpenAI API.

Installation:

```bash
pip install dashflow-sdk
```

Example Usage:

```python
from dashflow.client import DashFlow

client = DashFlow(api_key="<DASHFLOW_API_KEY>")

response = client.chat.completions.create(
    model="accounts/my-account/models/my-model",
    messages=[{
        "role": "user",
        "content": "Explain how inferia works.",
    }],
)

print(response.choices[0].message.content)
```

Web Interface

The inferia Web Console offers a user-friendly graphical interface for managing resources, monitoring usage, and experimenting with models in the Model Playground.

REST API

For advanced users, inferia provides a comprehensive REST API covering resource management, fine-tuning, and inference.
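
As a sketch only: assuming the REST inference endpoints mirror the OpenAI-compatible surface the Python SDK exposes, a raw chat completion request could look like the following (the base URL is a placeholder, not a documented value):

```python
import requests

# Placeholder base URL; substitute the endpoint from the API reference.
BASE_URL = "https://api.example.com/inference/v1"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": "Bearer <DASHFLOW_API_KEY>"},
    json={
        "model": "accounts/my-account/models/my-model",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```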


Additional Features

Scalability & Performance

inferia is built to scale, from small experiments to large-scale applications. Serverless deployments automatically adjust to demand, optimizing cost and performance.

Security & Compliance

inferia ensures robust security and compliance, including:

  • 🔒 End-to-end encryption (data in transit and at rest)

  • 🔑 Role-Based Access Control (RBAC) for enterprise users

  • 📜 Compliance with industry standards (SOC 2, HIPAA, GDPR)

Monitoring & Analytics

Gain insights with detailed metrics, track model performance, and optimize costs.

Community & Support

🚀 Join the inferia Community to collaborate, share knowledge, and stay updated.

📞 Enterprise users get priority support and dedicated account managers.


Get Started with inferia Today!

inferia is designed for efficiency, scalability, and ease of use—whether you're developing AI prototypes or deploying enterprise-grade solutions.

📚 Explore the full documentation to unlock advanced tools and workflows!
