LLamaStack 0.1.6
Prefix Reserveddotnet add package LLamaStack --version 0.1.6
NuGet\Install-Package LLamaStack -Version 0.1.6
<PackageReference Include="LLamaStack" Version="0.1.6" />
paket add LLamaStack --version 0.1.6
#r "nuget: LLamaStack, 0.1.6"
// Install LLamaStack as a Cake Addin #addin nuget:?package=LLamaStack&version=0.1.6 // Install LLamaStack as a Cake Tool #tool nuget:?package=LLamaStack&version=0.1.6
LLamaStack.Core - High-Level Services for .NET Applications
LLamaStack.Core is a library that provides higher-level services for use in .NET applications. It offers extensive support for features such as dependency injection, .NET configuration implementations, ASP.NET Core integration, and IHostedService support.
Getting Started
.NET Core Registration
You can easily integrate LLamaStack.Core into your application services layer. This registration process sets up the necessary services and loads the appsettings.json
configuration.
Example: Registering LLamaStack
builder.Services.AddLLamaStack();
By default, LLamaStack uses GUIDs for session identification. However, you have the flexibility to change this behavior by specifying a different identifier type during registration.
Example: Registering LLamaStack with an Integer Identifier
builder.Services.AddLLamaStack<int>();
Example: Registering LLamaStack with a String Identifier
builder.Services.AddLLamaStack<string>();
Alternatively, if you're injecting your own derived services, you can register each LLamaStack component separately, providing full customization options.
var llamaStackConfiguration = ConfigManager.LoadConfiguration();
builder.Services.AddSingleton(llamaStackConfiguration);
builder.Services.AddHostedService<ModelLoaderService>();
builder.Services.AddSingleton<IModelService, ModelService>();
builder.Services.AddSingleton<IModelSessionService<T>, ModelSessionService<T>>();
builder.Services.AddSingleton<IModelSessionStateService<T>, ModelSessionStateService<T>>();
Features
ModelService
The ModelService
is a service that handles the loading, unloading, and caching of models. It offers support for both single and multiple models in memory and includes model preloading capabilities.
//Load a model
var modelConfig = new ModelConfig { Name = "MyModel", ModelPath = "Path to model" };
var model = await _modelService.LoadModel(modelConfig)
//Unload a model
await _modelService.UnloadModel("MyModel");
//Load all models in the appsettings.json
await _modelService.LoadModels();
//Unload all models
await _modelService.UnloadModels();
//Get a model loaded model
var model = await _modelService.GetModel("ModelName");
//Create a context
var context = await _modelService.CreateContext("ModelName", "MyContext");
//Remove a context
await _modelService.RemoveContext("ModelName", "MyContext");
//Load a model and create a context
var (model, context) = await _modelService.GetOrCreateModelAndContext("ModelName", "sessionId");
ModelSessionService
The ModelSessionService
is responsible for creating and managing model context sessions. It supports both streaming inference and completed inference results, making it versatile for various use cases.
// Session identifier
var sessionId = "sessionId";
// SessionConfig, model, initial prompt, anti-prompts etc.
var sessionConfig = new SessionConfig
{
Model = "MyModel",
Prompt = "Initial prompt",
InferenceType = InferenceType.Instruct
};
// InferenceConfig
var inferenceConfig = new InferenceConfig
{
Temperature = 0.6f,
Mirostat = MirostatType.Mirostat2
};
// CreateAsync, create a new session
var session = await _modelSessionService.CreateAsync(sessionId, sessionConfig, inferenceConfig);
// CloseAsync, End the session
await _modelSessionService.CloseAsync(sessionId);
//Inference Examples:
var questionText = "What is .NET Core?";
// InferAsync, returns IAsyncEnumerable<InferTokenModel> for streaming output of tokens
IAsyncEnumerable<InferTokenModel> inferTokens = _modelSessionService.InferAsync(sessionId, questionText);
await foreach (var token in inferTokens)
{
Console.Write(token.Id);
Console.Write(token.Logit);
Console.Write(token.Probability);
Console.Write(token.Content);
Console.Write(token.IsChild);
}
//Note: If a character has more than one token the first will contain the entire content
//the others are marked as IsChild and will only contain the Id, Logit and Probability
// InferTextAsync, returns IAsyncEnumerable<string> for streaming output of tokens
IAsyncEnumerable<string> inferTokens = _modelSessionService.InferTextAsync(sessionId, questionText);
await foreach (var token in inferTokens)
{
Console.Write(token);
}
// InferTextCompleteAsync, returns the full inference response as a string
string inferResponse = await _modelSessionService.InferTextCompleteAsync(sessionId, questionText);
// InferTextCompleteQueuedAsync, queues the request for inference, the task will wait until all other queue items
// have processed and returns the full inference response as a string, be sure to set appropriate timeouts
// as inference can take time and the quue coule be long
string inferResponse = await _modelSessionService.InferTextCompleteQueuedAsync(sessionId, questionText);
// CancelAsync, cancel the currently running inference
await _modelSessionService.CancelAsync(sessionId);
ModelSessionStateService
The ModelSessionStateService
plays a crucial role in loading and saving a model's session state. This includes context state, inference state, and session/inference parameters. It provides robust state management for your LLama model sessions.
ModelLoaderService
The ModelLoaderService
is a hosted service designed to efficiently manage the preloading of models for applications. When configured, this service will preload models during application startup, ensuring optimal performance.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
-
net7.0
- LLamaSharp (>= 0.8.0)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 7.0.0)
- Microsoft.Extensions.Hosting.Abstractions (>= 7.0.0)
- Microsoft.Extensions.Logging.Abstractions (>= 7.0.1)
- System.Linq.Async (>= 6.0.1)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.