Language model kernels

This package implements the interface to the language models. It consists of three layers:

  • the basic layer handles the language models themselves, abstracting from the details of individual vendors. It is implemented through the Langchain framework in the module .langchain.models. Its primary function is to mediate between the creation of Langchain objects, which encapsulate the exchange with the language model API, and a standard specification that may be read from a config.toml file, automating user-led selection of models.

  • an intermediate layer formulates and uses specifications of language kernels. Kernels are specific uses of the language models; for example, to implement a chat, one specifies a system and a human prompt to obtain a chat kernel. These specifications are handled by the prompts module, whose aim is to provide a library through which simple configurations of language models may be created, supporting variations without the need to specify them at the level of the Langchain framework.

  • an implementation layer creates the kernels themselves. This is implemented, again through the Langchain framework, in the .langchain.runnables module, which uses BaseChatModel as a key element to produce objects callable via .invoke. BaseChatModel supports tool calling and structured output, thus allowing further customization. The Langchain kernels are called 'runnables' in the implementation.

The boundary between the layers is not rigid. For example, it is possible to interact with the language model directly using basic models. Importantly, Langchain provides its own approach to building complex kernels, through "LCEL objects" built on top of Runnable. Therefore, kernels may also be implemented on top of Langchain's runnables.
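As a rough illustration of the LCEL idea, composition with the `|` operator can be mimicked in a few lines of plain Python. This is a toy stand-in, not the Langchain Runnable API; it only shows why a composition of runnables is itself a runnable:

```python
class ToyRunnable:
    """Toy stand-in for Langchain's Runnable, illustrating LCEL-style piping."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # chain: feed this runnable's output into the next one
        return ToyRunnable(lambda x: other.invoke(self.invoke(x)))


to_upper = ToyRunnable(str.upper)
exclaim = ToyRunnable(lambda s: s + "!")
chain = to_upper | exclaim  # the composition is itself a runnable
print(chain.invoke("hello"))  # HELLO!
```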

The prompts module contains a number of prespecified prompts, enabling the creation of a Langchain kernel (a runnable) on the fly.

```python
from lmm.models.langchain.runnables import create_runnable

summarizer = create_runnable("summarizer")
```

Here, summarizer includes a language model as specified in config.toml and a prompt to summarize text:

```python
try:
    summary = summarizer.invoke(
        {'text': "This is some long text, reduced to a few lines here for the sake of the example"}
    )
except Exception:
    print("Could not retrieve response from model")
```

Basic models (Langchain)

This module implements the creation of Langchain language model objects that wrap the message-exchange calls of the primitive model API, allowing a unified programming interface to diverse models that abstracts from vendor details. The model objects are stored in two global repositories, langchain_models and langchain_embeddings.

The model objects are the lowest among the abstractions offered by LangChain/LangGraph, but they also implement the runnable interface, so that they can be called directly through .invoke/.ainvoke.
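The repositories can be pictured as memoizing mappings keyed by a hashable settings object: the first lookup builds the model, later lookups with equal settings return the cached object. The sketch below is illustrative, not the actual implementation:

```python
class ModelRepository(dict):
    """Toy memoizing repository: first lookup builds the model,
    later lookups with equal settings return the cached object."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, settings):
        model = self.factory(settings)
        self[settings] = model
        return model


# hypothetical factory standing in for the vendor-specific constructors
repo = ModelRepository(lambda spec: object())
first = repo[("OpenAI/gpt-4o", 0.7)]
second = repo[("OpenAI/gpt-4o", 0.7)]
assert first is second  # memoized: same object on repeat lookup
```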

The creation of these objects connects the LangChain API to configuration settings that may be given programmatically or loaded automatically from config.toml. The settings are given as a LanguageModelSettings object, a dictionary with the appropriate fields, or a spec passed as arguments to the create_model_from_spec function. A LanguageModelSettings instance is also a member of the Settings object read from the config.toml file, providing a default.

Note: the abstract class in the LangChain API that defines the model object is BaseChatModel.

Examples:

```python
from lmm.models.langchain.models import (
    create_model_from_spec,
    create_model_from_settings,
    langchain_models,
)
from lmm.config.config import LanguageModelSettings
from langchain_core.language_models.chat_models import BaseChatModel

# Method 1: Using a LanguageModelSettings object
settings = LanguageModelSettings(
    model="OpenAI/gpt-4o",
    temperature=0.7,
    max_tokens=1000,
    max_retries=3
)
model: BaseChatModel = langchain_models[settings]
# (this method accesses the repository directly)

# Method 2: Using the create_model_from_settings function
model = create_model_from_settings(settings)
# (this method memoizes the model in the repository)

# Method 3: Using the create_model_from_spec function
model = create_model_from_spec(model="OpenAI/gpt-4o")
# (this method memoizes the model in the repository)

# Method 4: Using dictionary unpacking
spec = {'model': "OpenAI/gpt-4o", 'temperature': 0.7}
model = create_model_from_spec(**spec)
# (this method memoizes the model in the repository)
```
Behaviour

Raises exceptions from Langchain and from this module.

Note

Support for new model sources should be added here by extending the match ... case statement in _create_model_instance.

All methods support configurable parameters like temperature, max_tokens, max_retries, and timeout through LanguageModelSettings.

create_embedding_model_from_settings(settings)

Create langchain embedding model from an EmbeddingSettings object. Raises a ValueError if the source argument is not supported.

Parameters:

- settings (EmbeddingSettings, required): an EmbeddingSettings object containing model configuration.

Returns:

- Embeddings: a Langchain embeddings model object.

Raises:

- ValueError, TypeError, ValidationError

Example:

```python
settings = EmbeddingSettings(
    dense_model="OpenAI/text-embedding-3-small"
)
model = create_embedding_model_from_settings(settings)
```
Source code in lmm/models/langchain/models.py
def create_embedding_model_from_settings(
    settings: EmbeddingSettings,
) -> Embeddings:
    """
    Create langchain embedding model from an EmbeddingSettings
    object. Raises a ValueError if the source argument
    is not supported.

    Args:
        settings: an EmbeddingSettings object containing model
            configuration.

    Returns:
        a Langchain embeddings model object.

    Raises ValueError, TypeError, ValidationError

    Example:
        ```python
        settings = EmbeddingSettings(
            dense_model="OpenAI/text-embedding-3-small"
        )
        model = create_embedding_model_from_settings(settings)
        ```
    """
    return langchain_embeddings[settings]

create_embedding_model_from_spec(dense_model, *, sparse_model='Qdrant/bm25')

Create a Langchain embedding model from a dense (and optional sparse) model specification. Raises a ValueError if the source is not supported.

Parameters:

- dense_model (str, required): the model specification in the form source/model, such as 'OpenAI/text-embedding-3-small'
- sparse_model (SparseModel, default 'Qdrant/bm25'): a sparse model specification.

Returns:

- Embeddings: a Langchain embeddings model object.

Raises:

- ValueError, TypeError, ValidationError

Example:

```python
spec = {'dense_model': "OpenAI/text-embedding-3-small"}
model = create_embedding_model_from_spec(**spec)
```
Source code in lmm/models/langchain/models.py
def create_embedding_model_from_spec(
    dense_model: str, *, sparse_model: SparseModel = "Qdrant/bm25"
) -> Embeddings:
    """
    Create langchain embedding model from source_name and
    model_name. Raises a ValueError if the source_name argument
    is not supported.

    Args:
        dense_model: the model specification in the form source/model,
            such as 'OpenAI/text-embedding-3-small'

    Returns:
        a Langchain embeddings model object.

    Raises ValueError, TypeError, ValidationError

    Example:
        ```python
        spec = {'dense_model': "OpenAI/text-embedding-3-small"}
        model = create_embedding_model_from_spec(**spec)
        ```
    """
    spec = EmbeddingSettings(
        dense_model=dense_model,
        sparse_model=sparse_model,
    )
    return langchain_embeddings[spec]

create_model_from_settings(settings)

Create langchain model from a LanguageModelSettings object. Raises a ValueError if the source argument is not supported.

Parameters:

- settings (LanguageModelSettings, required): a LanguageModelSettings object containing model configuration.

Returns:

- BaseChatModel: a Langchain model object.

Raises:

- ValueError, TypeError, ValidationError

Example:

```python
# Create settings explicitly.
config = LanguageModelSettings(
    model="OpenAI/gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
    max_retries=3,
)
model = create_model_from_settings(config)
response = model.invoke("Why is the sky blue?")

# Load settings from config.toml.
settings = Settings()
model = create_model_from_settings(
    settings.minor,
)
response = model.invoke("Why is the grass green?")
```
Source code in lmm/models/langchain/models.py
def create_model_from_settings(
    settings: LanguageModelSettings,
) -> BaseChatModel:
    """
    Create langchain model from a LanguageModelSettings object.
    Raises a ValueError if the source argument is not supported.

    Args:
        settings: a LanguageModelSettings object containing model
            configuration.

    Returns:
        a Langchain model object.

    Raises ValueError, TypeError, ValidationError

    Example:
        ```python
        # Create settings explicitly.
        config = LanguageModelSettings(
            model="OpenAI/gpt-4o-mini",
            temperature=0.7,
            max_tokens=1000,
            max_retries=3,
        )
        model = create_model_from_settings(config)
        response = model.invoke("Why is the sky blue?")

        # Load settings from config.toml.
        settings = Settings()
        system_prompt = "You are a helpful assistant."
        model = create_model_from_settings(
            settings.minor,
        )
        response = model.invoke("Why is the grass green?")
        ```
    """
    return langchain_models[settings]

create_model_from_spec(model, *, temperature=0.1, max_tokens=None, max_retries=2, timeout=None, provider_params=None)

Create langchain model from specifications.

Parameters:

- model (str, required): the model in the form source/model, such as 'OpenAI/gpt-4o'
- temperature (float, default 0.1): sampling temperature.
- max_tokens (int | None, default None): maximum number of tokens to generate.
- max_retries (int, default 2): number of retries on failure.
- timeout (float | None, default None): request timeout.
- provider_params (dict | None, default None): additional provider-specific parameters.

Returns:

- BaseChatModel: a Langchain model object.

Raises:

- ValueError, TypeError, ValidationError

Example:

```python
spec = {'model': "OpenAI/gpt-4o-mini"}
model = create_model_from_spec(**spec)
```
Source code in lmm/models/langchain/models.py
def create_model_from_spec(
    model: str,
    *,
    temperature: float = 0.1,
    max_tokens: int | None = None,
    max_retries: int = 2,
    timeout: float | None = None,
    provider_params: dict[str, MetadataPrimitiveWithList] | None = None,
) -> BaseChatModel:
    """
    Create langchain model from specifications.

    Args:
        model: the model in the form source/model, such as
            'OpenAI/gpt-4o'

    Returns:
        a Langchain model object.

    Raises ValueError, TypeError, ValidationError

    Example:
        ```python
        spec = {'model': "OpenAI/gpt-4o-mini"}
        model = create_model_from_spec(**spec)
        ```
    """
    if provider_params is None:
        provider_params = {}

    spec = LanguageModelSettings(
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
        max_retries=max_retries,
        timeout=timeout,
        provider_params=provider_params,
    )
    return langchain_models[spec]

Prompts

This module centralizes storage and definition of prompts to configure language models to perform a specific function.

These prompts are collected in a PromptDefinition object, which identifies the prompt set (system and user) uniquely, and specifies the language model tier that should be used with the prompt. The module supports predefined prompts as well as prompts that may be dynamically added to the centralized repository.

The predefined prompt sets are

- "summarizer"
- "question_generator"
- "query"
- "query_with_context"
- "context_validator"
- "allowed_content_validator"

These prompts may be retrieved from the module-level dictionary prompt_library, as shown in the example below.

Example:

```python
from lmm.models.prompts import (
    prompt_library, 
    PromptDefinition,
)
prompt_definition: PromptDefinition = prompt_library["summarizer"]
```

New prompt text templates may be added dynamically to the dictionary. To do this, one can use the create_prompt function.

Example:

```python
from lmm.models.prompts import (
    prompt_library,
    create_prompt,
)
create_prompt("Provide the questions the following text answers:\n"
    + "\nTEXT:\n{text}", name = "question_creation")
prompt_template: str = prompt_library["question_creation"]
```

There is not much added value in storing the prompt template text in the library per se. The motivation is that the prompt becomes a tool available to other functions in the library, such as the create_runnable function (see lmm.models.langchain.runnables). A runnable with the prompt can then be obtained by requesting it by the name of the prompt object:

```python
from lmm.models.langchain.runnables import create_runnable
lmm = create_runnable("question_creation") # langchain runnable
response = lmm.invoke({'text': "Apples are healthy food"})
```

PromptDefinition

Bases: BaseModel

Groups all properties that uniquely define a kernel tool

Source code in lmm/models/prompts.py
class PromptDefinition(BaseModel):
    """Groups all properties that uniquely define a kernel tool"""

    name: str
    prompt: str
    system_prompt: str | None = None
    model_tier: ModelTier = 'minor'

    model_config = ConfigDict(frozen=False, extra='forbid')

create_prompt(prompt, name, *, system_prompt=None, replace=False)

Adds a custom prompt template to the prompt dictionary.

Parameters:

- prompt (str, required): the prompt text.
- name (str, required): the name of the tool. This will also define a kernel with the same name (i.e. a Langchain runnable).
- system_prompt (str | None, default None): an optional system prompt text.
- replace (bool, default False): if True, replace an existing prompt registered under the same name; otherwise a ValueError is raised for duplicate names.
Source code in lmm/models/prompts.py
def create_prompt(
    prompt: str,
    name: str,
    *,
    system_prompt: str | None = None,
    replace: bool = False,
) -> None:
    """
    Adds a custom prompt template to the prompt dictionary.

    Args:
        prompt: the prompt text.
        name: the name of the tool. This will also define a kernel
            with the same name (i.e. a Langchain runnable)
        system_prompt: an optional system prompt text.
    """

    if name in prompt_library.keys():
        if replace:
            prompt_library.pop(name)  # type: ignore
        else:
            raise ValueError(
                f"'{name}' is already a registered prompt. "
                "Use another name to register a custom prompt."
            )

    # We abuse the lack of run-time checks for Literals here. We do
    # this because we want the availability of the preformed prompts
    # given by Literal but also the flexibility to add new prompts.
    definition = PromptDefinition(
        name=name,
        prompt=prompt,
        system_prompt=system_prompt,
    )
    prompt_library[name] = definition  # type: ignore

Simple packaged models (Langchain runnables)

Creates Langchain 'runnable' objects, or 'chain' members. These objects may be combined to form Langchain chains, or used on their own via the .invoke/.ainvoke member functions. The created objects are made available through the global variables runnable_library and embeddings_library, which memoize them.

Each object plugs two global resources into the Langchain interface:

  • a language model object, itself wrapped by LangChain, selected via the models module from the specification in a config.toml file.
  • a set of prompts that specialize the function of the language model, selected from the prompt library provided by the prompts module.

The prompts module contains a set of predefined prompts, allowing one to create the specialized runnable/'chain' from the prompt name.

The runnables are callable objects via the invoke member function. The LangChain syntax is used with invoke, for example by passing a dictionary that contains the parameters for the prompt template.
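The dictionary's keys match the named slots in the prompt template. In plain Python terms, the filling step behaves like str.format; this is a simplification of what Langchain's prompt templates do, shown only to clarify the invoke signature:

```python
# simplified view of template filling; the real work is done by
# Langchain's ChatPromptTemplate
template = "Provide the questions the following text answers:\nTEXT:\n{text}"
filled = template.format(**{'text': "Apples are healthy food"})
print(filled)
```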

Example of runnable creations:

```python
from lmm.models.langchain.runnables import create_runnable

try:
    query_model = create_runnable("query")  # uses config.toml
except Exception:
    ...

# a runnable that specifies the model directly
try:
    questions_model = create_runnable("question_generator",
                            {'model': "OpenAI/gpt-4o"})
except Exception:
    ...

# use Langchain syntax to call the runnable after creating it
try:
    response = questions_model.invoke({
        'text': "Logistic regression is typically used when the "
            + "outcome variable is binary."
    })
except Exception:
    print("Could not obtain response from model")
```

Example of a dynamically created chat runnable:

```python
from lmm.models.prompts import (
    prompt_library,
    create_prompt,
)

# this creates a prompt and registers it in the prompts library;
# replace=True is needed because "question_generator" is already
# a predefined prompt
prompt_template = '''Provide the questions to which the text answers.
    TEXT:
    {text}
'''
create_prompt(prompt_template, name="question_generator", replace=True)

# create a runnable from the major model in config.toml with
# this prompt
from lmm.config.config import Settings
from lmm.models.langchain.runnables import create_runnable

settings = Settings()
try:
    model = create_runnable(
        "question_generator",
        settings.major,
        "You are a helpful teacher")
except Exception:
    ...

# if no settings object given, defaults to settings.minor
try:
    model_minor = create_runnable("question_generator")
except Exception:
    ...
```
Expected behaviour

This module raises exceptions from Langchain and from itself.

Note

A Langchain language model may be used directly after obtaining it from create_model_from_spec in the models module.

RunnableDefinition

Bases: BaseModel

Groups together all properties that define the runnable

Source code in lmm/models/langchain/runnables.py
159
160
161
162
163
164
165
166
167
168
class RunnableDefinition(BaseModel):
    """Groups together all properties that define the runnable"""

    runnable_name: PromptNames | str
    settings: LanguageModelSettings
    system_prompt_override: str | None = None
    params: RunnableParameterType = frozenset()

    # required for hashability
    model_config = ConfigDict(frozen=True, extra='forbid')

create_embeddings(settings=None)

Creates a Langchain embeddings runnable from a configuration object.

Parameters:

- settings (dict[str, str] | EmbeddingSettings | Settings | None, default None): an EmbeddingSettings object with the following fields:
    - dense_model: a specification in the form provider/model, for example 'OpenAI/text-embedding-3-small'
    - sparse_model: a sparse model specification.

  Alternatively, a dictionary with the same fields given as text. If None (default), the settings will be read from the configuration file. If no configuration file exists, a settings object will be created with default parameters.

Returns:

- Embeddings: a Langchain object that embeds text by calling embed_documents or embed_query.

Raises:

- ValidationError, TypeError: for invalid spec
- ImportError: for missing libraries
- ConnectionError: if not online

Example:

```python
from lmm.models.langchain.runnables import (
    create_embeddings,
)

try:
    encoder: Embeddings = create_embeddings()
    vector = encoder.embed_query("Why is the sky blue?")
    documents = ["The sky is blue due to its oxygen content"]
    vectors = encoder.embed_documents(documents)
except Exception:
    ...
```
Source code in lmm/models/langchain/runnables.py
def create_embeddings(
    settings: (
        dict[str, str] | EmbeddingSettings | Settings | None
    ) = None,
) -> Embeddings:
    """
    Creates a Langchain embeddings runnable from a configuration
    object.

    Args:
        settings: an EmbeddingSettings object with the following
            fields:

            - dense_model: a specification in the form provider/
            model, for example 'OpenAI/text-embedding-3-small'
            - sparse_model: a sparse model specification.

            Alternatively, a dictionary with the same fields and
            text. If None (default), the settings will be read
            from the configuration file. If no configuration file
            exists, a settings object will be created with default
            parameters.

    Returns:
        a Langchain object that embeds text by calling embed_documents
            or embed_query.

    Raises:
        ValidationError, TypeError: for invalid spec
        ImportError: for missing libraries
        requests.exceptions.ConnectionError: if not online

    Example:
    ```python
    from lmm.models.langchain.runnables import (
        create_embeddings,
    )

    try:
        encoder: Embeddings = create_embeddings()
        vector = encoder.embed_query("Why is the sky blue?")
        documents = ["The sky is blue due to its oxygen content"]
        vectors = encoder.embed_documents(documents)
    except Exception ...
    ```
    """
    if not bool(settings):  # includes empty dict
        sets = Settings()
        settings = sets.embeddings
    elif isinstance(settings, Settings):
        settings = settings.embeddings
    elif isinstance(settings, dict):
        # checked by pydantic model
        settings = EmbeddingSettings(**settings)  # type: ignore

    return embeddings_library[settings]

create_kernel_from_objects(human_prompt, *, system_prompt=None, language_model=None)

Creates a Langchain runnable from a prompt template and a language settings object. The resulting runnable is not registered in the prompts library; it is returned directly.

Parameters:

- human_prompt (str, required): prompt text
- system_prompt (str | None, default None): system prompt text
- language_model (BaseChatModel | LanguageModelSettings | Settings | None, default None): either a Langchain BaseChatModel, or a LanguageModelSettings object, or None (default). In the latter case the language.minor from the config file is used to create the model.

Returns:

- RunnableType: a Langchain runnable, a type aliased as RunnableType.

Example:

```python
human_prompt = '''
Provide the questions to which the text answers.

TEXT:
{text}
'''
settings = Settings()
try:
    model = create_kernel_from_objects(
        human_prompt=human_prompt,
        system_prompt="You are a helpful assistant",
        language_model=settings.aux,
    )
except Exception:
    ...

# model use:
try:
    response = model.invoke({'text': "Logistic regression is used"
        + " when the outcome variable is binary"})
except Exception:
    ...
```
Source code in lmm/models/langchain/runnables.py
def create_kernel_from_objects(
    human_prompt: str,
    *,
    system_prompt: str | None = None,
    language_model: (
        BaseChatModel | LanguageModelSettings | Settings | None
    ) = None,
) -> RunnableType:
    """
    Creates a Langchain runnable from a prompt template and
    a language settings object. This name is not registered in the
    prompts library; it is available directly.

    Args:
        human_prompt: prompt text
        system_prompt: system prompt text
        language_model: either a Langchain BaseChatModel, or
            a LanguageModelSettings object, or None (default). In
            this latter case the language.minor from the config
            file is used to create the model.

    Returns:
        a Langchain runnable, a type aliased as `RunnableType`.

    Example:
    ```python
    human_prompt = '''
    Provide the questions to which the text answers.

    TEXT:
    {text}
    '''
    settings = Settings()
    try:
        model = create_kernel_from_objects(
            human_prompt=human_prompt,
            system_prompt="You are a helpful assistant",
            language_model=settings.aux,
        )
    except Exception ...

    # model use:
    try:
        response = model.invoke({'text': "Logistic regression is used"
            + " when the outcome variable is binary"})
    except Exception ...
    ```
    """
    if language_model is None:
        settings = Settings()
        language_model = create_model_from_settings(settings.minor)
        name = f"Custom:{settings.minor}"
    elif isinstance(language_model, Settings):
        name = f"Custom:{language_model.minor}"
        language_model = create_model_from_settings(
            language_model.minor
        )
    elif isinstance(language_model, LanguageModelSettings):
        name = f"Custom:{language_model}"
        language_model = create_model_from_settings(language_model)
    else:  # it's a BaseChatModel
        name = "Custom"

    # Langchain prompt
    prompt: ChatPromptTemplate
    if system_prompt is not None:
        prompt = ChatPromptTemplate.from_messages(  # type: ignore
            [
                SystemMessagePromptTemplate.from_template(
                    system_prompt
                ),
                HumanMessagePromptTemplate.from_template(
                    human_prompt
                ),
            ]
        )
    else:
        prompt = ChatPromptTemplate.from_template(human_prompt)

    # combine into a runnable
    runnable: RunnableType = prompt | language_model | StrOutputParser()  # type: ignore
    # .name is a member function of RunnableSerializable
    # inited to None, which we re-initialize here
    runnable.name = name
    return runnable

create_runnable(runnable_name, user_settings=None, system_prompt=None, **kwargs)

Creates a Langchain chain (a 'runnable') by combining the tools/prompts registered under runnable_name with the configuration from config.toml and optional override settings.

The function maps different chains to their appropriate language model settings categories. For example:

- 'query', 'query_with_context' -> major model settings
- 'question_generator', 'summarizer' -> minor model settings
- 'allowed_content_validator', 'context_validator' -> aux model settings

Settings hierarchy (highest to lowest priority):

1. user_settings parameter (if provided)
2. config.toml file settings
3. Default settings from the Settings class
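The priority order amounts to a layered dictionary merge. The sketch below is illustrative only; resolve_settings is a hypothetical helper, not part of the module:

```python
# Sketch of the settings-resolution order: explicit user settings win,
# then config.toml values, then class defaults.
def resolve_settings(user_settings, config_settings, defaults):
    merged = dict(defaults)
    merged.update(config_settings or {})
    merged.update(user_settings or {})
    return merged


defaults = {"model": "OpenAI/gpt-4o-mini", "temperature": 0.1}
from_config = {"model": "Mistral/mistral-small-latest"}
override = {"temperature": 0.7}
print(resolve_settings(override, from_config, defaults))
# {'model': 'Mistral/mistral-small-latest', 'temperature': 0.7}
```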

Parameters:

- runnable_name (PromptNames | str, required): The name of the runnable to create. If it is one of the supported names defined in the PromptNames literal type, returns a cached runnable object. Otherwise, looks up the prompt_library dictionary for a prompt with that name and returns a runnable object for a chat with that prompt.
- user_settings (dict[str, str] | LanguageModelSettings | Settings | None, default None): Optional settings to override the default configuration. Can be:
    - dict[str, str]: a dictionary with a 'model' key
    - LanguageModelSettings: a Pydantic model instance
    - None: use settings from config.toml or defaults
- system_prompt (str | None, default None): System prompt used in messages with the language model.

Returns:

- RunnableType (RunnableSerializable[dict[str, str], str]): a Langchain runnable chain that combines a prompt template, a language model, and a string output parser. The chain accepts a dictionary of template variables and returns a string response.

Raises:

- ValueError: if runnable_name is not supported or if user_settings contains invalid model source names. No check is made at this stage that the model names are correct (as they frequently change); instead, failure occurs when the .invoke member function is called.
- ValidationError, TypeError: alternative errors raised in the same circumstances as above.
- ImportError: for libraries that are not installed.

Examples:

Create a runnable with default settings read from the configuration file:

```python
try:
    runnable = create_runnable("query")
except Exception:
    ...
```

Override with a dictionary:

```python
try:
    runnable = create_runnable("summarizer",
        {"model": "OpenAI/gpt-4o"})
except Exception:
    ...
```

Override with a settings object:

```python
from lmm.config.config import LanguageModelSettings

settings = LanguageModelSettings(
    model="Mistral/mistral-small-latest"
)
try:
    runnable = create_runnable("question_generator", settings)
except Exception:
    ...
```

The runnable object may be used with Langchain .invoke syntax:

```python
try:
    response = runnable.invoke(
        {'text': "Logistic regression is used when the outcome"
            + " variable is binary."}
    )
except Exception:
    ...
```
Source code in lmm/models/langchain/runnables.py
def create_runnable(
    runnable_name: PromptNames | str,
    user_settings: (
        dict[str, str] | LanguageModelSettings | Settings | None
    ) = None,
    system_prompt: str | None = None,
    **kwargs: MetadataPrimitiveWithList,
) -> RunnableType:  # RunnableSerializable[dict[str, str], str]
    """
    Creates a Langchain chain (a 'runnable') by combining tools/prompts
    created under the runnable_name parameters and configurations from
    config.toml with optional override settings.

    The function maps different chains to their appropriate
    language model settings categories. For example,
    - 'query', 'query_with_context' -> major model settings
    - 'question_generator', 'summarizer' -> minor model settings
    - 'allowed_content_validator', 'context_validator' ->
                                            aux model settings

    Settings Hierarchy (highest to lowest priority):
    1. user_settings parameter (if provided)
    2. config.toml file settings
    3. Default settings from Settings class

    Args:
        runnable_name: The name of the runnable to create. If one of
            the supported names defined in the PromptNames
            literal type, returns a cached runnable object. Otherwise,
            looks up in the prompt_library dictionary if there is
            a prompt with that runnable_name, and returns a runnable
            object for a chat with that prompt.
        user_settings: Optional settings to override the default
            configuration. Can be one of:
            - dict[str, str]: Dictionary with 'model' key
            - LanguageModelSettings: Pydantic model instance
            - Settings: full settings object with per-tier models
            - None: Use settings from config.toml or defaults
        system_prompt: System prompt used in messages with the
            language model.

    Returns:
        A RunnableType object (RunnableSerializable[dict[str, str], str]):
            a Langchain runnable chain that combines a prompt template, a
            language model, and a string output parser. The chain accepts a
            dictionary of template variables and returns a string
            response.

    Raises:
        ValueError: If runnable_name is not supported or if user_settings
            contains invalid model source names. No check is made at this
            stage that the model names are correct (as they frequently
            change); instead, failure occurs when the .invoke member function
            is called.
        ValidationError, TypeError: alternative errors raised under the
            same circumstances as above.
        ImportError: if a required provider library is not installed.

    Examples:
        Create runnable with default settings read from configuration
        file:
        ```python
        try:
            runnable = create_runnable("query")
        except (ValueError, ImportError):
            ...  # handle invalid names or missing provider libraries
        ```

        Override with dictionary:
        ```python
        try:
            runnable = create_runnable("summarizer",
                {"model": "OpenAI/gpt-4o"})
        except (ValueError, ImportError):
            ...  # handle invalid names or missing provider libraries
        ```

        Override with settings object:
        ```python
        from lmm.config.config import LanguageModelSettings
        settings = LanguageModelSettings(
            model="Mistral/mistral-small-latest"
        )
        try:
            runnable = create_runnable("question_generator", settings)
        except (ValueError, ImportError):
            ...  # handle invalid names or missing provider libraries
        ```

        The runnable object may be used with Langchain `.invoke` syntax:

        ```python
        try:
            response = runnable.invoke(
                {'text': "Logistic regression is used when the outcome"
                    + " variable is binary."}
            )
        except Exception:
            ...  # invalid model names only fail here, at invocation time
        ```
    """

    def _create_or_get(
        settings: LanguageModelSettings,
        runnable_name: PromptNames,
        system_prompt: str | None,
        **kwargs: MetadataPrimitiveWithList,
    ) -> RunnableType:
        model: RunnableDefinition = RunnableDefinition(
            runnable_name=runnable_name,
            settings=settings,
            system_prompt_override=system_prompt,
            params=_dict_to_runnable_par(kwargs),
        )
        return runnable_library[model]

    settings: Settings
    match user_settings:
        case dict() if bool(user_settings):
            try:
                settings = Settings(
                    major=user_settings,  # type: ignore
                    minor=user_settings,  # type: ignore
                    aux=user_settings,  # type: ignore
                )
            except Exception as e:
                raise ValueError(f"Invalid model definition:\n{e}")
        case LanguageModelSettings():
            settings = Settings(
                major=user_settings,
                minor=user_settings,
                aux=user_settings,
            )
        case Settings():
            settings = user_settings
        case None | {}:
            settings = Settings()
        case _:
            raise ValueError(
                f"Invalid model definition: {user_settings}"
            )

    # Logic to retrieve or create prompt definition to check model tier
    prompt_definition: PromptDefinition
    params_dict = dict(kwargs)

    # Prompt definitions with parameters go through create_prompt_definition
    # (which validates them); parameterless names are looked up directly in
    # prompt_library, which also covers custom prompts.
    if params_dict:
        try:
            prompt_definition = create_prompt_definition(
                runnable_name, **params_dict  # type: ignore
            )
        except ValueError:
            # Custom prompts do not take parameters, so a failure here means
            # the parameters were intended for a prompt that does not
            # support them; propagate the error.
            raise
    else:
        try:
            prompt_definition = prompt_library[runnable_name] # type: ignore
        except Exception as e:
            raise ValueError(f"{runnable_name} is not a valid runnable name ({e})")

    match prompt_definition.model_tier:
        case 'major':
            settings_to_use = settings.major
        case 'minor':
            settings_to_use = settings.minor
        case 'aux':
            settings_to_use = settings.aux
        case _:
            settings_to_use = settings.minor

    return _create_or_get(
        settings_to_use, 
        runnable_name, # type: ignore
        system_prompt,
        **kwargs
    )

Message Iterator

Message Iterator Module

This module provides functionality to create iterators that generate sequential messages. They are used to feed messages through a fake language model.
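
The intended use can be sketched with a minimal fake "model" that pulls its replies from such an iterator. FakeModel here is a hypothetical stand-in for illustration, not a class from this package:

```python
class FakeModel:
    """Hypothetical stand-in: answers every prompt with the next canned reply."""

    def __init__(self, replies):
        self._replies = iter(replies)

    def invoke(self, _prompt: str) -> str:
        # Each call consumes one message from the iterator.
        return next(self._replies)


def numbered_messages(prefix: str = "Message"):
    """Generator with the same behaviour as MessageIterator: infinite '{prefix} {n}'."""
    n = 1
    while True:
        yield f"{prefix} {n}"
        n += 1


model = FakeModel(numbered_messages("Reply"))
print(model.invoke("hello"))  # Reply 1
print(model.invoke("again"))  # Reply 2
```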

ConstantMessageIterator

An iterator that generates the same message with which it was initialized.

The iterator is infinite and will continue generating messages indefinitely.

Source code in lmm/models/message_iterator.py
class ConstantMessageIterator:
    """
    An iterator that generates the same message with which it was initialized.

    The iterator is infinite and will continue generating messages
    indefinitely.
    """

    def __init__(self, message: str = "Message") -> None:
        """
        Initialize the MessageIterator.

        Args:
            message: The message to return in generated messages. Defaults to
                "Message".
        """
        self.message = message
        self.counter = 0

    def __iter__(self) -> Iterator[str]:
        """Return the iterator object itself."""
        return self

    def __next__(self) -> str:
        """
        Generate the next message in the sequence.

        Returns:
            The string message.
        """
        self.counter += 1
        return self.message
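
Since the iterator always yields its initial message, it behaves like an unbounded itertools.repeat; islice can take a finite slice of the infinite stream for safe consumption (a sketch of the behaviour, not package code):

```python
from itertools import islice, repeat

# An unbounded repeat("Alert") yields the same stream as
# ConstantMessageIterator("Alert"); islice bounds it.
constant = repeat("Alert")
print(list(islice(constant, 3)))  # ['Alert', 'Alert', 'Alert']
```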

__init__(message='Message')

Initialize the MessageIterator.

Parameters:

message (str, default "Message"): The message to return in generated messages.
Source code in lmm/models/message_iterator.py
def __init__(self, message: str = "Message") -> None:
    """
    Initialize the MessageIterator.

    Args:
        message: The message to return in generated messages. Defaults to
            "Message".
    """
    self.message = message
    self.counter = 0

__iter__()

Return the iterator object itself.

Source code in lmm/models/message_iterator.py
def __iter__(self) -> Iterator[str]:
    """Return the iterator object itself."""
    return self

__next__()

Generate the next message in the sequence.

Returns:

str: The string message.

Source code in lmm/models/message_iterator.py
def __next__(self) -> str:
    """
    Generate the next message in the sequence.

    Returns:
        The string message.
    """
    self.counter += 1
    return self.message

MessageIterator

An iterator that generates sequential messages with a customizable prefix.

The iterator is infinite and will continue generating messages indefinitely. Messages follow the pattern: "{prefix} {counter}" where counter starts at 1.

Source code in lmm/models/message_iterator.py
class MessageIterator:
    """
    An iterator that generates sequential messages with a customizable prefix.

    The iterator is infinite and will continue generating messages
    indefinitely. Messages follow the pattern: "{prefix} {counter}" where
    counter starts at 1.
    """

    def __init__(self, prefix: str = "Message") -> None:
        """
        Initialize the MessageIterator.

        Args:
            prefix: The prefix to use for generated messages. Defaults to
                "Message".
        """
        self.prefix = prefix
        self.counter = 1

    def __iter__(self) -> Iterator[str]:
        """Return the iterator object itself."""
        return self

    def __next__(self) -> str:
        """
        Generate the next message in the sequence.

        Returns:
            A string in the format "{prefix} {counter}"
        """
        message = f"{self.prefix} {self.counter}"
        self.counter += 1
        return message
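
The same numbered stream can be expressed with itertools.count, which may be convenient when only a plain iterable is needed (a sketch of equivalent behaviour, not package code):

```python
from itertools import count, islice

# Equivalent stream to MessageIterator("Alert"): 'Alert 1', 'Alert 2', ...
messages = (f"Alert {n}" for n in count(1))
print(list(islice(messages, 3)))  # ['Alert 1', 'Alert 2', 'Alert 3']
```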

__init__(prefix='Message')

Initialize the MessageIterator.

Parameters:

prefix (str, default "Message"): The prefix to use for generated messages.
Source code in lmm/models/message_iterator.py
def __init__(self, prefix: str = "Message") -> None:
    """
    Initialize the MessageIterator.

    Args:
        prefix: The prefix to use for generated messages. Defaults to
            "Message".
    """
    self.prefix = prefix
    self.counter = 1

__iter__()

Return the iterator object itself.

Source code in lmm/models/message_iterator.py
def __iter__(self) -> Iterator[str]:
    """Return the iterator object itself."""
    return self

__next__()

Generate the next message in the sequence.

Returns:

str: A string in the format "{prefix} {counter}".

Source code in lmm/models/message_iterator.py
def __next__(self) -> str:
    """
    Generate the next message in the sequence.

    Returns:
        A string in the format "{prefix} {counter}"
    """
    message = f"{self.prefix} {self.counter}"
    self.counter += 1
    return message

yield_constant_message(message='Message')

Create and return a ConstantMessageIterator instance.

This function creates an iterator that generates the same message repeatedly. The iterator is infinite and will continue generating the message indefinitely.

Parameters:

message (str, default "Message"): The message to generate.

Returns:

ConstantMessageIterator: A ConstantMessageIterator instance.

Example

>>> iterator = yield_constant_message()
>>> next(iterator)
'Message'
>>> next(iterator)
'Message'

>>> custom_iterator = yield_constant_message("Alert")
>>> next(custom_iterator)
'Alert'
>>> next(custom_iterator)
'Alert'

Source code in lmm/models/message_iterator.py
def yield_constant_message(
    message: str = "Message",
) -> ConstantMessageIterator:
    """
    Create and return a ConstantMessageIterator instance.

    This function creates an iterator that generates the same
    message repeatedly. The iterator is infinite and will continue
    generating the message indefinitely.

    Args:
        message: The message to generate. Defaults to "Message".

    Returns:
        A ConstantMessageIterator instance.

    Example:
        >>> iterator = yield_constant_message()
        >>> next(iterator)
        'Message'
        >>> next(iterator)
        'Message'

        >>> custom_iterator = yield_constant_message("Alert")
        >>> next(custom_iterator)
        'Alert'
        >>> next(custom_iterator)
        'Alert'
    """
    return ConstantMessageIterator(message)

yield_message(prefix='Message')

Create and return a MessageIterator instance.

This function creates an iterator that generates sequential messages with the specified prefix. The iterator is infinite and will continue generating messages indefinitely.

Parameters:

prefix (str, default "Message"): The prefix to use for generated messages.

Returns:

MessageIterator: A MessageIterator instance that generates messages like "Message 1", "Message 2", "Message 3", etc.

Example

>>> iterator = yield_message()
>>> next(iterator)
'Message 1'
>>> next(iterator)
'Message 2'

>>> custom_iterator = yield_message("Alert")
>>> next(custom_iterator)
'Alert 1'
>>> next(custom_iterator)
'Alert 2'

Source code in lmm/models/message_iterator.py
def yield_message(prefix: str = "Message") -> MessageIterator:
    """
    Create and return a MessageIterator instance.

    This function creates an iterator that generates sequential messages
    with the specified prefix. The iterator is infinite and will continue
    generating messages indefinitely.

    Args:
        prefix: The prefix to use for generated messages. Defaults to "Message"

    Returns:
        A MessageIterator instance that generates messages like:
        "Message 1", "Message 2", "Message 3", etc.

    Example:
        >>> iterator = yield_message()
        >>> next(iterator)
        'Message 1'
        >>> next(iterator)
        'Message 2'

        >>> custom_iterator = yield_message("Alert")
        >>> next(custom_iterator)
        'Alert 1'
        >>> next(custom_iterator)
        'Alert 2'
    """
    return MessageIterator(prefix)