YClient requires an OpenAI-compatible LLM to run. You can use any model that is compatible with OpenAI's API, either commercial or self-hosted. Here we briefly describe how to set up a local LLM server using ollama.
First, you need to install ollama on your local machine. Download the latest release from the official website and follow the installation instructions.
Once you have installed ollama, you need to pull the model you would like to use.
You can find a list of available models on the ollama models page.
To pull a model, use the following command:
ollama pull <model_name>
For example, to pull the llama3 model, you would run:
ollama pull llama3
To start the LLM server, use the following command:
ollama serve
This will start the LLM server on your local machine. You can now use the server to interact with the model.
You can interact with the LLM server using the ollama command-line tool.
ollama run llama3
This will start an interactive session with the llama3 model. You can type text and the model will generate a response.
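Since ollama also exposes an OpenAI-compatible HTTP endpoint at http://127.0.0.1:11434/v1, you can query the server programmatically as well. The following is a minimal sketch, assuming the openai Python package is installed and the llama3 model has already been pulled:

from openai import OpenAI

# Point the client at ollama's OpenAI-compatible endpoint.
# The api_key can be any non-empty string; ollama does not check it.
client = OpenAI(base_url="http://127.0.0.1:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

If the call returns a reply, the server is up and ready to be used by YClient.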
To use ollama with YClient, you need to configure the client to connect to the LLM server.
You can do this by editing config.json (see Scenario Design), specifying http://127.0.0.1:11434/v1 as the LLM server URL and llama3 (or any other model, or list of models, you have previously installed) as the selected model.
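As an illustrative sketch only, the configuration could be updated programmatically along the following lines; the key names used here ("servers", "llm", "agents", "llm_agents") are placeholders, and the actual config.json schema is described in Scenario Design:

import json

# Illustrative sketch: point YClient at the local ollama server.
# The key names below are placeholders; refer to Scenario Design
# for the actual config.json schema.
with open("config.json") as f:
    config = json.load(f)

config["servers"]["llm"] = "http://127.0.0.1:11434/v1"  # ollama's OpenAI-compatible endpoint
config["agents"]["llm_agents"] = ["llama3"]             # model(s) pulled earlier

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)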
NB: Ollama is just one of many options for running LLMs locally. You can use any other LLM server that is compatible with OpenAI's API (e.g., LM Studio, llama-cpp-python).