v11.1.3476 (build: Jan 12 2026)
LLM server

Some BOSS-Offline reports use generative AI based on large language models (LLM neural networks), so to use them you need to configure the settings on this page. You can configure a local server, a cloud server, or both at the same time. If both are configured, the local server takes priority, except when neutral data (data that contains no confidential or personal information) is being transmitted. For a local server the Ollama framework is supported; for a cloud server, ChatGPT or YandexGPT.

Server URL
Specify the http or https URL of the server with Ollama installed. Usually this is port 11434.
Example: http://192.168.0.111:11434

API-key
ChatGPT: create an API key and copy it here.
YandexGPT: create a billing account here, then obtain an OAuth token and copy it here.

Model
Ollama: specify the loaded model to use; currently, models from the qwen3 or deepseek-r1 families are recommended. For example:
- deepseek-r1:14b
- deepseek-r1:32b
- qwen3:14b
- qwen3:32b
You need to specify the exact model that is downloaded and installed in Ollama. The complete list of models is available on the Ollama website.
ChatGPT: gpt-4o, o4-mini, gpt-4.1, gpt-4.1-mini, gpt-5, gpt-5-mini, and others.
YandexGPT:
- gpt://<folder_ID>/yandexgpt
- gpt://<folder_ID>/yandexgpt/latest
- gpt://<folder_ID>/yandexgpt-lite

Notes on Ollama:
- a GPU with CUDA support is not required for operation, but is highly recommended: performance will be an order of magnitude higher than even on multi-core CPU servers;
- the model must fit completely into video memory or RAM;
- the larger the model, the better the quality, but the slower the speed;
- several GPUs may be used (if the video memory of one GPU is not enough to hold the entire model);
- when using a GPU, CPU and RAM requirements can be minimal (for example, 2 CPUs and 4 GB of RAM are quite enough).
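As a sketch of how the Server URL and Model fields are used together, the request below composes a non-streaming call to Ollama's /api/generate endpoint. The host address and model name are the examples from this page; substitute your own values.

```shell
# Hypothetical server address and model, taken from the examples on this page.
OLLAMA_URL="http://192.168.0.111:11434"
MODEL="qwen3:14b"

# Body of a non-streaming generation request to the Ollama API.
PAYLOAD="{\"model\": \"$MODEL\", \"prompt\": \"Hello\", \"stream\": false}"
echo "$PAYLOAD"

# To actually query the server, uncomment the line below:
# curl -s "$OLLAMA_URL/api/generate" -d "$PAYLOAD"
```

If the request returns an error such as "model not found", the Model field does not match a model installed on the server.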
Example of installing Ollama on Linux Ubuntu (it is assumed that the GPU drivers are already installed):

curl -fsSL https://ollama.com/install.sh | sh

For non-localhost access and to increase the allowed model-loading time, it is recommended to make additional settings:

sudo nano /etc/systemd/system/ollama.service

Add the following lines to the [Service] section:

Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_LOAD_TIMEOUT=60m"

Then save the file and execute:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Then download and install the model. For example, for qwen3:32b:

ollama run qwen3:32b
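After the steps above, the checks below can confirm the server is reachable and the model is installed; these are a sketch using Ollama's standard /api/version and /api/tags endpoints, with the example address from this page (replace it with your server's address).

```shell
# Example address from this page; replace with your Ollama server's URL.
OLLAMA_URL="http://192.168.0.111:11434"

# The API should answer with its version once the service is running:
curl -s --max-time 5 "$OLLAMA_URL/api/version" || echo "server not reachable yet"

# The model named in the settings (e.g. qwen3:32b) must appear in this list:
curl -s --max-time 5 "$OLLAMA_URL/api/tags" || echo "server not reachable yet"
```

If the server answers only from localhost, re-check that OLLAMA_HOST=0.0.0.0 was added to the [Service] section and that the service was restarted.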
© KICKIDLER DLP