Releases: YellowRoseCx/koboldcpp-rocm
KoboldCPP-v1.85.yr0-ROCm
ROCm backend changes
- This release includes two build files to try if one doesn't work for you; the only difference is in the GPU kernel files that are included:
  - `koboldcpp_rocm.exe` is built with files more similar to how v1.79.yr1-ROCm was compiled.
  - `koboldcpp_rocm_b2.exe` is built with the same files as the previous version.
- Support has been added for experimental HIPGraph usage. (Disabled by default, no performance increase yet.)
- HIP virtual memory management has been added, but is disabled until upstream fixes land (ggml-org#11405).
koboldcpp-rocm-1.85
Now with 5% more kobo edition
New Features:
- NEW: Added Server-Sided (networked) save slots! You can now specify a database file when launching KoboldCpp using `--savedatafile`. Then, you will be able to save and load persistent stories over the network to that KoboldCpp server, and access them from any other browser or device connected to it over the network. This can also be combined with `--password` to require an API key to save/load the stories.
- Added the ability to switch models, settings and configs at runtime! This also allows for remote model swapping (see the launch sketch after this list). Credits to @esolithe for the original reference implementation.
  - Launch with `--admin` to enable this feature, and also provide `--admindir` containing `.kcpps` launch configs.
  - Optionally, provide `--adminpassword` to secure admin functions.
  - You will be able to swap between any model's config at runtime from the Admin panel in Lite. You can prepare .kcpps configs for different layers, backends, models, etc.
  - KoboldCpp will then terminate the current instance and relaunch with the new config.
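To illustrate, here is a minimal launch sketch combining the networked save slots with admin mode; the file paths, password values, and model name are illustrative, not defaults:

```bash
# Serve a model with networked save slots and runtime config switching.
# saves.db is created/used as the server-side story database.
python koboldcpp.py --model /models/mymodel.gguf \
    --savedatafile saves.db --password mysecretkey \
    --admin --admindir /configs --adminpassword adminsecret
```

Any browser or device on the network can then save and load stories to this server, and the Admin panel in Lite can swap between any `.kcpps` configs placed in `/configs`.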
- Added Top-N Sigma sampler (credit @EquinoxPsychosis). Note that this sampler can only be combined with Top-K, Temperature, and XTC.
- Added `--exportconfig`, allowing users to export any set of launch arguments as a .kcpps config file from the command line (example below). This file can also be used subsequently for model switching in admin mode.
- Minor refactors for TFS and rep pen by @Reithan.
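As a hedged sketch of `--exportconfig` (assuming the flag takes an output filename; the model path and settings here are illustrative):

```bash
# Export these launch arguments as a reusable .kcpps config file
python koboldcpp.py --model /models/mymodel.gguf --contextsize 8192 \
    --gpulayers 40 --exportconfig mymodel.kcpps
```

The resulting `mymodel.kcpps` can then be placed in the `--admindir` folder for runtime model switching.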
- CLIP vision embeddings can now be reused between multiple requests, so they won't have to be reprocessed if the images don't change.
- Context shifting is now disabled when using mrope (used in Qwen2VL), as it does not work correctly.
- The chat completions adapter now defaults to AutoGuess. Set it to "Alpaca" for the old behavior instead.
- You can now set the maximum resolution accepted by vision mmprojs with `--visionmaxres`. Images larger than that will be downscaled before processing.
- You can now set a length limit for TTS using `--ttsmaxlen` at launch; this limits the number of TTS tokens allowed to be generated (range 512 to 4096). Each second of audio is about 75 tokens, so the 4096-token cap corresponds to roughly 55 seconds of audio. Both limits are combined in the sketch below.
- Added support for using aria2c and wget for model downloading if detected on the system (credits @henk717).
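A minimal launch sketch for the two new limits, with illustrative model and mmproj paths:

```bash
# Cap vision input at 1024px and TTS output at 2048 tokens (~27s of audio)
python koboldcpp.py --model /models/mymodel.gguf --mmproj /models/mmproj.gguf \
    --visionmaxres 1024 --ttsmaxlen 2048
```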
- It's also now possible to specify multiple URLs when loading multipart models online with `--model [url1] [url2]...` (CLI only), which allows KoboldCpp to download multiple model file URLs (example below).
- Added automatic recovery in admin mode: if switching to a faulty config fails, KoboldCpp will attempt to roll back to the original known-good config.
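A sketch of multipart downloading; the URLs are placeholders for the actual shard files:

```bash
# Download and load a two-part split GGUF directly from URLs
python koboldcpp.py --model \
    https://example.com/model-00001-of-00002.gguf \
    https://example.com/model-00002-of-00002.gguf
```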
- Added cloudflared tunnel download for aarch64 (thanks @FlippFuzz). Also, allowed SSL combined with remote tunnels.
Kobold Lite:
- NEW: Added deepseek instruct template, and added support for reasoning/thinking template tags. You can configure thinking rendering behavior from Context > Tokens > Thinking
- NEW: Finally allows specifying individual start and end instruct tags instead of combining them. Toggle this in Settings > Toggle End Tags.
- NEW: Multi-pass websearch added. This allows you to specify a template that is used to generate the search query.
- Added improved thinking support: displaying thoughts, force-injecting `<think>` tokens in AI replies, and filtering out old thoughts in subsequent generations.
- Reworked and improved the load/save UI, adding 2 extra local slots and 8 extra remote save slots.
- Top-N sigma support
- Added customization options for assistant jailbreak prompt
- Refactored 3rd party scenario loader (thanks @Desaroll)
- Fixed websearch button visibility
- Improved instruct formatting in classic UI
- Fixed some LaTeX and markdown edge cases
- Upped max length slider to 1024 if detected context is larger than 4096.
- Added a websearch toggle button
- TTS now allows downloading the audio output as a file when testing it, instead of just playing the sound.
- Some regex parsing fixes
- Added admin panel
- Multiple other fixes and improvements
Fixes:
- Merged fixes and improvements from upstream
- Fixed .kcppt templates backend override not working
- Updated clinfo binary for Windows.
- Fixed MoE experts override not working for Deepseek
- Fixed multiple loader bugs when using the AutoGuess adapter.
- Fixed images failing to generate when using the AutoGuess adapter.
- Removed TTS caching as it was not very good.
- Fixed a bug with TTS that could cause a crash.
KoboldCPP-v1.83.1.yr1-ROCm
Pretty sure I figured it out: CMake flag fix.
KoboldCPP-v1.82.4.yr0-ROCm
Apparently there's starting to be trouble again with the "unofficially supported" ROCm GPUs. I'm trying to look into it when I'm at home and able to. If the regular koboldcpp_rocm.exe doesn't work for you, please try the rocm-5.7 version.
KoboldCPP-v1.82.1.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.82.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.81.1.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
KoboldCPP-v1.80.3.yr0-ROCm
Update cmake-rocm-windows.yml
KoboldCPP-v1.79.1.yr1-ROCm
attempt 6700xt fix for cmake-rocm-windows.yml
KoboldCPP-v1.79.1.yr0-ROCm
attempt 6700xt fix for cmake-rocm-windows.yml
KoboldCPP-v1.78.yr0-ROCm
koboldcpp-rocm-1.78
- NEW: Added support for Flux and Stable Diffusion 3.5 models: Image generation has been updated with new arch support (thanks to stable-diffusion.cpp) with additional enhancements. You can use either fp16 or fp8 safetensor models, or the GGUF models. Supports all-in-one models (bundled T5XXL, Clip-L/G, VAE) or loading them individually.
- Grab an all-in-one flux model here: https://huggingface.co/Comfy-Org/flux1-dev/blob/main/flux1-dev-fp8.safetensors
- Alternatively, we have a ready-to-use `.kcppt` template that will set up and download everything you need here: https://huggingface.co/koboldcpp/kcppt/resolve/main/Flux1-Dev.kcppt
- Large image handling is also more consistent with VAE tiling; 1024x1024 should work nicely for SDXL and Flux.
- You can specify the new image gen components by loading them with `--sdt5xxl`, `--sdclipl` and `--sdclipg` (for SD3.5); they work with URL resources as well (see the sketch after this list).
- Note: FP16 Flux needs over 20GB of VRAM to work. If you have less VRAM, you should use the quantized GGUFs, or select Compress Weights when loading the Flux model. SD3.5 Medium is more forgiving.
- As before, it can be used with the bundled StableUI at http://localhost:5001/sdui/
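As an illustration, here is a hedged launch sketch loading the image gen components individually; all file paths are placeholders, and `--sdmodel` (the flag for the main image model) plus `--sdvae` are assumed from prior releases:

```bash
# Load a quantized Flux diffusion model with its text encoders and VAE
python koboldcpp.py --model /models/textmodel.gguf \
    --sdmodel /models/flux1-dev-Q8_0.gguf \
    --sdt5xxl /models/t5xxl_fp16.safetensors \
    --sdclipl /models/clip_l.safetensors \
    --sdvae /models/ae.safetensors
```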
- Debug mode prints penalties for XTC
- Added a new flag `--nofastforward`, which forces full prompt reprocessing on every request. It can potentially give more repeatable/reliable/consistent results in some cases.
- CLBlast support is still retained, but has been further downgraded to "compatibility mode" and is no longer recommended (use Vulkan instead). CLBlast GPU offload must now maintain a duplicate copy of the layers in RAM as well, as it now piggybacks off the CPU backend.
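A minimal sketch of the new flag; the model path is illustrative:

```bash
# Force full prompt reprocessing on every request
python koboldcpp.py --model /models/mymodel.gguf --nofastforward
```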
- Added the common identity provider endpoint `/.well-known/serviceinfo` (Haidra-Org/AI-Horde#466, PygmalionAI/aphrodite-engine#807, theroyallab/tabbyAPI#232).
- Reverted some changes that reduced speed in HIPBLAS.
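Once a server is running, the endpoint can be queried directly; this assumes the default port of 5001:

```bash
# Fetch service identity metadata from a running instance
curl http://localhost:5001/.well-known/serviceinfo
```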
- Fixed a bug where bad logprobs JSON was output when logits were `-Infinity`
- Updated Kobold Lite, multiple fixes and improvements
- Added support for custom CSS styles
- Added support for generating larger images (select BigSquare in image gen settings)
- Fixed some streaming issues when connecting to Tabby backend
- Better world info length limiting (capped at 50% of max context before appending to memory)
- Added support for Clip Skip for local image generation.
- Merged fixes and improvements from upstream
To use, download and run koboldcpp_rocm.exe, which is a one-file pyinstaller.
If you're using Linux, clone the repo and build in a terminal with `make LLAMA_HIPBLAS=1 -j`.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI.
Once loaded, you can connect like this (or use the full KoboldAI client): http://localhost:5001
For more information, be sure to run the program from the command line with the `--help` flag.
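For Linux, a hedged end-to-end sketch of the steps above; the model path is a placeholder:

```bash
# Clone, build with HIP/ROCm support, then launch and connect
git clone https://github.com/YellowRoseCx/koboldcpp-rocm
cd koboldcpp-rocm
make LLAMA_HIPBLAS=1 -j
python koboldcpp.py --model /models/mymodel.gguf --port 5001
# Then open http://localhost:5001 in a browser
```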
Release notes from: https://github.com/LostRuins/koboldcpp/releases/tag/v1.78