Overview

Request 1128890 accepted

- update to 0.13.5:
* [Component] share loading entry for component and module
(#2945)
* Initial support for the component model proposal.
* This PR allows WasmEdge to recognize the component and module
format.
* Provide options for enabling OpenBLAS, Metal, and cuBLAS.
* Bump llama.cpp to b1383
* Build thirdparty/ggml only when the ggml backend is enabled.
* Enable the ggml plugin on the macOS platform.
* Introduce `AUTO` detection. The Wasm application no longer
needs to specify the hardware spec (e.g., CPU or GPU); the
runtime detects it automatically. (See the preload sketch
after this changelog.)
* Unified the preload options with case-insensitive matching
* Introduce `metadata` for setting the ggml options. (See the
JSON sketch after this changelog.)
* The following options are supported:
* `enable-log`: `true` to enable logging. (default: `false`)
* `stream-stdout`: `true` to print the inferred tokens in
streaming mode to standard output. (default: `false`)
* `ctx-size`: Set the context size the same as the `--ctx-size`
parameter in llama.cpp. (default: `512`)
* `n-predict`: Set the number of tokens to predict, the same as
the `--n-predict` parameter in llama.cpp. (default: `512`)
* `n-gpu-layers`: Set the number of layers to store in VRAM,
the same as the `--n-gpu-layers` parameter in llama.cpp.
(default: `0`)
* `reverse-prompt`: Set the token pattern at which you want to
halt the generation. Similar to the `--reverse-prompt`
parameter in llama.cpp. (default: `""`)
* `batch-size`: Set the batch size for prompt processing.
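
For reference, the preload target (including `AUTO`) is selected through WasmEdge's `--nn-preload` flag when the WASI-NN ggml plugin is installed. A minimal sketch; the model file and application Wasm names below are placeholders, not part of this release:

```
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm
```

With `AUTO`, the runtime decides whether to run on CPU or GPU, so the Wasm application does not need to hard-code a target.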
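The `metadata` options above are plain key/value settings. A minimal sketch of such a configuration, assuming the plugin accepts it as a JSON string, using only the defaults documented above (`batch-size` is left out because its default is not listed here):

```json
{
  "enable-log": false,
  "stream-stdout": false,
  "ctx-size": 512,
  "n-predict": 512,
  "n-gpu-layers": 0,
  "reverse-prompt": ""
}
```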


Request History

dirkmueller created request


avicenzi accepted request
