Spaetzle-v8-7b
This model is intended to show adequate performance in German and English on a number of tasks while mostly behaving well, that is, without rambling or mixing tokens from the different templates seen in training and adaptation.
It is mostly a quick test and is considerably weaker in German grammar and orthography than, for example, DiscoLM. For use cases where that matters less than instruction following, reasoning, and similar abilities, it might actually be slightly preferable.
It is a merge of the following models using LazyMergekit:
- flemmingmiguel/NeuDist-Ro-7B
- johannhartmann/Brezn3
- ResplendentAI/Flora_DPO_7B
- on the basis of mayflowergmbh/Wiedervereinigung-7b-dpo-laser
All credits are due to the creators of those original models and the training datasets involved.
For a suitable quantized version, try cstr/Spaetzle-v8-7b-GGUF
Evaluation
Open LLM Leaderboard Evaluation Results. Detailed results can be found here.
| Metric | Value |
|---|---|
| Avg. | 72.27 |
| AI2 Reasoning Challenge (25-Shot) | 68.69 |
| HellaSwag (10-Shot) | 86.68 |
| MMLU (5-Shot) | 64.60 |
| TruthfulQA (0-shot) | 64.05 |
| Winogrande (5-shot) | 81.45 |
| GSM8k (5-shot) | 68.16 |
EQ-Bench: 61.04 (German, v2_de) / 78.3 (English, v2)
ScandEval 12.5.2 scores
| Benchmark | Spaetzle-v8-7b Value |
|---|---|
| Model ID | cstr/Spaetzle-v8-7b (few-shot, val) |
| Parameters | 7242 |
| Vocabulary Size | 32 |
| Context | 32768 |
| Commercial | False |
| Speed | 5,980 ± 1,031 / 1,714 ± 552 |
| Rank | 1.85 |
| GermEval | 58.90 ± 2.30 / 45.55 ± 3.30 |
| SB10k | 61.34 ± 1.90 / 72.98 ± 1.30 |
| ScaLA-De | 31.58 ± 4.39 / 65.51 ± 2.23 |
| GermanQuAD | 24.91 ± 3.98 / 60.88 ± 3.31 |
| MLSum | 67.25 ± 1.06 / 22.95 ± 2.64 |
| MMLU-De | 34.62 ± 2.20 / 50.43 ± 1.52 |
| HellaSwag-De | 48.70 ± 2.47 / 61.05 ± 1.79 |
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| Spaetzle-v8-7b | 45.31 | 75.69 | 63.94 | 45.57 | 57.63 |
AGIEval
| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 25.59 | ± | 2.74 |
| | | acc_norm | 24.80 | ± | 2.72 |
| agieval_logiqa_en | 0 | acc | 39.63 | ± | 1.92 |
| | | acc_norm | 39.78 | ± | 1.92 |
| agieval_lsat_ar | 0 | acc | 23.48 | ± | 2.80 |
| | | acc_norm | 24.35 | ± | 2.84 |
| agieval_lsat_lr | 0 | acc | 50.98 | ± | 2.22 |
| | | acc_norm | 51.96 | ± | 2.21 |
| agieval_lsat_rc | 0 | acc | 62.08 | ± | 2.96 |
| | | acc_norm | 62.83 | ± | 2.95 |
| agieval_sat_en | 0 | acc | 78.64 | ± | 2.86 |
| | | acc_norm | 79.13 | ± | 2.84 |
| agieval_sat_en_without_passage | 0 | acc | 44.66 | ± | 3.47 |
| | | acc_norm | 44.66 | ± | 3.47 |
| agieval_sat_math | 0 | acc | 37.27 | ± | 3.27 |
| | | acc_norm | 35.00 | ± | 3.22 |
Average: 45.31%
GPT4All
| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| arc_challenge | 0 | acc | 63.14 | ± | 1.41 |
| | | acc_norm | 64.51 | ± | 1.40 |
| arc_easy | 0 | acc | 85.98 | ± | 0.71 |
| | | acc_norm | 82.49 | ± | 0.78 |
| boolq | 1 | acc | 88.10 | ± | 0.57 |
| hellaswag | 0 | acc | 66.31 | ± | 0.47 |
| | | acc_norm | 85.17 | ± | 0.35 |
| openbookqa | 0 | acc | 38.00 | ± | 2.17 |
| | | acc_norm | 47.20 | ± | 2.23 |
| piqa | 0 | acc | 83.35 | ± | 0.87 |
| | | acc_norm | 84.17 | ± | 0.85 |
| winogrande | 0 | acc | 78.22 | ± | 1.16 |
Average: 75.69%
TruthfulQA
| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| truthfulqa_mc | 1 | mc1 | 47.74 | ± | 1.75 |
| | | mc2 | 63.94 | ± | 1.53 |
Average: 63.94%
Bigbench
| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 56.84 | ± | 3.60 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 66.12 | ± | 2.47 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 41.47 | ± | 3.07 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 22.01 | ± | 2.19 |
| | | exact_str_match | 0.00 | ± | 0.00 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 31.40 | ± | 2.08 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 23.14 | ± | 1.60 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 56.00 | ± | 2.87 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 45.00 | ± | 2.23 |
| bigbench_navigate | 0 | multiple_choice_grade | 50.70 | ± | 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 70.05 | ± | 1.02 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 45.54 | ± | 2.36 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 26.05 | ± | 1.39 |
| bigbench_snarks | 0 | multiple_choice_grade | 71.82 | ± | 3.35 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 72.92 | ± | 1.42 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 44.20 | ± | 1.57 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 22.80 | ± | 1.19 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 18.23 | ± | 0.92 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 56.00 | ± | 2.87 |
Average: 45.57%
Average score: 57.63%
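The reported averages can be cross-checked with a little arithmetic. The sketch below assumes (this is inferred from the numbers, not stated in the card) that the AGIEval suite average is the plain mean of the `acc_norm` rows, and that the overall score is the unweighted mean of the four suite averages. The helper name `mean` is only illustrative.

```python
# Values copied from the tables above.
agieval_acc_norm = [24.80, 39.78, 24.35, 51.96, 62.83, 79.13, 44.66, 35.00]
suite_averages = {
    "AGIEval": 45.31,
    "GPT4All": 75.69,
    "TruthfulQA": 63.94,
    "Bigbench": 45.57,
}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: the suite score uses acc_norm for every AGIEval task.
agieval_average = mean(agieval_acc_norm)
# Assumption: the overall score weights the four suites equally.
overall_average = mean(suite_averages.values())

print(f"AGIEval (mean of acc_norm): {agieval_average:.4f}")
print(f"Overall (mean of suites):   {overall_average:.4f}")
```

Both results agree with the reported 45.31% and 57.63% to within rounding, which supports the two assumptions above without proving them.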
💻 Usage
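```python
# Notebook cell: install dependencies (drop the leading "!" in a regular shell)
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "cstr/Spaetzle-v8-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a prompt string using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and let accelerate place it on the available device(s)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion and print the generated text
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```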
🧩 Configuration
The model uses the ChatML prompt format and should work well with it, since it is merged from models that (mostly) saw ChatML templates in training.
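For readers unfamiliar with ChatML, here is a minimal sketch of how a conversation is laid out in that format. The helper name `to_chatml` and the system message are hypothetical, and the layout follows the general ChatML convention (`<|im_start|>` and `<|im_end|>` markers). The tokenizer's own `apply_chat_template` remains the authoritative source for this model's exact template.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    # The system message is an illustrative assumption, not taken from the model card
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a large language model?"},
]
prompt = to_chatml(messages)
print(prompt)
```

Building the string by hand is mainly useful for inspecting or debugging prompts; for actual inference, prefer the template shipped with the tokenizer.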
The merge was produced with the following configuration:

```yaml
models:
  - model: mayflowergmbh/Wiedervereinigung-7b-dpo-laser
    # no parameters necessary for base model
  - model: flemmingmiguel/NeuDist-Ro-7B
    parameters:
      density: 0.60
      weight: 0.30
  - model: johannhartmann/Brezn3
    parameters:
      density: 0.65
      weight: 0.40
  - model: ResplendentAI/Flora_DPO_7B
    parameters:
      density: 0.6
      weight: 0.3
merge_method: dare_ties
base_model: mayflowergmbh/Wiedervereinigung-7b-dpo-laser
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
```
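To give a rough intuition for what `density` and `weight` mean under `dare_ties`, the sketch below shows only the DARE step (random drop and rescale of a task vector) on toy numbers. It is a simplified illustration under stated assumptions, not mergekit's implementation: it handles a single fine-tuned model, omits the TIES sign-election step, and the function name `dare_drop_and_rescale` is hypothetical.

```python
import random


def dare_drop_and_rescale(delta, density, rng):
    """Keep each delta entry with probability `density`; rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


# Toy numbers standing in for a few parameters, not real model weights.
base = [0.10, -0.20, 0.30, 0.05]
finetuned = [0.15, -0.10, 0.25, 0.05]

# Task vector: what fine-tuning changed relative to the base model.
delta = [f - b for f, b in zip(finetuned, base)]

# Seeded RNG in the spirit of `random_seed: 0`; this is not mergekit's actual RNG.
rng = random.Random(0)
sparse_delta = dare_drop_and_rescale(delta, density=0.60, rng=rng)

# Add the sparsified, rescaled delta back onto the base, scaled by the model's weight.
weight = 0.30
merged = [b + weight * s for b, s in zip(base, sparse_delta)]

print(sparse_delta)
print(merged)
```

Rescaling by `1/density` keeps the expected value of each delta entry unchanged, which is why a substantial fraction of the fine-tuning deltas can be dropped without shifting the merged weights on average.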