Instructions to use daichira/test-lora-repo with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use daichira/test-lora-repo with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit") model = PeftModel.from_pretrained(base_model, "daichira/test-lora-repo") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Unsloth Studio
How to use daichira/test-lora-repo with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for daichira/test-lora-repo to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for daichira/test-lora-repo to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for daichira/test-lora-repo to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="daichira/test-lora-repo", max_seq_length=2048, )
lora_structeval_t_qwen3_4b_0118
This repository provides a LoRA adapter fine-tuned from unsloth/Qwen3-4B-Instruct-2507 using Unsloth (QLoRA, 4-bit base).
- Contents: LoRA adapter weights (PEFT) + tokenizer files (if present)
- Does not include: Base model weights, training dataset files
Training Objective
This adapter was trained for structured output quality (format conversion / structured serialization) while avoiding learning verbose chain-of-thought.
Loss design
- The model sees the full conversation context (system + user + assistant).
- Loss is applied only to the final assistant turn ("assistant-only loss").
- Additionally, when an Output marker is present, loss is applied only to:
OUTPUT_LEARN_MODE="after_marker"- Markers searched:
Output:, OUTPUT:, Final:, Answer:, Result:, Response: - With
MASK_COT=Truethis typically means learning the content afterOutput:(suppressing CoT-style "Approach:" text).
This setup is intended to improve final answer correctness and formatting without encouraging the model to emit chain-of-thought.
Training Configuration (Key)
- Run stamp (UTC):
2026-01-18_062458Z - Base model:
unsloth/Qwen3-4B-Instruct-2507 - Dataset:
u-10bei/structured_data_with_cot_dataset_512_v2 - Method: QLoRA (4-bit base) + LoRA adapters (PEFT)
- Max sequence length:
512 - Seed:
3407 - Train/Val split: val_ratio=0.05
Hyperparameters
- Epochs:
2 - LR:
0.0001 - Warmup ratio:
0.1 - Weight decay:
0.05 - Per-device train batch:
2 - Gradient accumulation:
8(effective batch ≈16) - LR scheduler: cosine
- Precision: fp16 (T4-friendly)
LoRA
- r:
64 - alpha:
128 - dropout:
0.0 - target_modules:
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Prompt / Output Style (Dataset-aligned)
The training dataset uses chat messages and often includes a short reasoning header followed by a final structured output. With the default masking setup, the adapter is optimized primarily for the final structured segment.
Typical assistant response shape:
Approach:(may be present, but often masked from loss)Output:(structured data begins here; primary training target)
You can encourage concise responses by explicitly requesting: "Return ONLY the final structured output."
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base_id = "unsloth/Qwen3-4B-Instruct-2507"
adapter_id = "daichira/lora_structeval_t_qwen3_4b_0118"
tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
base_id,
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
# Example: run generation with your preferred chat template usage.
Limitations / Notes
This is a LoRA adapter, not a standalone model. You must load
unsloth/Qwen3-4B-Instruct-2507separately.Format correctness depends on your decoding settings and prompt discipline. For strict tasks, consider:
temperature=0(or low),top_p=1.0- Post-validation (JSON/YAML/TOML/XML parsers) where applicable
The adapter is specialized for structured serialization/format conversion; it may not improve general chat ability.
Sources & Terms (IMPORTANT)
- Training dataset:
u-10bei/structured_data_with_cot_dataset_512_v2(referenced on Hugging Face Hub) - This repository contains LoRA adapter weights only and does not redistribute the training dataset.
- You are responsible for complying with:
- The dataset license/terms as stated in the dataset repository.
- The base model license/terms for
unsloth/Qwen3-4B-Instruct-2507(these apply to derivatives/adapters as well).
License
- Adapter repo license field:
other(model card metadata) - Important: Base model terms for
unsloth/Qwen3-4B-Instruct-2507apply. Dataset terms foru-10bei/structured_data_with_cot_dataset_512_v2apply.
- Downloads last month
- -
Model tree for daichira/test-lora-repo
Base model
Qwen/Qwen3-4B-Instruct-2507