Instructions to use girishmanure/Qwen2.5-1.5B-SQL-Assistant with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use girishmanure/Qwen2.5-1.5B-SQL-Assistant with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct") model = PeftModel.from_pretrained(base_model, "girishmanure/Qwen2.5-1.5B-SQL-Assistant") - Notebooks
- Google Colab
- Kaggle
Qwen2.5-1.5B-SQL-Assistant
This is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct for text-to-SQL generation, trained using the b-mc2/sql-create-context dataset.
Model Details
- Base Model:
Qwen/Qwen2.5-1.5B-Instruct - Fine-tuning Method: LoRA (PEFT)
- Dataset:
b-mc2/sql-create-context - Task: Text-to-SQL Generation
How to Use
Installation
pip install transformers peft trl accelerate bitsandbytes datasets torch
Example Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"
FINE_TUNED_MODEL_NAME = "YOUR_HF_USERNAME/Qwen2.5-1.5B-SQL-Assistant" # Replace YOUR_HF_USERNAME
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
MODEL_ID,
quantization_config=bnb_config,
device_map="auto",
trust_remote_code=True
)
# Load fine-tuned adapters
model = PeftModel.from_pretrained(base_model, FINE_TUNED_MODEL_NAME)
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(FINE_TUNED_MODEL_NAME, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
def generate_sql(schema: str, question: str) -> str:
system = (
"You are a SQL expert. Given a context and a question, "
"reply with ONLY the SQL query, nothing else."
)
user = f"Schema:
{schema}
Question: {question}"
messages = [
{"role": "system", "content": system},
{"role": "user", "content": user},
]
text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text_input, return_tensors="pt").to(model.device)
model.eval()
outputs = model.generate(
**inputs,
max_new_tokens=100,
do_sample=True,
top_p=0.9,
temperature=0.7,
pad_token_id=tokenizer.eos_token_id
)
input_len = inputs["input_ids"].shape[-1]
generated_text = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True, cleanup_tokenization_spaces=True)
return generated_text.strip()
# Example usage:
context = "CREATE TABLE employees (id INT, name TEXT, department TEXT, salary INT);"
question = "List the names of employees in the Engineering department earning more than 100000."
sql_query = generate_sql(context, question)
print(f"Generated SQL:
{sql_query}")
Training Details
This model was fine-tuned for 1 epoch on a subset of the b-mc2/sql-create-context dataset. Quantization (4-bit) and LoRA were used to efficiently train the model.
Evaluation
BERTScore was used to evaluate the semantic similarity between generated and reference SQL queries.
Inference Example
Here's an example of an inference query using the fine-tuned model:
Input Schema (DDL)
CREATE TABLE buyers (
cid SERIAL PRIMARY KEY,
FNAME VARCHAR(100) NOT NULL,
SURNAME VARCHAR(100) NOT NULL,
email VARCHAR(255) NOT NULL UNIQUE,
TELEPHONE VARCHAR(25),
modify_date TIMESTAMP NOT NULL DEFAULT NOW()
);
CREATE TABLE order_header (
oid SERIAL PRIMARY KEY,
cid INTEGER NOT NULL,
oe_number VARCHAR(50) NOT NULL UNIQUE,
order_entry_date DATE NOT NULL DEFAULT CURRENT_DATE,
order_status VARCHAR(50) NOT NULL DEFAULT 'pending',
order_total_amount NUMERIC(12, 2) NOT NULL DEFAULT 0,
modify_date TIMESTAMP NOT NULL DEFAULT NOW(),
CONSTRAINT fk_order_header_customer
FOREIGN KEY (cid) REFERENCES buyers(cid)
ON UPDATE CASCADE
ON DELETE RESTRICT
);
CREATE TABLE order_detail (
order_detail_id SERIAL PRIMARY KEY,
oid INTEGER NOT NULL,
line_number INTEGER NOT NULL,
catalog_sku VARCHAR(100) NOT NULL,
description TEXT,
quantity INTEGER NOT NULL CHECK (quantity > 0),
unit_price NUMERIC(10, 2) NOT NULL CHECK (unit_price >= 0),
line_item_total_amount NUMERIC(12, 2) GENERATED ALWAYS AS (quantity * unit_price) STORED,
modify_date TIMESTAMP NOT NULL DEFAULT NOW(),
CONSTRAINT fk_order_detail_order
FOREIGN KEY (oid) REFERENCES order_header(oid)
ON UPDATE CASCADE
ON DELETE CASCADE,
CONSTRAINT uq_order_detail_line UNIQUE (oid, line_number)
);
Input Question
Find order details for buyer with FNAME "Girish" and SURNAME with "Manure"?
Generated SQL
SELECT *
FROM order_detail
WHERE oid IN
(SELECT oid FROM order_header WHERE cid IN
(SELECT cid FROM buyers WHERE firstname = "Girish" AND surname = "Manure") ORDER BY oid ASC LIMIT 1);
- Downloads last month
- 47