Fine-tuning an LLM for Text-to-SQL Generation

I just completed a project that lets people ask database questions in plain English and get back proper SQL queries, powered by a fine-tuned large language model. For the base model, I chose Mistral-7B-v3 and fine-tuned it specifically for SQL generation. Using QLoRA for efficient training, I was able to train the 7-billion-parameter model on a single Nvidia Tesla P100 GPU in around 3 hours. The resulting model performs well on common SQL patterns such as filtering, joins, and aggregations, which cover the majority of real-world database queries. That said, it is less accurate on complex subqueries and deeply nested queries, a limitation of the 7B model size; a larger model would handle these cases better, but this was a tradeoff between accuracy and computational cost. ...

July 24, 2025 · 2 min · 235 words · Li Cao

Curiosity is (Almost) All You Need

The landscape of learning has been fundamentally transformed. In an era where Large Language Models can generate code and explain complex concepts, the traditional barriers to learning have largely disappeared. What remains—and what has become more important than ever—is curiosity.

The Great Democratization

Not too long ago, learning new technologies or skills required:

- Access to expensive courses or textbooks
- Mentorship from experienced practitioners
- Trial and error through countless hours of debugging
- Physical presence in classrooms or labs

Today, anyone with internet access can have a conversation with an AI that knows more about programming, mathematics, science, and virtually any field than most human experts. The means of learning are no longer the bottleneck—curiosity and the drive to learn are. ...

June 15, 2025 · 4 min · 825 words · Li Cao