RubyLLM 1.4.0: Structured Output, Custom Parameters, and Rails Generators


We're shipping 1.4.0! This release brings structured output that actually works, direct access to provider-specific parameters, and Rails generators that produce idiomatic Rails code.

🎯 Structured Output with JSON Schemas

The days of wrestling LLMs into returning valid JSON are over. We've added with_schema, which makes structured output as simple as defining what you want:

# Define your schema with the RubyLLM::Schema DSL
# First: gem install ruby_llm-schema
require 'ruby_llm/schema'

class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
  array :skills, of: :string
end

# Get perfectly structured JSON every time
chat = RubyLLM.chat.with_schema(PersonSchema)
response = chat.ask("Generate a Ruby developer profile")
# => {"name" => "Yukihiro", "age" => 59, "skills" => ["Ruby", "C", "Language Design"]}

No more prompt engineering gymnastics. Just schemas and results. Use the RubyLLM::Schema gem for the cleanest DSL, or provide raw JSON schemas if you prefer.
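
If you'd rather not pull in the extra gem, with_schema also accepts a raw JSON schema. A minimal sketch (the hash below is modeled on standard JSON Schema and is illustrative, not copied from the docs):

# Pass a plain JSON Schema hash instead of a RubyLLM::Schema class
person_schema = {
  type: "object",
  properties: {
    name:   { type: "string" },
    age:    { type: "integer" },
    skills: { type: "array", items: { type: "string" } }
  },
  required: ["name", "age", "skills"]
}

chat = RubyLLM.chat.with_schema(person_schema)
chat.ask("Generate a Ruby developer profile")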

🛠️ Direct Provider Access with with_params

Need to use that one provider-specific parameter? with_params gives you direct access:

# OpenAI's JSON mode
chat.with_params(response_format: { type: "json_object" })
    .ask("List Ruby features as JSON")

No more workarounds. Direct access to any parameter your provider supports.
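
The same call works for any other top-level request field. For instance, a sketch that assumes an OpenAI model (gpt-4o here) and its seed parameter for more reproducible sampling:

# Provider-specific knobs are merged straight into the request payload
chat = RubyLLM.chat(model: "gpt-4o")
chat.with_params(seed: 42)
    .ask("Suggest three names for a Ruby gem")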

🚄 Rails Generator That Produces Idiomatic Rails Code

From rails new to chatting with LLMs in under 5 minutes:

rails generate ruby_llm:install

This creates:

  • Migrations with proper Rails conventions
  • Models with acts_as_chat, acts_as_message, and acts_as_tool_call (sketched below)
  • A readable initializer with sensible defaults
  • Zero boilerplate, maximum convention
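
The generated models stay thin; roughly, they boil down to the acts_as_ helpers listed above (a sketch, not the exact generator output):

# app/models/chat.rb
class Chat < ApplicationRecord
  acts_as_chat
end

# app/models/message.rb
class Message < ApplicationRecord
  acts_as_message
end

# app/models/tool_call.rb
class ToolCall < ApplicationRecord
  acts_as_tool_call
end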

Your Chat model works exactly as you'd expect:

chat = Chat.create!(model: "gpt-4")
response = chat.ask("Explain Ruby blocks")

# Messages are automatically persisted with proper associations
# Tool calls are tracked, tokens are counted

🔍 Tool Call Transparency

The new on_tool_call callback lets you observe and log tool usage:

chat.on_tool_call do |tool_call|
  puts "🔧 AI is calling: #{tool_call.name}"
  puts "   Arguments: #{tool_call.arguments}"

  # Perfect for debugging and auditing
  Rails.logger.info "[AI Tool] #{tool_call.name}: #{tool_call.arguments}"
end

chat.with_tool(weather_tool).ask("What's the weather in Tokyo?")
# => 🔧 AI is calling: get_weather
#    Arguments: {"location": "Tokyo"}

🔌 Raw Response Access

Access the underlying Faraday response for debugging or advanced use cases:

response = chat.ask("Hello!")

# Access headers, status, timing
puts response.raw.headers["x-request-id"]
puts response.raw.status
puts response.raw.env.duration

🏭 GPUStack Support

Run models on your own hardware with GPUStack:

RubyLLM.configure do |config|
  config.gpustack_api_base = 'http://localhost:8080/v1'
  config.gpustack_api_key = 'your-key'
end

chat = RubyLLM.chat(model: 'qwen3', provider: 'gpustack')

🐛 Important Bug Fixes

  • Anthropic multiple tool calls now properly handled (was only processing the first tool)
  • Anthropic system prompts fixed to use plain text instead of JSON serialization
  • Message ordering in streaming responses is rock solid
  • Embedding arrays return consistent formats for single and multiple strings (see the sketch after this list)
  • URL attachments work properly without argument errors
  • Streaming errors handled correctly in both Faraday V1 and V2
  • JRuby officially supported and tested
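
On the embedding fix: passing an array now always yields one vector per input, even for a one-element array. A quick sketch of the expected shape (the vectors accessor is assumed from the embeddings API):

single = RubyLLM.embed(["hello"])
many   = RubyLLM.embed(["hello", "world"])

single.vectors.length # => 1 (an array holding one vector)
many.vectors.length   # => 2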

🎁 Enhanced Rails Integration

  • Message ordering guidance to prevent race conditions
  • Provider-specific configuration examples
  • Custom model name support with acts_as_ helpers
  • Improved generator output

Context isolation works seamlessly without global config pollution:

# Each request gets its own isolated configuration
tenant_context = RubyLLM.context do |config|
  config.openai_api_key = tenant.api_key
end

tenant_context.chat.ask("Process this tenant's request")
# Global configuration remains untouched

📚 Quality of Life Improvements

  • Removed 60MB of test fixture data
  • OpenAI base URL configuration in bin/console
  • Better error messages for invalid models
  • Enhanced Ollama documentation
  • More code examples throughout

Installation

Full backward compatibility is maintained: your existing code keeps working, and the new features are there when you need them.
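
Upgrading is the usual Bundler routine (the version constraint below is just a suggestion):

# Gemfile
gem 'ruby_llm', '~> 1.4'

Then run bundle update ruby_llm (or bundle install in a fresh app) and, in a Rails app, rails generate ruby_llm:install to get the generated models described above.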

Merged PRs

  • Add OpenAI base URL config to bin/console by @infinityrobot in #283
  • Reject models from Parsera that does not have :provider or :id by @K4sku in #271
  • Fix embedding return format inconsistency for single-string arrays by @finbarr in #267
  • Fix compatibility issue with URL attachments wrong number of arguments by @DustinFisher in #250
  • Add JRuby to CI test job by @headius in #255
  • Add provider specifying example to rails guide by @tpaulshippy in #233
  • More details for configuring Ollama by @jslag in #252
  • Remove 60 MB of the letter 'a' from spec/fixtures/vcr_cassettes by @compumike in #287
  • docs: add guide for using custom model names with acts_as helpers by @matheuscumpian in #171
  • Add RubyLLM::Chat#with_params to add custom parameters to the underlying API payload by @compumike in #265
  • Support gpustack by @graysonchen in #142
  • Update CONTRIBUTING.md by @graysonchen in #289
  • Fix handling of multiple tool calls in single LLM response by @finbarr in #241
  • Rails Generator for RubyLLM Models by @kieranklaassen in #75
  • Anthropic: Fix system prompt (use plain text instead of serialized JSON) by @MichaelHoste in #302
  • Provide access to raw response object from Faraday by @tpaulshippy in #304
  • Add Chat#on_tool_call callback by @bryan-ash in #299
  • Added proper handling of streaming error responses across both Faraday V1 and V2 by @dansingerman in #273
  • Add message ordering guidance to Rails docs by @crmne in #288

New Contributors

Full Changelog: 1.3.1...1.4.0
