Skip to contents

Enable LLM to draft and execute SQL queries on a database

Usage

answer_using_sql(
  prompt,
  add_text = paste0("You must code in SQL to answer this prompt.",
    " You must provide all SQL code between ```sql and ```.", "\n\n",
    "Never make assumptions about the possible values in the tables.\n",
    "Instead, execute SQL queries to retrieve information you need."),
  conn,
  list_tables = TRUE,
  describe_tables = TRUE,
  evaluate_code = FALSE,
  output_as_tool = FALSE,
  return_mode = c("full", "code", "object", "formatted_output", "llm_answer")
)

Arguments

prompt

A single string or a tidyprompt() object

add_text

Single string which will be added to the prompt text, informing the LLM that they must use SQL to answer the prompt

conn

A DBIConnection object to the SQL database

list_tables

Logical indicating whether to list tables available in the database in the prompt text

describe_tables

Logical indicating whether to describe the tables available in the database in the prompt text. If TRUE, the columns of each table will be listed

evaluate_code

Logical indicating whether to evaluate the SQL code. If TRUE, the SQL code will be executed on the database and the results will be returned. Use with caution, as this allows the LLM to execute arbitrary SQL code

output_as_tool

Logical indicating whether to return the output as a tool result. If TRUE, the output of the SQL query will be sent back to the LLM as a tool result. The LLM can then provide a final answer or try another query. This can continue until the LLM provides a final answer without any SQL code

return_mode

Character string indicating the return mode. Options are:

  • "full": Return a list containing the SQL code, output, and formatted output

  • "code": Return only the SQL code

  • "object": Return only the query result object

  • "formatted_output": Return the formatted output: a string detailing the SQL code and query result object.This is identical to how the LLM would see the output when output_as_tool is TRUE

  • "llm_answer": Return the LLM answer. If output as tool is TRUE, the return mode will always be "llm_answer" (since the LLM uses SQL to provide a final answer)

Value

A tidyprompt() with an added prompt_wrap() which will ensure that the LLM will use SQL to answer the prompt

Examples

if (FALSE) { # \dontrun{
  # Create an in-memory SQLite database
  conn <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Create a sample table of customers
  DBI::dbExecute(conn, "
  CREATE TABLE
    customers (
      id INTEGER PRIMARY KEY,
      name TEXT,
      email TEXT,
      country TEXT
    );
  ")

  # Insert some sample customer data
  DBI::dbExecute(conn, "
  INSERT INTO
    customers (name, email, country)
  VALUES
    ('Alice', 'alice@example.com', 'USA'),
    ('Bob', 'bob@example.com', 'Canada'),
    ('Charlie', 'charlie@example.com', 'UK'),
    ('Diana', 'diana@example.com', 'USA');
  ")

  # Create another sample table for orders
  DBI::dbExecute(conn, "
  CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    product TEXT,
    amount REAL,
    order_date TEXT,
    FOREIGN KEY(customer_id) REFERENCES customers(id)
  );
  ")

  # Insert some sample orders
  DBI::dbExecute(conn, "
  INSERT INTO
    orders (customer_id, product, amount, order_date)
  VALUES
    (1, 'Widget', 19.99, '2024-01-15'),
    (1, 'Gadget', 29.99, '2024-01-17'),
    (2, 'Widget', 19.99, '2024-02-10'),
    (3, 'SuperWidget', 49.99, '2024-03-05'),
    (4, 'Gadget', 29.99, '2024-04-01'),
    (1, 'Thingamajig', 9.99, '2024-04-02');
  ")

  # Ask LLM a question which it will answer using the SQL database:
  "Where are my customers from?" |>
    answer_using_sql(
      conn = conn,
      evaluate_code = TRUE,
      output_as_tool = TRUE
    ) |>
    send_prompt(llm_provider_openai())
  # --- Sending request to LLM provider (gpt-4o-mini): ---
  # Where are my customers from?
  #
  # You must code in SQL to answer this prompt. You must provide all SQL code
  # between ```sql and ```.
  #
  # Never make assumptions about the possible values in the tables.
  # Instead, execute SQL queries to retrieve information you need.
  #
  # These tables are available in the database:
  #   customers, orders
  #
  # Table descriptions:
  #   - customers
  # Columns: id, name, email, country
  #
  # - orders
  # Columns: order_id, customer_id, product, amount, order_date
  #
  # Your SQL query will be executed on the database. The results will be sent back
  # to you. After seeing the results, you can either provide a final answer or try
  # another SQL query. When you provide your final answer, do not include any SQL code.
  # --- Receiving response from LLM provider: ---
  # ```sql
  # SELECT DISTINCT country FROM customers;
  # ```
  # --- Sending request to LLM provider (gpt-4o-mini): ---
  # --- SQL code: ---
  # SELECT DISTINCT country FROM customers;
  #
  # --- Query results: ---
  #   country
  # 1     USA
  # 2  Canada
  # 3      UK
  # --- Receiving response from LLM provider: ---
  # Based on the query results, your customers are from the following countries:
  # USA, Canada, and UK.
  # [1] "Based on the query results, your customers are from the following countries:
  # USA, Canada, and UK."
} # }