Introduction

Thiggle is a set of specialized and structured LLM APIs.

Structured Completion APIs

Regex Completion API

Given a prompt and a regex pattern, produces a constrained LLM text generation. Useful for generating specific semantic structures, typed primitives, or templates. The output is always deterministic and will always match the regex pattern provided.

Regex Completion Quickstart

We'll use the prompt "Thiggle, the specialized and structured LLM API is an acronym for " and a regex pattern to generate the acronym. We'll also set max_new_tokens to 20 to limit the output to 20 tokens. The regex pattern that corresponds to a potential acronym is T[a-z]+ H[a-z]+ I[a-z]+ G[a-z]+ G[a-z]+ L[a-z]+ E[a-z]+. For a refresher on regex, [a-z] matches any lowercase letter, and + means one or more of the preceding token. So, T[a-z]+ matches any string that starts with a capital T and is followed by one or more lowercase letters.

curl -X POST "https://api.thiggle.com/v1/completion/regex" \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $THIGGLE_API_KEY" \
   -d '{
       "prompt": "Thiggle, the specialized and structured LLM API is an acronym for ",
       "pattern": "T[a-z]+ H[a-z]+ I[a-z]+ G[a-z]+ G[a-z]+ L[a-z]+ E[a-z]+",
       "max_new_tokens": 20
   }'
{
  "completion": "Transformers Hugging Inference Generally Greatly Libraries Engine",
  "tokens_generated": 15
}
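
To check a returned completion against the pattern on your side, Python's standard re module is enough. The sketch below is illustrative and makes no API call: it reuses the pattern and the sample completion from this quickstart, and the helper name matches_pattern is hypothetical.

```python
import re

# Pattern and sample completion copied from the quickstart above.
PATTERN = r"T[a-z]+ H[a-z]+ I[a-z]+ G[a-z]+ G[a-z]+ L[a-z]+ E[a-z]+"
completion = "Transformers Hugging Inference Generally Greatly Libraries Engine"


def matches_pattern(text: str, pattern: str = PATTERN) -> bool:
    """Return True only if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# The first letter of each word should spell out the acronym.
initials = "".join(word[0] for word in completion.split())

print(matches_pattern(completion))        # True
print(matches_pattern("Thiggle is fun"))  # False
print(initials)                           # THIGGLE
```

re.fullmatch is used rather than re.search so that trailing or leading text outside the pattern counts as a mismatch.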

Context-Free Grammar Completion API

Given a prompt and a context-free grammar, produces a constrained LLM text generation. Useful for generating specific semantic structures, typed primitives, or templates. The output is always deterministic and will always match the context-free grammar provided.

Context-Free Grammar Completion Quickstart

In this example, we'll define an LALR grammar in Lark format for specifying palindromes (strings that read the same backwards and forwards).

(Recall from computational theory that regular expressions alone cannot describe palindromes of arbitrary length.)

import os
import thiggle as tg
 
# Create an API object with your API key
api = tg.API(os.getenv("THIGGLE_API_KEY"))
grammar = """
start: palindrome
palindrome: 	letter
		| "a" palindrome "a"
		| "b" palindrome "b"
		| "c" palindrome "c"
		| "d" palindrome "d"
		| "e" palindrome "e"
		| "f" palindrome "f"
		| "g" palindrome "g"
		| "h" palindrome "h"
		| "i" palindrome "i"
		| "j" palindrome "j"
		| "k" palindrome "k"
		| "l" palindrome "l"
		| "m" palindrome "m"
		| "n" palindrome "n"
		| "o" palindrome "o"
		| "p" palindrome "p"
		| "q" palindrome "q"
		| "r" palindrome "r"
		| "s" palindrome "s"
		| "t" palindrome "t"
		| "u" palindrome "u"
		| "v" palindrome "v"
		| "w" palindrome "w"
		| "x" palindrome "x"
		| "y" palindrome "y"
		| "z" palindrome "z"
 
letter: "a".."z"
"""
prompt = "Generate a palindrome: "
response = api.cfg_completion(prompt, grammar, max_new_tokens=15)
print(response)
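
To sanity-check a completion locally, the palindrome rule can be mirrored as a small recursive function. This is a minimal sketch that does not call the API, and it assumes the grammar has a wrapping rule for every lowercase letter; the function name derivable is hypothetical. Because the base case is a single letter, the grammar as written only derives odd-length palindromes, and the sketch reflects that.

```python
import string

LETTERS = set(string.ascii_lowercase)


def derivable(text: str) -> bool:
    """Check whether text can be derived from the palindrome rule."""
    # Base case: the `letter` rule, a single lowercase letter.
    if len(text) == 1:
        return text in LETTERS
    # Recursive case: "x" palindrome "x" for the same lowercase letter x.
    if len(text) >= 3 and text[0] == text[-1] and text[0] in LETTERS:
        return derivable(text[1:-1])
    # Empty strings, two-letter strings, and mismatched ends are rejected.
    return False


print(derivable("racecar"))  # True
print(derivable("abba"))     # False (even length has no single-letter center)
```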

The Categorization API

Thiggle's categorization API is a simple API that uses LLMs to categorize anything based on a prompt. It always returns structured JSON, and only returns the categories that you provide. Some example use cases include AI agents, multiple-choice questions, labeling training data, and sentiment analysis.

Categorization Quickstart

Thiggle ships with clients in Python, Go, and TypeScript, but it's also easy to use with cURL. You can get your API key from the dashboard.

curl -X POST "https://api.thiggle.com/v1/categorize" \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $THIGGLE_API_KEY" \
   -d '{
       "prompt": "What animal barks?",
       "categories": ["Dog", "Cat", "Bird", "Fish"]
   }'
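
The same request can be assembled with Python's standard library alone, which may help when the official client is not installed. This is a sketch, not the Thiggle client: the URL, headers, and body fields are copied from the cURL call above, while the helper name build_categorize_request is hypothetical. The network call is left commented out so the snippet runs offline.

```python
import json
import os
import urllib.request

API_URL = "https://api.thiggle.com/v1/categorize"


def build_categorize_request(prompt, categories, api_key):
    """Build (but do not send) a POST request mirroring the cURL example."""
    payload = json.dumps({"prompt": prompt, "categories": categories})
    return urllib.request.Request(
        API_URL,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_categorize_request(
    "What animal barks?",
    ["Dog", "Cat", "Bird", "Fish"],
    os.getenv("THIGGLE_API_KEY", ""),
)

# Uncomment to send the request (requires a valid API key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```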

Examples

AI Agents

Use the categorization API to choose the relevant tools to complete the task. Use this as a reliable building block for higher-order AI agents. Never worry about the API returning extraneous text or unknown categories that break your agent.

{
  "prompt": "What tools do I need to complete the following task? Task: find the best restaurant in San Francisco. Tools:",
  "categories": [
    "google-maps-api",
    "python-repl",
    "calculator",
    "yelp-api",
    "ffmpeg"
  ]
}
{
  "choices": ["google-maps-api", "yelp-api"]
}
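
One way an agent might consume this response is to check the returned choices against the tool list and then dispatch each one. The sketch below works on the request and response shown above without calling the API; run_tool and select_tools are hypothetical helpers, and the membership check is a defensive assumption rather than something the API requires.

```python
# Categories sent in the request and the response shown above.
categories = ["google-maps-api", "python-repl", "calculator", "yelp-api", "ffmpeg"]
response = {"choices": ["google-maps-api", "yelp-api"]}


def run_tool(name: str, task: str) -> str:
    """Hypothetical tool handler; a real agent would invoke the tool here."""
    return f"{name} handled: {task}"


def select_tools(response: dict, categories: list) -> list:
    """Return the chosen tools, failing loudly on any unknown name."""
    choices = response.get("choices", [])
    unknown = [choice for choice in choices if choice not in categories]
    if unknown:
        raise ValueError(f"unexpected categories: {unknown}")
    return choices


task = "find the best restaurant in San Francisco"
results = [run_tool(name, task) for name in select_tools(response, categories)]
print(results)
```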

Multiple-Choice Questions

Answer multiple-choice questions. For questions with more than one correct answer, use the allow_multiple_classes flag.

{
  "prompt": "What animals have four legs?",
  "categories": ["cat", "dog", "bird", "fish", "elephant", "snake"],
  "allow_multiple_classes": true
}
{
  "choices": ["cat", "dog", "elephant"]
}

Labeling Training Data

You can use the categorization API to label text for training data. For example, you could use it to label text for a text classifier. This example bins text into different buckets: ['finance', 'sports', 'politics', 'science', 'technology', 'entertainment', 'health', 'other'].

{
  "prompt": "What category does this text belong to? Text: The Dow Jones Industrial Average fell 200 points on Monday.",
  "categories": [
    "finance",
    "sports",
    "politics",
    "science",
    "technology",
    "entertainment",
    "health",
    "other"
  ]
}
{
  "choices": ["finance"]
}
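
Labeling a whole dataset is a loop over this request. The sketch below shows one possible shape: it builds the same payload for each text and takes the function that performs the call as a parameter, so it runs offline with a stand-in. The names build_payload, label_texts, and fake_categorize are hypothetical, and falling back to "other" when no choice comes back is an assumption, not documented API behavior.

```python
from typing import Callable, List, Tuple

CATEGORIES = ["finance", "sports", "politics", "science",
              "technology", "entertainment", "health", "other"]


def build_payload(text: str) -> dict:
    """Build the request body in the same shape as the example above."""
    return {
        "prompt": f"What category does this text belong to? Text: {text}",
        "categories": CATEGORIES,
    }


def label_texts(texts: List[str],
                categorize: Callable[[dict], dict]) -> List[Tuple[str, str]]:
    """Return (text, label) rows; assumes "other" when nothing is chosen."""
    rows = []
    for text in texts:
        choices = categorize(build_payload(text)).get("choices", [])
        rows.append((text, choices[0] if choices else "other"))
    return rows


def fake_categorize(payload: dict) -> dict:
    """Stand-in for the real API call so the sketch runs without a key."""
    if "Dow Jones" in payload["prompt"]:
        return {"choices": ["finance"]}
    return {"choices": []}


rows = label_texts(
    ["The Dow Jones Industrial Average fell 200 points on Monday."],
    fake_categorize,
)
print(rows)
```

Passing the API call in as a function keeps the labeling loop easy to test and lets you swap in whichever client you use.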

Sentiment Analysis

Classify any text into sentiment classes.

{
  "prompt": "Is this a positive or negative review of Star Wars? The more one sees the main characters, the less appealing they become. Luke Skywalker is a whiner, Han Solo a sarcastic clod, Princess Leia a nag, and C-3PO just a drone.",
  "categories": ["positive", "negative"]
}
{
  "choices": ["negative"]
}

Use any sentiment categories you like. For example, you could use ["positive", "neutral", "negative"], ["positive", "negative", "very positive", "very negative"], or even ["happy", "sad", "angry", "surprised", "disgusted", "fearful"].