PDL Engine Schema Examples and Constraints

4. If the schema is `tasklist`, it looks for a [markdown task list](https://github.github.com/gfm/#task-list-items-extension-). It extracts the task list from the LLM output and converts the list to a string value, `serde_json::Value::String` in Rust.

## Examples

### An image

```
<|user|>

<|media(assets/sample_image.png)|>

What do you see in the image?
```

When you run this pdl file with a multi-modal model, the model will tell you what it sees in the image.

### A simple schema 1: boolean

```
<|schema|>

bool

<|user|>

Is Rust a strictly typed programming language? Just say "true" or "false".
```

If the model's response contains neither "true" nor "false", the engine automatically asks the model to "just say true or false".

### A simple schema 2: yes/no

```
<|schema|>

yesno

<|user|>

Is Rust a strictly typed programming language? Just say yes or no.
```

It's like a boolean schema, but it makes the model say yes or no. This sometimes works better, since most models are better at English than at JSON.

### A simple schema 3: code

```
<|schema|>

code

<|user|>

Write me a Python code that calculates an inverse of a matrix. Please wrap your code with 3 backticks, using markdown's fenced-code-block syntax.
```

The pdl engine will try to find a fenced code block in the LLM's output. If it finds one, the engine extracts the code, without the fences, and returns it. If it cannot find one, it asks the LLM to stick to the markdown syntax.

### A schema 1: array

```
<|schema|>

[int { min: 1, max: {{documents | length}} }]

<|user|>

Below is a list of documents. Choose documents that are related to {{topic}}. You can select an arbitrary number of documents. Your output has to be in a json format, an array of integers. If no documents are relevant, just give me an empty array.

{% for document in documents %}
{{loop.index}}. {{document}}
{% endfor %}
```

This is a more useful example. The pdl engine will try to find an array in the LLM's output. If there's none, it asks the LLM to provide valid JSON. If there's an array but it doesn't match the schema, the engine tells the LLM what's wrong and how it should be fixed.

You can also give extra constraints. `{ min: 1, max: {{documents | length}} }` after `int` means that each integer has to be at least 1 and at most `{{documents | length}}`. Note that the Tera engine will convert `{{documents | length}}` to an actual length BEFORE the schema parser runs.

The prompt uses several Tera values. You have to provide `{{documents}}` and `{{topic}}` to the pdl engine. The engine will fill the blanks with proper values.

### A schema 2: object

```
<|schema|>

[{ name: string, age: integer }]{ min: {{num_students}}, max: {{num_students}} }

<|user|>

Below is a csv file of the students of Ragit Highschool. I want you to convert it to a json array, where the schema is `[{ "name": string, "age": integer }]`. Make sure that the array includes all the {{num_students}} students.

{{csv_data}}
```

This is a simple prompt that converts a CSV file to a JSON array. The schema makes sure that all the elements are JSON objects, where each object has two fields: "name" and "age". It also checks the types of the fields. I added a constraint to the array, so that it includes all the students.
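To make the array check from "A schema 1: array" more concrete, here is a rough Python sketch of the idea: find an array in the response, then check every element against the `min`/`max` constraint and produce feedback that could be sent back to the model. This is only an illustration, not the engine's actual code. The function names `find_json_array` and `check_int_array` and the wording of the feedback messages are made up for this example.

```python
import json
import re


def find_json_array(text):
    """Return the first parseable JSON array embedded in `text`, or None."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\[", text):
        try:
            # Try to decode a JSON value starting at this "[".
            value, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(value, list):
            return value
    return None


def check_int_array(text, min_value, max_value):
    """Check `text` against a schema like `[int { min: .., max: .. }]`.

    Returns (values, None) on success, or (None, feedback) on failure,
    where `feedback` is a hypothetical message for the model.
    """
    values = find_json_array(text)
    if values is None:
        return None, "I cannot find a JSON array in your response."
    for index, value in enumerate(values):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return None, f"Element {index} is {value!r}, but it has to be an integer."
        if not min_value <= value <= max_value:
            return None, (
                f"Element {index} is {value}, but it has to be between "
                f"{min_value} and {max_value}."
            )
    return values, None
```

With five documents, `check_int_array(response, 1, 5)` would accept a response such as `The relevant ones are [1, 3].` and reject `[0, 2]`, because 0 is below the minimum. An array-level constraint like the one in "A schema 2: object" could be checked in the same way by comparing `len(values)` with its own `min` and `max`.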

The PDL engine supports several schema types, including boolean, yes/no, code, tasklist, array, and object, with optional constraints such as `min` and `max`. The examples above show how to use these schema types for image input, code extraction, and data conversion from CSV to JSON.
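The simple schemas share one pattern: look for a value in the response, and ask the model again if it is missing. Here is a minimal Python sketch of that pattern, not the engine's real implementation. Everything named here (`ask_with_schema`, `extract_bool`, `extract_code`, `max_retries`, and the reminder text) is hypothetical. Treating a response that contains both "true" and "false" as invalid is also an assumption.

```python
import re

FENCE = "`" * 3  # three backticks, i.e. a markdown code fence


def extract_bool(text):
    """Return True/False if the text says exactly one of them, else None."""
    found = set(re.findall(r"\b(true|false)\b", text.lower()))
    if len(found) == 1:
        return found.pop() == "true"
    return None  # neither, or both (assumed to be ambiguous)


def extract_code(text):
    """Return the body of the first fenced code block, without the fences."""
    pattern = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    return match.group(1) if match else None


def ask_with_schema(ask_model, prompt, extract, reminder, max_retries=2):
    """Call `ask_model` until `extract` finds a value in its response.

    `ask_model` is any callable that takes the message history (a list of
    strings) and returns the model's next response as a string.
    """
    messages = [prompt]
    for _ in range(max_retries + 1):
        response = ask_model(messages)
        value = extract(response)
        if value is not None:
            return value
        # Keep the failed response in the history and remind the model.
        messages += [response, reminder]
    raise ValueError("the model never produced a valid response")
```

For the boolean example, a call could look like `ask_with_schema(model, prompt, extract_bool, "Just say true or false.")`. The check is `value is not None` rather than a truthiness test, so a valid `False` answer is returned instead of triggering a retry.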