J. Rogers, SE Ohio
1. Executive Summary
2. The Import Pipeline: A Multi-Stage Process
OCR Text: The GM provides the raw text output from an OCR scan of the adventure module.
Map Tagging: The GM loads a map PNG into a special view in our tool, then clicks on each numbered room or area on the image. For each click, they enter the corresponding number (e.g., click on the area labeled "1", type "1"). This creates a simple mapping: {"1": [x, y], "2": [x2, y2], ...}. This is the most critical step, as it provides the spatial context the AI lacks.
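As a rough illustration of the Map Tagging output, the sketch below records clicks into the mapping described above. The function name `record_tag` and the sample pixel coordinates are assumptions made for this example; the real tool's view code is not shown in this post.

```python
import json

def record_tag(tags, room_number, x, y):
    """Store one map click under the room number the GM typed (hypothetical helper)."""
    key = str(room_number).strip()
    if not key:
        raise ValueError("room number must not be empty")
    # Room numbers are kept as strings so labels such as "13a" would also work.
    tags[key] = [int(x), int(y)]
    return tags

tags = {}
record_tag(tags, "1", 120, 340)   # GM clicks the area labeled "1"
record_tag(tags, "2", 415, 338)   # GM clicks the area labeled "2"
print(json.dumps(tags))
```

Keeping the keys as strings means they can be matched directly against the room identifiers that come back from the AI later in the pipeline.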
The Caller: AdventureImporter class.
The Prompt's Goal: "Chunk the document." The AI is asked to identify the main sections of the text (e.g., "Introduction," "Room 1: The Grand Hall," "Room 13: The Lich's Phylactery," "Appendix: New Monsters") and return a JSON object that splits the raw text into these logical blocks.
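The chunking step might be sketched as follows. The prompt wording, the `{"sections": [...]}` response shape, and the helper names `build_chunk_prompt`, `parse_chunks`, and `room_id_from_title` are all assumptions for illustration; the post only says the AI returns a JSON object that splits the text into sections.

```python
import json
import re

# Assumed prompt wording; the exact text used by AdventureImporter is not given.
CHUNK_PROMPT = (
    "You are a document-chunking tool for a fantasy TTRPG adventure module. "
    "Split the text below into its main sections and return a single, valid "
    'JSON object of the form {"sections": [{"title": "...", "text": "..."}]} '
    "with no other text.\n\nText:\n"
)

def build_chunk_prompt(ocr_text):
    """Append the raw OCR text to the chunking instructions."""
    return CHUNK_PROMPT + ocr_text

def parse_chunks(response_text):
    """Turn the model's JSON reply into a {title: text} dictionary."""
    data = json.loads(response_text)
    sections = data.get("sections") if isinstance(data, dict) else None
    if not isinstance(sections, list):
        raise ValueError("expected a JSON object with a 'sections' list")
    return {s["title"]: s["text"] for s in sections}

def room_id_from_title(title):
    """Return "13" for "Room 13: ...", or None for non-room sections."""
    match = re.match(r"Room\s+(\w+)", title)
    return match.group(1) if match else None
```

The room identifier pulled from the section title is what lets the next stage look up coordinates from the Map Tagging step.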
The Context: The Importer grabs the text for "Room 1: The Grand Hall" and looks up the coordinates for "1" from the Map Tagging step.
The Prompt's Goal: "Extract structured data for this specific room." The prompt is highly specific:
"You are a data extraction tool for a TTRPG. Analyze the following text for Room 1. The room is located at map coordinates [x, y]. Extract the following information and return it as a single, valid JSON object with no other text.
Text: [... text for Room 1 ...]
JSON Schema to use:
{
  "room_id": "1",
  "title": "The Grand Hall",
  "description": "The visual and atmospheric description of the room.",
  "monsters": [ { "name": "Goblin Archer", "count": 2 } ],
  "traps": [ { "name": "Pit Trap", "dc": 15, "description": "..." } ],
  "treasure": [ "35 gold pieces", "a Potion of Healing" ],
  "portals": [ { "to_room_id": "3", "description": "a locked oak door" } ]
}"
The Dumb Pipe: The AIManager sends this prompt and gets the JSON back.
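A minimal sketch of this stage, assuming the prompt is assembled with an f-string and the reply is checked against the keys of the schema above. The names `build_room_prompt`, `parse_room_json`, and `REQUIRED_KEYS` are hypothetical, and the check is deliberately shallow (top-level keys only).

```python
import json

# Top-level keys taken from the JSON schema in the prompt.
REQUIRED_KEYS = {"room_id", "title", "description",
                 "monsters", "traps", "treasure", "portals"}

SCHEMA_EXAMPLE = json.dumps({
    "room_id": "1",
    "title": "The Grand Hall",
    "description": "The visual and atmospheric description of the room.",
    "monsters": [{"name": "Goblin Archer", "count": 2}],
    "traps": [{"name": "Pit Trap", "dc": 15, "description": "..."}],
    "treasure": ["35 gold pieces", "a Potion of Healing"],
    "portals": [{"to_room_id": "3", "description": "a locked oak door"}],
}, indent=2)

def build_room_prompt(room_id, coords, room_text):
    """Fill the per-room extraction prompt with the room text and tagged coordinates."""
    x, y = coords
    return (
        "You are a data extraction tool for a TTRPG. "
        f"Analyze the following text for Room {room_id}. "
        f"The room is located at map coordinates [{x}, {y}]. "
        "Extract the following information and return it as a single, "
        "valid JSON object with no other text.\n"
        f"Text: {room_text}\n"
        f"JSON Schema to use:\n{SCHEMA_EXAMPLE}"
    )

def parse_room_json(response_text):
    """Parse the model's reply and reject it if any top-level key is missing."""
    room = json.loads(response_text)
    if not isinstance(room, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(room)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return room
```

Rejecting incomplete replies here gives the Importer a natural point to retry the call before anything is written to the database.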
The Importer collects the JSON object for every room. It then creates the necessary records in the database:
A new node of type dungeon_level is created. Its metadata gets the "Introduction" text.
For each room's JSON, a marker is created on that node at the tagged coordinates ([x, y]).
The marker.title is set to the room's title.
The marker.description is set to the description.
The rest of the data (monsters, traps, treasure, portals) is stored in the marker.metadata field.
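The assembly step could look roughly like this. The `Node` and `Marker` dataclasses are in-memory stand-ins invented for this example, not the real database models, and storing `room_id` in the marker metadata is an added assumption that makes later link validation easier.

```python
from dataclasses import dataclass, field

@dataclass
class Marker:
    title: str
    description: str
    x: int
    y: int
    metadata: dict = field(default_factory=dict)

@dataclass
class Node:
    type: str
    metadata: dict = field(default_factory=dict)
    markers: list = field(default_factory=list)

def assemble_level(intro_text, rooms, tags):
    """Build one dungeon_level node with a marker per extracted room."""
    node = Node(type="dungeon_level", metadata={"introduction": intro_text})
    for room in rooms:
        x, y = tags[room["room_id"]]  # coordinates from the Map Tagging step
        extra = {key: room.get(key, [])
                 for key in ("monsters", "traps", "treasure", "portals")}
        extra["room_id"] = room["room_id"]  # assumption: kept for link checks
        node.markers.append(
            Marker(room["title"], room["description"], x, y, extra)
        )
    return node
```

A room whose number was never tagged on the map raises a `KeyError` here, which surfaces gaps in the manual tagging step rather than silently dropping the room.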
3. Key Challenges & Solutions
Challenge: OCR errors ("Ghola" instead of "Ghoul").
Solution: LLMs can usually correct such errors from context. The prompt will specify a fantasy setting, which helps the model infer the intended terms.
Challenge: Inconsistent formatting in adventure modules (e.g., Monsters:, Creatures Found:, or just bold text).
Solution: This is where the LLM excels over traditional regex. The prompt will instruct the AI to identify entities semantically ("look for creatures mentioned in this section") rather than relying on specific keywords.
Challenge: Linking rooms correctly ("a secret door leads to area 13").
Solution: The "portals" key in our JSON schema explicitly asks the AI to identify these connections. During the final assembly stage, our Importer can validate these links, for example by checking that a marker for "13" actually exists.
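The link validation mentioned above can be sketched as a single pass over the extracted rooms. The function name `find_broken_portals` is hypothetical; it assumes each room dictionary follows the JSON schema from the extraction prompt.

```python
def find_broken_portals(rooms):
    """Return (from_room_id, to_room_id) pairs whose target room was never extracted."""
    known = {room["room_id"] for room in rooms}
    broken = []
    for room in rooms:
        for portal in room.get("portals", []):
            target = str(portal.get("to_room_id"))
            if target not in known:
                broken.append((room["room_id"], target))
    return broken

rooms = [
    {"room_id": "1", "portals": [{"to_room_id": "3", "description": "a locked oak door"}]},
    {"room_id": "3", "portals": [{"to_room_id": "13", "description": "a secret door"}]},
]
print(find_broken_portals(rooms))
```

In this sample, room "3" points at area "13", which has no extracted room, so the pair ("3", "13") is reported and the GM can review it.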
4. Conclusion
The design comes down to three pieces:
A "dumb pipe" AIManager that just handles the AI call.
A new, highly specialized "caller" (AdventureImporter) that owns the complex, multi-stage prompt logic.
A flexible database schema (nodes and markers with JSON metadata) that can store the extracted, structured data without modification.
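To make the split between the two classes concrete, here is a minimal sketch. The constructor arguments, the `complete` and `extract_room` method names, and the injected `send` callable are assumptions for illustration; only the class names AIManager and AdventureImporter come from the design itself.

```python
import json
from typing import Callable

class AIManager:
    """Dumb pipe: forwards a prompt to the model and returns the raw reply."""

    def __init__(self, send: Callable[[str], str]):
        self._send = send  # hypothetical transport, e.g. a wrapper around an HTTP call

    def complete(self, prompt: str) -> str:
        return self._send(prompt)

class AdventureImporter:
    """Caller: owns the prompt logic and interprets the replies."""

    def __init__(self, ai: AIManager):
        self.ai = ai

    def extract_room(self, room_id: str, coords, room_text: str) -> dict:
        prompt = (
            f"Analyze the following text for Room {room_id}. "
            f"The room is located at map coordinates {list(coords)}. "
            "Return a single, valid JSON object with no other text.\n"
            f"Text: {room_text}"
        )
        return json.loads(self.ai.complete(prompt))

# A fake transport stands in for the real model so the sketch runs offline.
sent = []
def fake_send(prompt: str) -> str:
    sent.append(prompt)
    return '{"room_id": "1", "title": "The Grand Hall"}'

importer = AdventureImporter(AIManager(fake_send))
room = importer.extract_room("1", [120, 340], "A vast hall...")
print(room["title"])
```

Because AIManager knows nothing about adventures, the same pipe can serve other callers, and the Importer can be tested with a fake transport as shown.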