Manual research synthesis is slow and prone to error. Learn how cumulative prompting and strict JSON validation create a robust pipeline for studies.

The script is essentially building a specialized expert out of a general-purpose model, one layer of context at a time, by providing a total environment—a sandbox of information—to ensure the highest possible chance of a valid, scientifically useful response.
https://github.com/carljuneau/scaiences/blob/master/studies%2Fllm-rob%2Fsrc%2Frun_models.py


Cumulative prompting is a method of building an AI's context in stages, where each subsequent level of instruction retains all previous information while adding new, more complex layers. In this specific script, the process is divided into four conditions: Condition A provides the baseline study and ID; Condition B adds a specific JSON schema and risk-of-bias criteria; Condition C introduces "training material" or methodological textbooks; and Condition D adds a "worked example" featuring a completed assessment of a separate study to serve as a gold-standard reference for the model.
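The layering described above can be sketched as a single prompt-builder in which each condition keeps every earlier layer and appends one more. This is a minimal sketch, not the script's actual code: the function and parameter names (`build_prompt`, `study_text`, `training_material`, `worked_example`) are hypothetical.

```python
def build_prompt(condition, study_text, study_id, schema=None,
                 training_material=None, worked_example=None):
    """Assemble the prompt cumulatively: each condition retains all
    earlier layers and adds exactly one new layer on top.
    Conditions are the single letters "A" through "D", so simple
    string comparison ("C" >= "B") expresses the nesting."""
    parts = [f"Study ID: {study_id}", study_text]              # Condition A: baseline
    if condition >= "B":                                        # B: schema + criteria
        parts.append(f"Return JSON matching this schema:\n{schema}")
    if condition >= "C":                                        # C: training material
        parts.append(f"Training material:\n{training_material}")
    if condition >= "D":                                        # D: gold-standard example
        parts.append(f"Worked example (completed assessment):\n{worked_example}")
    return "\n\n".join(parts)
```

Because each condition is a strict superset of the previous one, any difference in model performance between conditions can be attributed to the single layer that was added.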
The script employs a "guilty until proven innocent" approach to AI responses through a strict validation gate. After the model generates a response, the script parses the JSON to ensure it matches the required schema and verifies that the study_id in the output matches the input study to prevent hallucinations. Furthermore, the script requires the AI to provide a "verbatim quote" from the PDF for every judgment made, forcing the model to anchor its scientific claims in the actual text of the study.
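A validation gate of this kind might look like the following sketch. The field names (`study_id`, `judgments`, `verbatim_quote`) are assumptions about the schema, not taken from the script itself.

```python
import json

def validate_response(raw, expected_study_id):
    """Guilty-until-proven-innocent gate: reject the response unless it
    parses as JSON, carries the expected study_id, and anchors every
    judgment in a non-empty verbatim quote from the PDF."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"parse error: {e}"
    if data.get("study_id") != expected_study_id:
        return False, "study_id mismatch (possible hallucination)"
    for domain, judgment in data.get("judgments", {}).items():
        if not judgment.get("verbatim_quote", "").strip():
            return False, f"missing verbatim quote for domain: {domain}"
    return True, data
```

Returning a `(passed, payload_or_error)` pair keeps the gate composable: the caller can retry on failure without needing to know which specific check was violated.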
The PROMPT_BRIDGE is a specific string used as a linguistic marker to help the model transition from reading training materials or examples to performing the actual task on the target study, which is crucial for maintaining focus in long-context windows. Setting the temperature to zero makes decoding greedy, so the model's output is as deterministic and consistent as the provider allows rather than creative. In a research context, this "boring" consistency is preferred because it prioritizes accuracy and reproducibility over varied or "fun" responses.
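The role of the bridge string can be illustrated with a short sketch. The exact wording of the real PROMPT_BRIDGE is not shown here; this text and the helper name `assemble_final_prompt` are invented for illustration.

```python
# Hypothetical bridge text; the script's actual wording may differ.
PROMPT_BRIDGE = (
    "--- END OF REFERENCE MATERIAL ---\n"
    "Now assess the following target study using the schema above."
)

def assemble_final_prompt(reference_blocks, target_study_text):
    """Place the bridge between the reference material and the target
    study so the model has an explicit cue that the task has started."""
    return "\n\n".join(reference_blocks + [PROMPT_BRIDGE, target_study_text])
```

When calling the model, the request would then pair this prompt with `temperature=0` so repeated runs on the same study produce the same assessment.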
The script uses a robust two-strike system for errors. If a response fails to parse or validate on the first attempt, the script logs the error and automatically tries a second time. If it fails again, the script records the failure in a detailed CSV log (parse_failures.csv) including the study ID, model used, and the specific error message. This allows researchers to audit the process and identify if certain models or prompts are consistently struggling with specific types of data.
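The two-strike loop with CSV logging can be sketched as follows. The function names (`run_with_retry`, `call_model`) and the CSV column layout are assumptions; only the `parse_failures.csv` filename comes from the article.

```python
import csv
import pathlib

def run_with_retry(call_model, validate, study_id, model_name,
                   log_path="parse_failures.csv", max_attempts=2):
    """Two-strike policy: if a response fails to parse or validate,
    try once more; on a second failure, append a row to the CSV log
    so researchers can audit which models struggle with which studies."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model()
        ok, result = validate(raw)
        if ok:
            return result
        last_error = result              # keep the most recent error message
    path = pathlib.Path(log_path)
    write_header = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["study_id", "model", "error"])
        writer.writerow([study_id, model_name, last_error])
    return None
```

Returning `None` rather than raising keeps a single bad study from aborting a batch run over hundreds of PDFs; the failure is recorded and the loop moves on.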
The PromptAssetResolver is a dynamic discovery tool that allows the script to find necessary files—like training PDFs or example JSONs—without the researcher having to hardcode every file path. It uses a scoring system to search through directories for keywords like "mulder" or "green" to identify the best-fit materials for the prompt. This makes the script flexible and portable, allowing different researchers to use the same logic even if their file naming conventions vary slightly, while still "failing loudly" if it finds ambiguous or conflicting files.
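The keyword-scoring idea behind the resolver can be sketched like this. The function name `resolve_asset` and the scoring rule (count of keyword hits in the filename) are simplifications of whatever the real PromptAssetResolver does.

```python
import pathlib

def resolve_asset(root, keywords, suffix=".pdf"):
    """Score candidate files under `root` by how many keywords appear in
    the (lowercased) filename. Fail loudly if nothing matches or if two
    candidates tie for the best score, rather than guessing silently."""
    scores = {}
    for path in pathlib.Path(root).rglob(f"*{suffix}"):
        score = sum(kw.lower() in path.name.lower() for kw in keywords)
        if score > 0:
            scores[path] = score
    if not scores:
        raise FileNotFoundError(f"no {suffix} asset matching {keywords} under {root}")
    best = max(scores.values())
    winners = [p for p, s in scores.items() if s == best]
    if len(winners) > 1:
        raise ValueError(f"ambiguous assets, refusing to guess: {winners}")
    return winners[0]
```

The two exceptions are the "failing loudly" behavior the article describes: an ambiguous or missing asset stops the run immediately instead of letting a silently wrong training file contaminate the results.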
