Abstract
This thesis investigates whether large language models (LLMs) can improve their forecasting capabilities through self-reflection. While recent studies have explored LLMs' ability to predict future events, they typically rely on a limited set of human-crafted prompts. This thesis instead tests whether LLMs can iteratively analyze their own forecasting performance and refine their prompting strategies. Across nine LLMs, including recent reasoning models, the experiments show that the models fail to use self-reflection to improve their forecasting performance.