Proposed Prompt Engineering Evaluation Metrics (PEEM), a novel framework designed to assess a prompt's linguistic quality and task alignment.
To validate PEEM's evaluation capability, we applied its results to guide prompt rewriting.
To further test our method, we compared PEEM-based rewriting against existing prompt optimization methods.
We conducted experiments demonstrating that our framework achieves performance comparable to reinforcement learning-based fine-tuning methods, even in zero-shot settings.
Outcome
The 34th International Joint Conference on Artificial Intelligence (IJCAI 2025) / Submitted