From 5ca0beadfbec3bddf33a21623899e0118543cc14 Mon Sep 17 00:00:00 2001
From: OpenHands
Date: Sun, 5 Jan 2025 05:49:38 +0900
Subject: [PATCH] Fix issue #5995: [Resolver] Resolver's summary suggests UNRESOLVED due to "no human reviewer" (#5996)

Co-authored-by: Xingyao Wang
Co-authored-by: Graham Neubig
---
 .../prompts/guess_success/issue-success-check.jinja      | 8 ++++----
 .../prompts/guess_success/pr-feedback-check.jinja        | 8 ++++----
 .../resolver/prompts/guess_success/pr-review-check.jinja | 8 ++++----
 .../resolver/prompts/guess_success/pr-thread-check.jinja | 8 ++++----
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/openhands/resolver/prompts/guess_success/issue-success-check.jinja b/openhands/resolver/prompts/guess_success/issue-success-check.jinja
index 5e882cb731..a2588c4562 100644
--- a/openhands/resolver/prompts/guess_success/issue-success-check.jinja
+++ b/openhands/resolver/prompts/guess_success/issue-success-check.jinja
@@ -1,4 +1,4 @@
-Given the following issue description and the last message from an AI agent attempting to fix it, determine if the issue has been successfully resolved.
+Given the following issue description and the last message from an AI agent attempting to fix it, determine if the issue has been successfully resolved based on the changes made and their expected impact. Make your own judgment based on the evidence provided - do NOT defer to or wait for human review.

 Issue description:
 {{ issue_context }}
@@ -6,10 +6,10 @@ Issue description:
 Last message from AI agent:
 {{ last_message }}

-(1) has the issue been successfully resolved?
-(2) If the issue has been resolved, please provide an explanation of what was done in the PR that can be sent to a human reviewer on github. If the issue has not been resolved, please provide an explanation of why.
+(1) Based on the changes made and their expected impact, has the issue been successfully resolved?
+(2) Provide a clear explanation of what was done in the PR and whether it addresses the issue. Focus on the concrete changes and their expected effects, not on the need for external verification.

-Answer in exactly the format below, with only true or false for success, and an explanation of the result.
+Answer in exactly the format below, with only true or false for success, and an explanation of the result that focuses on the actual changes and their impact.

 --- success
 true/false
diff --git a/openhands/resolver/prompts/guess_success/pr-feedback-check.jinja b/openhands/resolver/prompts/guess_success/pr-feedback-check.jinja
index d6ed1e45d3..af7be5cfcd 100644
--- a/openhands/resolver/prompts/guess_success/pr-feedback-check.jinja
+++ b/openhands/resolver/prompts/guess_success/pr-feedback-check.jinja
@@ -1,4 +1,4 @@
-You are given one or more issue descriptions, a piece of feedback to resolve the issues, and the last message from an AI agent attempting to incorporate the feedback. If the feedback is addressed to a specific code file, then the file locations will be provided as well. Determine if the feedback has been successfully resolved.
+You are given one or more issue descriptions, a piece of feedback to resolve the issues, and the last message from an AI agent attempting to incorporate the feedback. If the feedback is addressed to a specific code file, then the file locations will be provided as well. Based on the changes made and their expected impact, determine if the feedback has been successfully resolved. Make your own judgment based on the evidence provided - do NOT defer to or wait for human review.

 Issue descriptions:
 {{ issue_context }}
@@ -15,10 +15,10 @@ Changes made (git patch):
 Last message from AI agent:
 {{ last_message }}

-(1) has the feedback been successfully incorporated?
-(2) If the feedback has been incorporated, please provide an explanation of what was done that can be sent to a human reviewer on github. If the feedback has not been resolved, please provide an explanation of why.
+(1) Based on the changes made and their expected impact, has the feedback been successfully incorporated?
+(2) Provide a clear explanation of what was done and whether it addresses the feedback. Focus on the concrete changes and their expected effects, not on the need for external verification.

-Answer in exactly the format below, with only true or false for success, and an explanation of the result.
+Answer in exactly the format below, with only true or false for success, and an explanation that focuses on the actual changes and their impact.

 --- success
 true/false
diff --git a/openhands/resolver/prompts/guess_success/pr-review-check.jinja b/openhands/resolver/prompts/guess_success/pr-review-check.jinja
index 1773576fc0..0820d68f52 100644
--- a/openhands/resolver/prompts/guess_success/pr-review-check.jinja
+++ b/openhands/resolver/prompts/guess_success/pr-review-check.jinja
@@ -1,4 +1,4 @@
-You are given one or more issue descriptions, the PR review comments, and the last message from an AI agent attempting to address the feedback. Determine if the feedback has been successfully resolved.
+You are given one or more issue descriptions, the PR review comments, and the last message from an AI agent attempting to address the feedback. Based on the changes made and their expected impact, determine if the feedback has been successfully resolved. Make your own judgment based on the evidence provided - do NOT defer to or wait for human review.

 Issue descriptions:
 {{ issue_context }}
@@ -12,10 +12,10 @@ Changes made (git patch):
 Last message from AI agent:
 {{ last_message }}

-(1) has the feedback been successfully incorporated?
-(2) If the feedback has been incorporated, please provide an explanation of what was done that can be sent to a human reviewer on github. If the feedback has not been resolved, please provide an explanation of why.
+(1) Based on the changes made and their expected impact, has the feedback been successfully incorporated?
+(2) Provide a clear explanation of what was done and whether it addresses the feedback. Focus on the concrete changes and their expected effects, not on the need for external verification.

-Answer in exactly the format below, with only true or false for success, and an explanation of the result.
+Answer in exactly the format below, with only true or false for success, and an explanation that focuses on the actual changes and their impact.

 --- success
 true/false
diff --git a/openhands/resolver/prompts/guess_success/pr-thread-check.jinja b/openhands/resolver/prompts/guess_success/pr-thread-check.jinja
index d0a8e2811a..7c483c2b0e 100644
--- a/openhands/resolver/prompts/guess_success/pr-thread-check.jinja
+++ b/openhands/resolver/prompts/guess_success/pr-thread-check.jinja
@@ -1,4 +1,4 @@
-You are given one or more issue descriptions, the PR thread comments, and the last message from an AI agent attempting to address the feedback. Determine if the feedback has been successfully resolved.
+You are given one or more issue descriptions, the PR thread comments, and the last message from an AI agent attempting to address the feedback. Based on the changes made and their expected impact, determine if the feedback has been successfully resolved. Make your own judgment based on the evidence provided - do NOT defer to or wait for human review.

 Issue descriptions:
 {{ issue_context }}
@@ -12,10 +12,10 @@ Changes made (git patch):
 Last message from AI agent:
 {{ last_message }}

-(1) has the feedback been successfully incorporated?
-(2) If the feedback has been incorporated, please provide an explanation of what was done that can be sent to a human reviewer on github. If the feedback has not been resolved, please provide an explanation of why.
+(1) Based on the changes made and their expected impact, has the feedback been successfully incorporated?
+(2) Provide a clear explanation of what was done and whether it addresses the feedback. Focus on the concrete changes and their expected effects, not on the need for external verification.

-Answer in exactly the format below, with only true or false for success, and an explanation of the result.
+Answer in exactly the format below, with only true or false for success, and an explanation that focuses on the actual changes and their impact.

 --- success
 true/false