mirror of
https://github.com/OpenHands/OpenHands.git
synced 2025-12-26 05:48:36 +08:00
SWE-bench_verified instruction baseline improvements to 60% (#7546)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
This commit is contained in:
parent
d924e7cea5
commit
53c0c5a07b
@ -84,39 +84,54 @@ I've already taken care of all changes to any of the test files described in the
|
||||
Also the development Python environment is already set up for you (i.e., all dependencies already installed), so you don't need to install other packages.
|
||||
Your task is to make the minimal changes to non-test files in the /workspace/{workspace_dir_name} directory to ensure the <issue_description> is satisfied.
|
||||
|
||||
Follow these steps to resolve the issue:
|
||||
Follow these phases to resolve the issue:
|
||||
|
||||
1. EXPLORATION: First, thoroughly explore the repository structure using tools like `find` and `grep`.
|
||||
- Identify all files mentioned in the problem statement
|
||||
- Locate where the issue occurs in the codebase
|
||||
- Understand the surrounding context and dependencies
|
||||
- Use `grep` to search for relevant functions, classes, or error messages
|
||||
Phase 1. READING: read the problem and reword it in clearer terms
|
||||
1.1 If there are code or config snippets. Express in words any best practices or conventions in them.
|
||||
1.2 Hightlight message errors, method names, variables, file names, stack traces, and technical details.
|
||||
1.3 Explain the problem in clear terms.
|
||||
1.4 Enumerate the steps to reproduce the problem.
|
||||
1.5 Hightlight any best practices to take into account when testing and fixing the issue
|
||||
|
||||
2. ANALYSIS: Based on your exploration, think carefully about the problem and propose 2-5 possible approaches to fix the issue.
|
||||
- Analyze the root cause of the problem
|
||||
- Consider trade-offs between different solutions
|
||||
- Select the most promising approach and explain your reasoning
|
||||
Phase 2. RUNNING: install and run the tests on the repository
|
||||
2.1 Follow the readme
|
||||
2.2 Install the environment and anything needed
|
||||
2.2 Iterate and figure out how to run the tests
|
||||
|
||||
3. TEST CREATION: Before implementing any fix, create a script to reproduce and verify the issue.
|
||||
- Look at existing test files in the repository to understand the test format/structure
|
||||
- Create a minimal reproduction script that demonstrates the issue
|
||||
- Run your script to confirm the error exists
|
||||
Phase 3. EXPLORATION: find the files that are related to the problem and possible solutions
|
||||
3.1 Use `grep` to search for relevant methods, classes, keywords and error messages.
|
||||
3.2 Identify all files related to the problem statement.
|
||||
3.3 Propose the methods and files to fix the issue and explain why.
|
||||
3.4 From the possible file locations, select the most likely location to fix the issue.
|
||||
|
||||
4. IMPLEMENTATION: Edit the source code to implement your chosen solution.
|
||||
- Make minimal, focused changes to fix the issue
|
||||
Phase 4. TEST CREATION: before implementing any fix, create a script to reproduce and verify the issue.
|
||||
4.1 Look at existing test files in the repository to understand the test format/structure.
|
||||
4.2 Create a minimal reproduction script that reproduces the located issue.
|
||||
4.3 Run the reproduction script to confirm you are reproducing the issue.
|
||||
4.4 Adjust the reproduction script as necessary.
|
||||
|
||||
5. VERIFICATION: Test your implementation thoroughly.
|
||||
- Run your reproduction script to verify the fix works
|
||||
- Add edge cases to your test script to ensure comprehensive coverage
|
||||
- Run existing tests related to the modified code to ensure you haven't broken anything
|
||||
Phase 5. FIX ANALYSIS: state clearly the problem and how to fix it
|
||||
5.1 State clearly what the problem is.
|
||||
5.2 State clearly where the problem is located.
|
||||
5.3 State clearly how the test reproduces the issue.
|
||||
5.4 State clearly the best practices to take into account in the fix.
|
||||
5.5 State clearly how to fix the problem.
|
||||
|
||||
6. FINAL REVIEW: Carefully re-read the problem description and compare your changes with the base commit {instance["base_commit"]}.
|
||||
- Ensure you've fully addressed all requirements
|
||||
- Run any tests in the repository related to:
|
||||
* The issue you are fixing
|
||||
* The files you modified
|
||||
* The functions you changed
|
||||
- If any tests fail, revise your implementation until all tests pass
|
||||
Phase 6. FIX IMPLEMENTATION: Edit the source code to implement your chosen solution.
|
||||
6.1 Make minimal, focused changes to fix the issue.
|
||||
|
||||
Phase 7. VERIFICATION: Test your implementation thoroughly.
|
||||
7.1 Run your reproduction script to verify the fix works.
|
||||
7.2 Add edge cases to your test script to ensure comprehensive coverage.
|
||||
7.3 Run existing tests related to the modified code to ensure you haven't broken anything.
|
||||
|
||||
8. FINAL REVIEW: Carefully re-read the problem description and compare your changes with the base commit {instance["base_commit"]}.
|
||||
8.1 Ensure you've fully addressed all requirements.
|
||||
8.2 Run any tests in the repository related to:
|
||||
8.2.1 The issue you are fixing
|
||||
8.2.2 The files you modified
|
||||
8.2.3 The functions you changed
|
||||
8.3 If any tests fail, revise your implementation until all tests pass
|
||||
|
||||
Be thorough in your exploration, testing, and reasoning. It's fine if your thinking process is lengthy - quality and completeness are more important than brevity.
|
||||
"""
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user