
11 Commits

Author SHA1 Message Date
Jim Su
b1b96df8a8 Replace environment variables with configuration file (#339)
* Replace environment variables with configuration file

* Add config.toml to .gitignore

* Remove unused os imports

* Update README.md

* Update README.md

* Update README.md

* Fix merge conflict

* Fallback to environment variables

* Use template file for config.toml

* Update config.toml.template

* Update config.toml.template

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:26:20 -04:00
Anas DORBANI
7c27e59918 feat: Ad/regression tests using pytest (#329)
* Remove all the unnecessary files

* Create and finalize the regression testing framework and add a hello world test case

* Update requirements.txt

* Update the test function to execute the generate script
2024-03-28 23:40:30 -04:00
iFurySt
89abc5e253 fix: move the makefile to correct path. (#252) 2024-03-27 23:53:40 +08:00
iFurySt
8b9fc3df28 feat: add workflow to ghcr (#237)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-03-27 23:10:34 +08:00
zch-cc
e5a28cba2f Evaluation: Fix bug on python path on run.sh (#98)
* Move regression tests to evaluation/

* use python instead of docker in the script

* add model para

* change python to python3

* bug fix

* add python path

* add readme
2024-03-23 00:01:48 +08:00
zch-cc
cfefc47439 Move regression tests to evaluation/ (#86)
* Move regression tests to evaluation/

* use python instead of docker in the script

* add model para

* change python to python3

* bug fix
2024-03-22 23:26:37 +08:00
libowen2121
40a3614e80 Add a roadmap for eval (#92) 2024-03-22 20:27:30 +08:00
Xingyao Wang
2d5c8f1060 change to OpenDevin fork (#89) 2024-03-22 18:30:12 +08:00
Xingyao Wang
5ff96111f0 A starting point for SWE-Bench Evaluation with docker (#60)
* a starting point for SWE-Bench evaluation with docker

* fix the swe-bench uid issue

* typo fixed

* fix conda missing issue

* move files based on new PR

* Update doc and gitignore using devin prediction file from #81

* fix typo

* add a sentence

* fix typo in path

* fix path

---------

Co-authored-by: Binyuan Hui <binyuan.hby@alibaba-inc.com>
2024-03-22 12:43:49 +08:00
Jiaxin Pei
dc88dac296 adding a script to fetch and convert devin's output for evaluation (#81)
* adding code to fetch and convert devin's output for evaluation

* update README.md

* update code for fetching and processing devin's outputs

* update code for fetching and processing devin's outputs
2024-03-22 01:33:01 +08:00
Binyuan Hui
f99f4ebdaa fix: typo in the evaluation folder name. (#66) 2024-03-20 23:00:09 +08:00