update readme

yuruo
2025-03-18 09:51:52 +08:00
parent 354eb7113b
commit 5e0be922f3
14 changed files with 52 additions and 327184 deletions


@@ -11,15 +11,6 @@
https://github.com/user-attachments/assets/bf27f8bd-136b-402e-bc7d-994b99bcc368
Supported models:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
</div>
@@ -73,6 +64,17 @@ python main.py
```
Then open `http://localhost:7888/` in your browser to configure your API key and basic settings.
Supported models:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
## 📝 FAQ
### 🔧 CUDA Version Mismatch
@@ -86,15 +88,10 @@ We recommend using an NVIDIA graphics card with at least 4GB of VRAM, although y
For example, if your CUDA version is 12.4, install torch using this command:
```bash
pip3 uninstall -y torch torchvision
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```
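The `cu124` suffix in the index URL is simply the CUDA version with the dot removed. As an illustration (this helper is hypothetical, not part of the project), the mapping can be sketched as:

```python
def torch_index_url(cuda_version: str) -> str:
    """Map a CUDA version string like '12.4' to the matching PyTorch wheel index URL."""
    tag = "cu" + cuda_version.replace(".", "")
    return f"https://download.pytorch.org/whl/{tag}"

# For CUDA 12.4 this yields the URL used in the command above:
print(torch_index_url("12.4"))  # https://download.pytorch.org/whl/cu124
```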
### Model Download Issues
If you're having trouble downloading models (possibly due to network restrictions), you can download them directly from Baidu Cloud:
File: weights.zip
Link: https://pan.baidu.com/s/1Tj8sZZK9_QI7whZV93vb0w?pwd=dyeu
Password: dyeu
## 🤝 Contributing
@@ -102,8 +99,6 @@ Every excellent open-source project embodies collective wisdom. autoMate's growt
Join us in creating a more intelligent future.
> Strongly recommend reading ["How To Ask Questions The Smart Way"](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way), ["How to Ask Questions to Open Source Community"](https://github.com/seajs/seajs/issues/545), ["How to Report Bugs Effectively"](http://www.chiark.greenend.org.uk/%7Esgtatham/bugs.html), and ["How to Submit Unanswerable Questions to Open Source Projects"](https://zhuanlan.zhihu.com/p/25795393) for better support.
<a href="https://github.com/yuruotong1/autoMate/graphs/contributors">
<img src="https://contrib.rocks/image?repo=yuruotong1/autoMate" />
</a>


@@ -9,17 +9,6 @@
https://github.com/user-attachments/assets/bf27f8bd-136b-402e-bc7d-994b99bcc368
Supported models:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
</div>
> Special note: the autoMate project is still at a very early stage; its current capabilities cannot yet solve every problem, and for now it is intended for learning and exchange only. That said, I will keep looking for breakthroughs and continuously integrate the latest technology. If you have any questions, you can also add me on WeChat to join the discussion group.
@@ -71,6 +60,18 @@ python main.py
Then open `http://localhost:7888/` in your browser to configure your API key and basic settings.
The currently supported models are:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
## 📝 FAQ
### 🔧 CUDA Version Mismatch
@@ -84,20 +85,17 @@ python main.py
For example, if your CUDA version is 12.4, install torch with the following command:
```bash
pip3 uninstall -y torch torchvision
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```
## 🤝 Contributing
Every excellent open-source project embodies collective wisdom. autoMate's growth depends on your participation and contributions. Whether you fix bugs, add features, or improve documentation, every effort you make helps free thousands of people from the shackles of repetitive work.
Join us in creating a more intelligent future.
> We strongly recommend reading ["How To Ask Questions The Smart Way"](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way), ["How to Ask Questions to the Open Source Community"](https://github.com/seajs/seajs/issues/545), ["How to Report Bugs Effectively"](http://www.chiark.greenend.org.uk/%7Esgtatham/bugs-cn.html), and ["How to Submit Unanswerable Questions to Open Source Projects"](https://zhuanlan.zhihu.com/p/25795393); better questions are more likely to get help.
<a href="https://github.com/yuruotong1/autoMate/graphs/contributors">
<img src="https://contrib.rocks/image?repo=yuruotong1/autoMate" />
</a>


@@ -11,14 +11,6 @@
https://github.com/user-attachments/assets/bf27f8bd-136b-402e-bc7d-994b99bcc368
Supported models:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
</div>
> Special note: the autoMate project is still at a very early stage. Its current capabilities are limited, and it is mainly intended for learning and communication. However, we are continuously pursuing breakthroughs and integrating the latest technology; if you have any questions, feel free to reach out on WeChat.
@@ -83,13 +75,19 @@ python main.py
For example, if your CUDA version is 12.4, install torch with the following command:
```bash
pip3 uninstall -y torch torchvision
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```
The currently supported models are:
| Vendor | Model |
| --- | --- |
| [openainext](https://api.openai-next.com) | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20 |
| [yeka](https://2233.ai/api) | gpt-4o, o1 |
| openai | gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, o1, gpt-4.5-preview-2025-02-27 |
File: weights.zip
Link: https://pan.baidu.com/s/1Tj8sZZK9_QI7whZV93vb0w?pwd=dyeu
Password: dyeu
## 🤝 Contributing
@@ -97,8 +95,6 @@ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.o
Let's create a more intelligent future together.
> For better support, we strongly recommend reading ["How To Ask Questions The Smart Way"](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way), ["How to Ask Questions to the Open Source Community"](https://github.com/seajs/seajs/issues/545), ["How to Report Bugs Effectively"](http://www.chiark.greenend.org.uk/%7Esgtatham/bugs.html), and ["How to Submit Unanswerable Questions to Open Source Projects"](https://zhuanlan.zhihu.com/p/25795393).
<a href="https://github.com/yuruotong1/autoMate/graphs/contributors">
<img src="https://contrib.rocks/image?repo=yuruotong1/autoMate" />
</a>


@@ -21,7 +21,7 @@ class UIElement(BaseModel):
text: Optional[str] = None
class VisionAgent:
def __init__(self, yolo_model_path: str, caption_model_path: str):
def __init__(self, yolo_model_path: str, florence_model_path: str):
"""
Initialize the vision agent
@@ -36,43 +36,24 @@ class VisionAgent:
# load the image caption model and processor
self.caption_processor = AutoProcessor.from_pretrained(
"weights/AI-ModelScope/Florence-2-base-ft",
trust_remote_code=True,
local_files_only=True
)
config = AutoConfig.from_pretrained(
"weights/AI-ModelScope/Florence-2-base-ft", # points to the directory containing configuration_florence2.py
florence_model_path,
trust_remote_code=True,
local_files_only=True
)
try:
# Change: load both the model and its weights from the florence directory
florence_base_path = "weights/AI-ModelScope/Florence-2-base-ft"
# Load the full model (weights included) directly from the florence directory
try:
self.caption_model = AutoModelForCausalLM.from_pretrained(
florence_base_path, # the full directory containing both code and weights
florence_model_path, # the full directory containing both code and weights
torch_dtype=self.dtype,
trust_remote_code=True,
local_files_only=True
).to(self.device)
"processor",
trust_remote_code=True
)
try:
self.caption_model = AutoModelForCausalLM.from_pretrained(
caption_model_path,
torch_dtype=self.dtype,
trust_remote_code=True
).to(self.device)
# No extra weight loading is needed; the weights are already included in florence_base_path
except Exception as e:
print(f"Model loading failed: {e}")
print(f"Model loading failed for path: {caption_model_path}")
raise e
self.prompt = "<CAPTION>"
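The net effect of the hunk above is that every `from_pretrained` call in `VisionAgent` now receives the same `florence_model_path` parameter instead of a hard-coded directory. A minimal sketch of that wiring (the constant value and the helper name are assumptions for illustration; the actual transformers calls are not made here):

```python
import os

# Assumed to match FLORENCE_DIR exported by util.download_weights
FLORENCE_DIR = os.path.join("weights", "AI-ModelScope", "Florence-2-base-ft")

def florence_load_kwargs(florence_model_path: str) -> dict:
    # Shared kwargs for the AutoProcessor / AutoConfig / AutoModelForCausalLM calls
    return {
        "pretrained_model_name_or_path": florence_model_path,
        "trust_remote_code": True,   # Florence-2 ships custom modeling code
        "local_files_only": True,    # weights were downloaded ahead of time
    }

kwargs = florence_load_kwargs(FLORENCE_DIR)
```

Centralizing the path this way means swapping in a different local Florence checkout only requires changing one constructor argument.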


@@ -14,7 +14,7 @@ from gradio_ui.loop import (
import base64
from xbrain.utils.config import Config
from util.download_weights import OMNI_PARSER_MODEL_DIR
from util.download_weights import OMNI_PARSER_DIR, FLORENCE_DIR
CONFIG_DIR = Path("~/.anthropic").expanduser()
API_KEY_FILE = CONFIG_DIR / "api_key"
@@ -320,8 +320,8 @@ def run():
model.change(fn=update_model, inputs=[model, state], outputs=None)
api_key.change(fn=update_api_key, inputs=[api_key, state], outputs=None)
chatbot.clear(fn=clear_chat, inputs=[state], outputs=[chatbot])
vision_agent = VisionAgent(yolo_model_path=os.path.join(OMNI_PARSER_MODEL_DIR, "icon_detect", "model.pt"),
caption_model_path=os.path.join(OMNI_PARSER_MODEL_DIR, "icon_caption"))
vision_agent = VisionAgent(yolo_model_path=os.path.join(OMNI_PARSER_DIR, "icon_detect", "model.pt"),
florence_model_path=FLORENCE_DIR)
vision_agent_state = gr.State({"agent": vision_agent})
submit_button.click(process_input, [chat_input, state, vision_agent_state], [chatbot, task_list])
stop_button.click(stop_app, [state], None)


@@ -1,6 +1,3 @@
# import os
# os.environ["HF_ENDPOINT"] = "https://hf-mirror.com/"
from gradio_ui import app
from util import download_weights
@@ -11,9 +8,7 @@ def run():
print("Warning: GPU is not available; we will use the CPU, so the application may run slower!\nYour computer will very likely heat up!")
print("Downloading the weight files...")
# download the weight files
# download_weights.download_models()
# Configure the HuggingFace mirror
# print("HuggingFace mirror configured to use ModelScope registry")
download_weights.download()
app.run()

File diff suppressed because it is too large.

File diff suppressed because it is too large.


@@ -1,33 +0,0 @@
{
  "auto_map": {
    "AutoProcessor": "microsoft/Florence-2-base--processing_florence2.Florence2Processor"
  },
  "crop_size": {
    "height": 768,
    "width": 768
  },
  "do_center_crop": false,
  "do_convert_rgb": null,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.485,
    0.456,
    0.406
  ],
  "image_processor_type": "CLIPImageProcessor",
  "image_seq_length": 577,
  "image_std": [
    0.229,
    0.224,
    0.225
  ],
  "processor_class": "Florence2Processor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 768,
    "width": 768
  }
}

File diff suppressed because it is too large.

File diff suppressed because it is too large.

File diff suppressed because it is too large.

File diff suppressed because one or more lines are too long


@@ -1,22 +1,21 @@
import os
from pathlib import Path
from modelscope import snapshot_download
__WEIGHTS_DIR = Path("weights")
OMNI_PARSER_MODEL_DIR = os.path.join(__WEIGHTS_DIR, "AI-ModelScope", "OmniParser-v2___0")
def __download_omni_parser():
__WEIGHTS_DIR = Path("weights")
OMNI_PARSER_DIR = os.path.join(__WEIGHTS_DIR, "AI-ModelScope", "OmniParser-v2___0")
FLORENCE_DIR = os.path.join(__WEIGHTS_DIR, "AI-ModelScope", "Florence-2-base-ft")
def download():
# Create weights directory
__WEIGHTS_DIR.mkdir(exist_ok=True)
snapshot_download(
'AI-ModelScope/OmniParser-v2.0',
cache_dir='weights'
cache_dir='weights',
)
def download_models():
__download_omni_parser()
snapshot_download(
'AI-ModelScope/Florence-2-base-ft',
cache_dir='weights' )
if __name__ == "__main__":
download_models()
download()
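Putting the added lines of this last hunk together, the module after the commit reduces to module-level path constants plus a single `download()` entry point. The sketch below is a reconstruction from the diff; the assumption that `download()` also fetches Florence-2-base-ft follows from `FLORENCE_DIR` being imported by the UI code.

```python
import os
from pathlib import Path

__WEIGHTS_DIR = Path("weights")
OMNI_PARSER_DIR = os.path.join(__WEIGHTS_DIR, "AI-ModelScope", "OmniParser-v2___0")
FLORENCE_DIR = os.path.join(__WEIGHTS_DIR, "AI-ModelScope", "Florence-2-base-ft")

def download():
    # Create the weights directory, then fetch both model snapshots.
    __WEIGHTS_DIR.mkdir(exist_ok=True)
    # Imported lazily here so the constants above don't require modelscope.
    from modelscope import snapshot_download
    snapshot_download('AI-ModelScope/OmniParser-v2.0', cache_dir='weights')
    snapshot_download('AI-ModelScope/Florence-2-base-ft', cache_dir='weights')

if __name__ == "__main__":
    download()
```

Exposing both directories as importable constants is what lets `gradio_ui` build `yolo_model_path` and `florence_model_path` without duplicating the directory layout.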