XHS-Downloader


🔥 RedNote Link Extraction/Content Collection Tool: extract links to works published, favorited, and liked by an account; extract works links and user links from search results; collect RedNote works information; extract RedNote works download addresses; download watermark-free RedNote works files!

🔥 "RedNote", "XiaoHongShu" and "小红书" have the same meaning, and this project is collectively referred to as "RedNote".

⭐ This project is completely free and open-source, with no paid features. Please do not be deceived!

⭐ Due to the author's limited time, the English documentation is not always updated promptly and may be outdated. Parts of it are machine-translated and may contain errors, so please refer to the Chinese documentation when in doubt. Contributions to the translation are warmly welcome.

📑 Project Features

⭐ The development plan and progress of XHS-Downloader can be found at Projects

📸 Program Screenshots


🔗 Supported Links

🪟 About the Terminal

⭐ It is recommended to use the Windows Terminal (default terminal for Windows 11) to run the program for the best display effect!

🥣 Usage

If you only need to download watermark-free works files, it is recommended to choose Program Run; if you have other needs, it is recommended to choose Source Code Run!

Starting from version 2.2, if there are no abnormalities in project functionality, there is no need to handle cookies separately!

🖱 Program Run

⭐ Users of macOS and Windows 10 or later can download the program package from Releases or Actions, unzip it, open the program folder, and double-click main to run it.

⭐ This project includes GitHub Actions workflows that build executable files automatically. Users can run GitHub Actions at any time to build the latest source code into executable files!

Note: On macOS, the executable file main may need to be launched from the terminal command line. Due to device limitations, the macOS executable has not been tested, and its availability cannot be guaranteed!

If you use the program in this way, the default download path for files is: .\_internal\Download; the configuration file path is: .\_internal\settings.json

⌨️ Docker Run

  1. Get Image
  2. Create Container
  3. Run Container

When running the project via Docker, the command line call mode is not supported. The clipboard reading and clipboard monitoring functions are unavailable, but pasting content works fine. Please provide feedback if other features are not functioning properly!

⌨️ Source Code Run

  1. Install the Python interpreter, version 3.12
  2. Download the latest source code of this project, or the source code published in Releases, to your local machine
  3. Open the terminal and switch to the root directory of the project
  4. Run the command pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt to install the required modules
  5. Run main.py to start the program

🛠 Command Line Mode

The project supports command line mode. If you want to download specific images from a text and image work, you can use this mode to set the image sequence number you want to download!

You can use the command line to read cookies from the browser and write to the configuration file!

Command example: python .\main.py --browser_cookie Chrome --update_settings

The bool type parameters support setting with true, false, 1, 0, yes, no, on or off (case insensitive).
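The accepted boolean spellings can be illustrated with a minimal sketch. `parse_bool` is a hypothetical helper written for this example, not the project's own parser; it only shows how the eight listed values could map to `True` and `False` regardless of case.

```python
# Hypothetical helper: maps the documented spellings to bool values.
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(value: str) -> bool:
    """Convert a command-line string to bool, ignoring case and outer whitespace."""
    text = value.strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognized boolean value: {value!r}")
```

Under this sketch, `parse_bool("Yes")` and `parse_bool("ON")` both evaluate to `True`, while an unlisted spelling raises `ValueError` instead of being guessed.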



🖥 Server Mode

Start: Run the command: python .\main.py server

Stop: Press Ctrl + C to stop the server

Open http://127.0.0.1:6666/docs or http://127.0.0.1:6666/redoc; you will see automatically generated interactive API documentation!

Request endpoint: /xhs/

Request method: POST

Request format: JSON

Request parameters:

| Parameter | Type | Description | Default |
|---|---|---|---|
| url | str | RedNote works link; the link is extracted automatically from the input; multiple links are not supported; required | None |
| download | bool | Whether to download the works files; setting it to true takes more time; optional | false |
| index | list[int] | Indexes of the image files to download; only effective for image-and-text works; ignored when download is false; optional | null |
| cookie | str | Cookie used when requesting data; optional | Cookie value from the settings |
| skip | bool | Whether to skip works that have download records; when true, data for such works is not returned; optional | false |

Code example:

import requests


def api_demo():
    server = "http://127.0.0.1:6666/xhs/"
    data = {
        "url": "https://www.xiaohongshu.com/explore/123456789",
        "download": True,
        "index": [
            3,
            6,
            9,
        ],
    }
    response = requests.post(server, json=data)
    print(response.json())
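The request parameters can also be assembled programmatically. The following is a minimal sketch, assuming only the parameter names and defaults documented for the `/xhs/` endpoint; `build_payload` is a hypothetical helper that is not part of the project. It drops `index` when `download` is false, since the documentation states the index has no effect in that case.

```python
from typing import Any, List, Optional


def build_payload(
    url: str,
    download: bool = False,
    index: Optional[List[int]] = None,
    cookie: Optional[str] = None,
    skip: bool = False,
) -> dict:
    """Assemble the JSON body for POST /xhs/, omitting unset optional fields."""
    if not url:
        raise ValueError("url is required")
    payload: dict = {"url": url, "download": download, "skip": skip}
    # index only matters when files are actually downloaded
    if index is not None and download:
        payload["index"] = index
    # leave cookie out so the server falls back to its configured value
    if cookie:
        payload["cookie"] = cookie
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the example above.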

📜 Others

🕹 User Script

If your browser has the Tampermonkey browser extension installed, you can add the user script to experience the project features without needing to download or install anything!

After successfully installing the script, open the RedNote page, check the script instructions, and follow the prompts to operate.


Note: Using the XHS-Downloader user script to batch extract works links, in combination with the XHS-Downloader program, can achieve batch downloading of watermark-free works files!

📜 Script Instructions

The automatic page scroll feature has been refactored and is turned off by default! Enabling this feature may be detected as automated behavior by Xiaohongshu, potentially resulting in account risk control or banning.

💻 Secondary Development

If you have other needs, you can perform code calls or modifications based on the comments in example.py!

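from source import XHS  # import path assumed from the project layout; see example.py


async def example():
    """Set parameters through code; suitable for secondary development."""
    # Example link
    demo_link = "https://www.xiaohongshu.com/explore/XXX?xsec_token=XXX"

    # Instance parameters
    work_path = "D:\\"  # Root path for saving works data/files; default: project root path
    folder_name = "Download"  # Name of the folder for works files (created automatically); default: Download
    name_format = "作品标题 作品描述"  # File name format: works title, works description (field names must be Chinese)
    user_agent = ""  # User-Agent
    cookie = ""  # RedNote web version cookie; login not required; optional; login state affects data collection
    proxy = None  # Network proxy
    timeout = 5  # Request timeout limit, in seconds; default: 10
    chunk = 1024 * 1024 * 10  # Size of each data chunk fetched from the server when downloading, in bytes
    max_retry = 2  # Maximum number of retries when a request fails; default: 5
    record_data = False  # Whether to save works data to a file
    image_format = "WEBP"  # Download format for image works files; supported: AUTO, PNG, WEBP, JPEG, HEIC
    folder_mode = False  # Whether to store the files of each work in a separate folder
    image_download = True  # Switch for downloading image works files
    video_download = True  # Switch for downloading video works files
    live_download = False  # Switch for downloading animated image files
    download_record = True  # Whether to record the IDs of successfully downloaded works
    language = "zh_CN"  # Program prompt language
    account_archive = True  # Whether to save each author's works into a separate folder
    read_cookie = None  # Read browser cookies: browser name (str) or browser index (int); None disables reading

    # async with XHS() as xhs:
    #     pass  # Use default parameters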

    async with XHS(
        work_path=work_path,
        folder_name=folder_name,
        name_format=name_format,
        user_agent=user_agent,
        cookie=cookie,
        proxy=proxy,
        timeout=timeout,
        chunk=chunk,
        max_retry=max_retry,
        record_data=record_data,
        image_format=image_format,
        folder_mode=folder_mode,
        image_download=image_download,
        video_download=video_download,
        live_download=live_download,
        download_record=download_record,
        language=language,
        read_cookie=read_cookie,
        account_archive=account_archive,
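    ) as xhs:  # Use custom parameters
        download = True  # Whether to download the works files; default: False
        # Returns detailed works information, including download addresses
        # Returns an empty dict when fetching data fails
        print(await xhs.extract(demo_link, download, index=[1, 2]))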

📋 Read Clipboard

The project uses pyperclip to implement clipboard reading functionality, which varies across different systems.

On Windows, no additional modules are needed.

On Mac, this module makes use of the pbcopy and pbpaste commands, which should come with the OS.

On Linux, this module makes use of the xclip or xsel commands, which should come with the OS. Otherwise run "sudo apt-get install xclip" or "sudo apt-get install xsel" (Note: xsel does not always seem to work.)

Otherwise on Linux, you will need the qtpy or PyQt5 modules installed.
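The platform differences above can be summarized in a short sketch. `clipboard_backend` is a hypothetical helper written for illustration; it is a simplification and not pyperclip's actual detection logic. It reports which external command the notes above would expect on a given platform.

```python
import shutil
import sys
from typing import Callable, Optional


def clipboard_backend(
    platform: str = sys.platform,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the clipboard helper expected on this platform, or None if none is found."""
    if platform.startswith("win"):
        return "builtin"  # Windows needs no extra module or command
    if platform == "darwin":
        return "pbcopy/pbpaste" if which("pbpaste") else None
    # Linux and similar systems: prefer xclip, then xsel
    for command in ("xclip", "xsel"):
        if which(command):
            return command
    return None
```

A `None` result on Linux corresponds to the last case above, where the qtpy or PyQt5 modules would be needed instead.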

⚙️ Configuration File

The settings.json file in the root directory of the project is automatically generated on the first run and allows customization of some runtime parameters.

If invalid parameter values are set, the program will use the default values!

| Parameter | Type | Description | Default Value |
|---|---|---|---|
| work_path | str | Root path for saving works data/files | Project root path |
| folder_name | str | Name of the folder for storing works files | Download |
| name_format | str | Format of works file names, with fields separated by spaces; supported fields: 收藏数量 (favorites count), 评论数量 (comments count), 分享数量 (shares count), 点赞数量 (likes count), 作品标签 (works tags), 作品ID (works ID), 作品标题 (works title), 作品描述 (works description), 作品类型 (works type), 发布时间 (publish time), 最后更新时间 (last update time), 作者昵称 (author nickname), 作者ID (author ID) | 发布时间 作者昵称 作品标题 |
| user_agent | str | Browser User Agent | Built-in Chrome User Agent |
| cookie | str | RedNote web version cookie; login is not required; non-essential parameter | None |
| proxy | str | Proxy used by the program | null |
| timeout | int | Request data timeout limit, in seconds | 10 |
| chunk | int | Size of the data chunk fetched from the server each time when downloading files, in bytes | 2097152 (2 MB) |
| max_retry | int | Maximum number of retries when requesting data fails | 5 |
| record_data | bool | Whether to save works data to a file; saved in SQLite format | false |
| image_format | str | Download format for image works files; supported: AUTO, PNG, WEBP, JPEG, HEIC. Some works do not have files in HEIC format, and the downloaded files may be in WEBP format. When set to AUTO, the format is dynamic and depends on the server's response data | PNG |
| image_download | bool | Switch for downloading image works files | true |
| video_download | bool | Switch for downloading video works files | true |
| live_download | bool | Switch for downloading animated image files | false |
| folder_mode | bool | Whether to store the files of each work in a separate folder; the folder name matches the file name | false |
| download_record | bool | Whether to record the IDs of successfully downloaded works; if enabled, the program automatically skips downloading works that have records | true |
| account_archive | bool | Whether to save each author's works into a separate folder; the folder name is authorID_nickname | false |
| language | str | Program language; currently supported: zh_CN, en_US | zh_CN |

name_format notes: field names currently support only Chinese values.
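As an illustration of a space-separated name_format, the sketch below joins the values of the listed fields in order. `build_file_name` is a hypothetical helper, the `_` separator is an assumption, and the sample data is invented; none of these are confirmed by the project.

```python
def build_file_name(name_format: str, work: dict, separator: str = "_") -> str:
    """Join the values of the fields listed in name_format, in the given order."""
    fields = name_format.split()
    # reject field names that are not present in the works data
    missing = [field for field in fields if field not in work]
    if missing:
        raise KeyError(f"unknown fields: {missing}")
    return separator.join(str(work[field]) for field in fields)


# Hypothetical works data keyed by the Chinese field names
work = {"发布时间": "2024-01-01", "作者昵称": "Alice", "作品标题": "Demo"}
name = build_file_name("发布时间 作者昵称 作品标题", work)
```

Reordering or removing fields in the format string changes the resulting name accordingly, which is the behavior the name_format parameter describes.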

Additional note: the user_agent value provided is only an example for reference; it is strongly recommended to set it according to your actual browser information!
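The rule that invalid parameter values fall back to defaults can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: `load_settings` is a hypothetical function, only a few parameters from the table are covered, and the validity checks (matching type, positive integers, known image formats) are assumptions about what "invalid" means.

```python
import json

# Defaults copied from the configuration table (subset only)
DEFAULTS = {
    "timeout": 10,
    "chunk": 2097152,
    "max_retry": 5,
    "image_format": "PNG",
    "folder_mode": False,
}
IMAGE_FORMATS = {"AUTO", "PNG", "WEBP", "JPEG", "HEIC"}


def load_settings(text: str) -> dict:
    """Parse a settings JSON object and replace invalid values with defaults."""
    raw = json.loads(text)
    result = dict(DEFAULTS)
    for key, default in DEFAULTS.items():
        value = raw.get(key, default)
        # bool is a subclass of int, so compare exact types
        if type(value) is not type(default):
            continue
        if key == "image_format" and value.upper() not in IMAGE_FORMATS:
            continue
        if isinstance(value, int) and not isinstance(value, bool) and value <= 0:
            continue
        result[key] = value
    return result
```

For example, a file containing `{"timeout": "fast", "max_retry": 3}` would keep `max_retry` at 3 while `timeout` reverts to 10.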

🌐 Cookie

Starting from version 2.2, if there are no abnormalities in project functionality, there is no need to handle cookies separately!

  1. Open the browser (optional: start in incognito mode) and visit https://www.xiaohongshu.com/explore
  2. Log in to your RedNote account (can be skipped)
  3. Press F12 to open the developer tools
  4. Select the Network tab
  5. Check Preserve log
  6. In the Filter input box, enter cookie-name:web_session
  7. Select the Fetch/XHR filter
  8. Click on any piece of works on the RedNote page
  9. In the Network tab, select any data packet (if no packets appear, repeat step 7)
  10. Copy and paste the entire Cookie into the program or configuration file
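Before pasting, it can help to confirm that the copied string actually contains the web_session entry that the filter in step 6 targets. The sketch below is a hypothetical check, not part of the project; whether the program strictly requires web_session is an assumption.

```python
def parse_cookie(cookie: str) -> dict:
    """Split a Cookie header value into a name-to-value mapping."""
    pairs = {}
    for part in cookie.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            pairs[name] = value
    return pairs


def has_web_session(cookie: str) -> bool:
    """Return True when the cookie string carries a non-empty web_session value."""
    return bool(parse_cookie(cookie).get("web_session"))
```

A string copied from the wrong request would typically fail this check, which suggests repeating the capture steps.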

🗳 Download Records

XHS-Downloader will store the IDs of downloaded works in a database. When downloading the same works again, XHS-Downloader will automatically skip the file download (even if the works file does not exist). If you want to re-download the works file, please delete the corresponding works ID from the database and then use XHS-Downloader to download the works file again!

This feature is enabled by default. If it is turned off, XHS-Downloader will check if the file exists. If the file exists, it will skip the download!
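The two skip rules described above can be sketched with an in-memory SQLite database. The table name `download_record`, the `should_skip` function, and the sample ID are hypothetical and chosen for illustration; the project's actual schema may differ.

```python
import sqlite3
from pathlib import Path


def should_skip(conn: sqlite3.Connection, work_id: str, file_path: Path, download_record: bool) -> bool:
    """Decide whether a download is skipped under the documented rules."""
    if download_record:
        # Record enabled: skip when the ID is stored, even if the file is gone
        row = conn.execute(
            "SELECT 1 FROM download_record WHERE id = ?", (work_id,)
        ).fetchone()
        return row is not None
    # Record disabled: skip only when the file already exists
    return file_path.exists()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE download_record (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO download_record VALUES (?)", ("abc123",))
```

Deleting the row for a works ID makes `should_skip` return `False` again, which mirrors the re-download procedure described above.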

# 📦 Executable File Build Guide

This guide will walk you through forking this repository and running GitHub Actions to automatically build and package the program from the latest source code!

---

## Steps to Use

### 1. Fork the Repository

1. Click the **Fork** button at the top right of the project repository to fork it to your personal GitHub account
2. Your forked repository address will look like this: `https://github.com/your-username/this-repo`

---

### 2. Enable GitHub Actions

1. Go to the page of your forked repository
2. Click the **Settings** tab at the top
3. Click **Actions** in the left sidebar
4. Click the **General** option
5. Under **Actions permissions**, select **Allow all actions and reusable workflows** and click the **Save** button

---

### 3. Manually Trigger the Build Process

1. In your forked repository, click the **Actions** tab at the top
2. Find the workflow named **构建可执行文件** (Build Executable File)
3. Click the **Run workflow** button on the right:
   - Select the **master** or **develop** branch
   - Click **Run workflow**

---

### 4. Check the Build Progress

1. On the **Actions** page, you can see the execution records of the triggered workflow
2. Click a run record to view the detailed logs and check the build progress and status

---

### 5. Download the Build Result

1. Once the build is complete, go to the corresponding run record page
2. In the **Artifacts** section at the bottom of the page, you will see the built result file
3. Click to download it and save it to your local machine to get the built program

---

## Notes

1. **Resource Usage**:
   - GitHub provides free build environments for Actions, with a monthly usage limit (2000 minutes) for free-tier users
2. **Code Modifications**:
   - You are free to modify the code in your forked repository to customize the build process
   - After making changes, you can trigger the build process again to get your customized version
3. **Stay in Sync with the Main Repository**:
   - If the main repository is updated with new code or workflows, it is recommended that you periodically sync your forked repository to get the latest features and fixes

---

## Frequently Asked Questions

### Q1: Why can't I trigger the workflow?

A: Please ensure that you have followed the steps to **Enable Actions**. Otherwise, GitHub will prevent the workflow from running

### Q2: What should I do if the build process fails?

A:

- Check the run logs to understand the cause of the failure
- Ensure there are no syntax errors or dependency issues in the code
- If the problem persists, please open an issue on the [Issues page](https://github.com/JoeanAmier/XHS-Downloader/issues)

### Q3: Can I directly use the Actions from the main repository?

A: Due to permission restrictions, you cannot directly trigger Actions from the main repository. Please use the forked repository to execute the build process

♥️ Support the Project

If XHS-Downloader has been helpful to you, please consider giving it a Star ⭐. Thank you for your support!

Sponsorship QR codes: WeChat, Alipay

If you are willing, you may consider making a donation to provide additional support for XHS-Downloader!

🌟 Contribution Guidelines

Contributions to this project are welcome! To keep the codebase clean, efficient, and easy to maintain, please read the following guidelines carefully to ensure that your contributions can be accepted and integrated smoothly.

Reference materials:

✉️ Contact the Author

Other Open Source Projects by the Author:

💰 Sponsor

JetBrains supports active projects recognized within the global open-source community with complimentary licenses for non-commercial development.

⚠️ Disclaimer

Before using the code and functionalities of this project, please carefully consider and accept the above disclaimer. If you have any questions or disagree with the statement, please do not use the code and functionalities of this project. If you use the code and functionalities of this project, it is considered that you fully understand and accept the above disclaimer, and willingly assume all risks and consequences associated with the use of this project.

# 💡 Project References

* https://github.com/encode/httpx/
* https://github.com/tiangolo/fastapi
* https://github.com/textualize/textual/
* https://github.com/omnilib/aiosqlite
* https://github.com/thewh1teagle/rookie
* https://github.com/carpedm20/emoji/
* https://github.com/asweigart/pyperclip
* https://github.com/lxml/lxml
* https://github.com/yaml/pyyaml
* https://github.com/pallets/click/
* https://github.com/encode/uvicorn
* https://github.com/Tinche/aiofiles