跳转至

版本说明书

版本配套说明

产品版本信息

产品名称

MindIE LLM

产品版本

3.0.0

版本类型

正式版本

维护周期

三个月

相关产品版本配套说明

产品名称

版本

CANN

8.5.1

Ascend Extension for PyTorch

7.3.0

Ascend HDK

版本配套关系参见CANN版本配套说明
说明:CANN 8.5.1和CANN 8.5.0配套的HDK版本一致。

版本兼容性说明

各组件需要配套使用,请勿跨版本混用。

表 1 软件版本兼容性说明

MindIE CANN MindCluster Ascend Extension for PyTorch CCAE
3.0.0 8.5.1 7.3.0 7.3.0 iMaster CCAE V100R026C00SPC010

版本使用注意事项

无。

3.0.0 更新说明

新增特性

编号

特性

具体内容

1

功能

  • 新增`response_format`结构化输出能力,支持请求参数response_format,可选类型为json_object/json_schema。#Issue257
  • 推理请求input size 超长时,支持从左边截断,保留最新的历史上下文。#Issue252
  • PD分离场景支持chat接口/completions接口使用停用词功能,支持请求参数stop/stop_token_ids/include_stop_str_in_output。#Issue226
  • 支持按照请求级维度指定chat template。#Issue250
  • 支持enable_thinking并设置thinking_budget。#Issue240
  • 大EP场景P节点支持chunk prefill与Prefix cache、MTP、Tool calls特性叠加。#Issue205

2

性能提升

  • 新增支持SLO感知调度优化特性,基于设定TPOT约束动态调整BatchSize。#Issue253

3

可靠性

  • 支持虚推异常故障恢复。#Issue122
  • 支持显存异常预警。#Issue165
  • 新增服务侧`/metrics`长序列分档与prefix cache命中统计指标,增强可观测性与运行状态分析能力。#Issue189

4

易用性

修改特性

编号

特性

详细

1

功能优化

  • BeamSearch规格优化,支持maxBeamWidth=8192 maxBatchSize=819200。
  • 请求输出结果默认跳过special_tokens,即`skip_special_token`默认值为`True`,与资料对齐。#Issue288

删除特性

编号

特性

详细

1

推理后端

  • MindIE LLM仓已删除适配MindSpore后端的相关代码和资料文档。#Issue291

日落特性

说明:以下特性将于一年后日落。

编号

特性

详细

1

部署形态

  • MindIE LLM将不再支持基于run包的部署方式。Torch 2.1.0版本将随run包日落。abi0软件包将随run包日落。

2

量化

  • W4A16量化特性。

3

测试脚本

  • ModelTest和Benchmark工具将日落,归一至AISBench工具。

4

模型

  • 以下模型将日落:MiniCPM-1B、MiniCPM-2B、MiniCPM3-4B、Starcoder2、StableLM、yizhao、chatglm2-6b、chatglm3-6b、chatglm3-6b-32K、Qwen系列、Qwen1.5系列、Qwen2系列、Qwen2 Coder、Hunyuan、Skywork、DBRX、grok-1、Llama3.2、llava-1.5、llava-1.6、Yi-VL、VITA-1.5

接口变更说明

本章节的接口变更说明包括新增、修改、废弃和删除。接口变更只体现代码层面的修改,不包含文档本身在语言、格式、链接等方面的优化改进。

  • 新增:表示此次版本新增的接口。
  • 修改:表示本接口相比于上个版本有修改。
  • 废弃:表示该接口自作出废弃声明的版本起停止演进,且在声明一年后可能被移除。
  • 删除:表示该接口在此次版本被移除。

类名/API原型/配置项

变更类别

变更说明

  • `POST /v1/chat/completions`
  • `POST /v1/completions`
  • `GET /metrics`

新增/修改

  • 新增`response_format`结构化输出参数。#Issue358
  • 新增`chat_template`动态指定。#Issue250
  • function call场景允许`role=tool`消息`content`为空。#Issue184
  • `/metrics`补充长序列与prefix cache相关指标,返回头为`text/plain; charset=utf-8`。#Issue189
  • 模型配置

修改

  • Torch-Like组图日落,`graph: "python"`不再生效,将自动回退至CPP组图。#Issue212
  • 环境变量

删除

  • 旧版日志环境变量删除,详情见#PR98
  • `MINDIE_LLM_FRAMEWORK_BACKEND`环境变量随MS后端删除。#Issue291

已解决的问题

序号

问题详情

1

多模态超长请求被服务拦截后,未正确释放共享内存。

2

文件缺失报错时输出文件路径。

3

资料更新`prefillPolicyType`和`decodePolicyType`仅支持配置未0。

遗留问题

序号

遗留问题

规避手段

1

缩P保D,D实例强制上下电,P实例卡上进程未能及时退出,导致缩P保D不符合预期,恢复时长为10分钟。

不涉及。

2

lite-moe 910B3,pro-moe 910B4评测任务接口调用失败,模型日志有报错,服务未重启。

该场景下主动拒绝服务,增加错误日志,用户可观测。

3

beamsearch场景性能在ubuntu镜像和openeuler镜像存在差异,E2E时延在openeuler上出现劣化。

OS差异导致,beamsearch场景推荐使用ubuntu镜像。

4

DeepSeekV3.1模型,A2大EP定长输入场景下,部分request rate > 0的性能测试场景TTFT出现劣化

1) 下调request rate 5%,可复现原有TTFT; 2) 定长输入场景修改大EP启动脚本中`conf/ms_corrdinator.json`中`select_type`字段为`1`,关闭dp负载均衡,可复现原有TTFT。

升级影响

升级过程中对现行系统的影响

  • 对业务的影响 软件版本升级过程中会导致业务中断。

  • 对网络通信的影响 对网络通信无影响。

升级后对现行系统的影响

漏洞修补列表

软件名称 软件版本 CVE编号 实际CVSS得分 漏洞描述 解决版本
Transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14921 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.The specific flaw exists within the parsing of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14924 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.The specific flaw exists within the parsing of checkpoints. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current process. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14930 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.The specific flaw exists within the parsing of weights. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current process. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14920 0 A vulnerability, which was classified as critical, has been found in Hugging Face transformers (affected version not known).Using CWE to declare the problem leads to CWE-502. The product deserializes untrusted data without sufficiently verifying that the resulting data will be valid.Impacted is confidentiality, integrity, and availability.There is no information about possible countermeasures known. It may be suggested to replace the affected object with an alternative product. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14929 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.The specific flaw exists within the parsing of checkpoints. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current process. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14926 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must convert a malicious checkpoint.The specific flaw exists within the convert_config function. The issue results from the lack of proper validation of a user-supplied string before using it to execute Python code. An attacker can leverage this vulnerability to execute code in the context of the current user. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14927 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must convert a malicious checkpoint.The specific flaw exists within the convert_config function. The issue results from the lack of proper validation of a user-supplied string before using it to execute Python code. An attacker can leverage this vulnerability to execute code in the context of the current user. MindIE 3.0.0
transformers 4.30.2,4.33.0,4.33.1,4.34.1,4.35.0,4.36.0,4.36.2,4.37.0,4.37.1,4.37.2,4.38.2,4.39.0,4.40.0,4.42.0,4.42.4,4.43.1,4.43.2,4.44.0,4.46.2,4.49.0,4.51.0 CVE-2025-14928 0 This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must convert a malicious checkpoint.The specific flaw exists within the convert_config function. The issue results from the lack of proper validation of a user-supplied string before using it to execute Python code. An attacker can leverage this vulnerability to execute code in the context of the current user. MindIE 3.0.0
jinja2 3.1.3,3.1.4 CVE-2024-56201 5.4 Jinja is an extensible templating engine. In versions on the 3.x branch prior to 3.1.5, a bug in the Jinja compiler allows an attacker that controls both the content and filename of a template to execute arbitrary Python code, regardless of if Jinja_x27;s sandbox is used. To exploit the vulnerability, an attacker needs to control both the filename and the contents of a template. Whether that is the case depends on the type of application using Jinja. This vulnerability impacts users of applications which execute untrusted templates where the template author can also choose the template filename. This vulnerability is fixed in 3.1.5. MindIE 3.0.0
jinja2 3.1.3,3.1.4 CVE-2024-56326 5.4 Jinja is an extensible templating engine. Prior to 3.1.5, An oversight in how the Jinja sandboxed environment detects calls to str.format allows an attacker that controls the content of a template to execute arbitrary Python code. To exploit the vulnerability, an attacker needs to control the content of a template. Whether that is the case depends on the type of application using Jinja. This vulnerability impacts users of applications which execute untrusted templates. Jinja_x27;s sandbox does catch calls to str.format and ensures they don_x27;t escape the sandbox. However, it_x27;s possible to store a reference to a malicious string_x27;s format method, then pass that to a filter that calls it. No such filters are built-in to Jinja, but could be present through custom filters in an application. After the fix, such indirect calls are also handled by the sandbox. This vulnerability is fixed in 3.1.5. MindIE 3.0.0
Angular 0.1.2 CVE-2025-66035 0 Angular is a development platform for building mobile and desktop web applications using TypeScript/JavaScript and other languages. Prior to versions 19.2.16, 20.3.14, and 21.0.1, there is a XSRF token leakage via protocol-relative URLs in angular HTTP clients. The vulnerability is a Credential Leak by App Logic that leads to the unauthorized disclosure of the Cross-Site Request Forgery (XSRF) token to an attacker-controlled domain. Angular_x27;s HttpClient has a built-in XSRF protection mechanism that works by checking if a request URL starts with a protocol (http:// or https://) to determine if it is cross-origin. If the URL starts with protocol-relative URL (//), it is incorrectly treated as a same-origin request, and the XSRF token is automatically added to the X-XSRF-TOKEN header. This issue has been patched in versions 19.2.16, 20.3.14, and 21.0.1. A workaround for this issue involves avoiding using protocol-relative URLs (URLs starting with //) in HttpClient requests. All backend communication URLs should be hardcoded as relative paths (starting with a single /) or fully qualified, trusted absolute URLs. MindIE 3.0.0
torch 2.1.0 CVE-2025-32434 0 PyTorch is a Python package that provides tensor computation with strong GPU acceleration and deep neural networks built on a tape-based autograd system. In version 2.5.1 and prior, a Remote Command Execution (RCE) vulnerability exists in PyTorch when loading a model using torch.load with weights_only=True. This issue has been patched in version 2.6.0. MindIE 3.0.0
torch 2.1.0 CVE-2025-3730 0 A vulnerability, which was classified as problematic, was found in PyTorch 2.6.0. Affected is the function torch.nn.functional.ctc_loss of the file aten/src/ATen/native/LossCTC.cpp. The manipulation leads to denial of service. An attack has to be approached locally. The exploit has been disclosed to the public and may be used. The real existence of this vulnerability is still doubted at the moment. The name of the patch is 46fc5d8e360127361211cb237d5f9eef0223e567. It is recommended to apply a patch to fix this issue. The security policy of the project warns to use unknown models which might establish malicious effects. MindIE 3.0.0
requests 2.31.0 CVE-2023-32681 6.1 Requests is a HTTP library. Since Requests 2.3.0, Requests has been leaking Proxy-Authorization headers to destination servers when redirected to an HTTPS endpoint. This is a product of how we use rebuild_proxies to reattach the Proxy-Authorization header to requests. For HTTP connections sent through the tunnel, the proxy will identify the header in the request itself and remove it prior to forwarding to the destination server. However when sent over HTTPS, the Proxy-Authorization header must be sent in the CONNECT request as the proxy has no visibility into the tunneled request. This results in Requests forwarding proxy credentials to the destination server unintentionally, allowing a malicious actor to potentially exfiltrate sensitive information. This issue has been patched in version 2.31.0. MindIE 3.0.0
nestjs 0.1.2 CVE-2024-29409 0 File Upload vulnerability in nestjs nest v.10.3.2 allows a remote attacker to execute arbitrary code via the Content-Type header. MindIE 3.0.0
torch 2.1.0 CVE-2024-31580 0 PyTorch before v2.2.0 was discovered to contain a heap buffer overflow vulnerability in the component /runtime/vararg_functions.cpp. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted input. MindIE 3.0.0
torch 2.1.0 CVE-2024-31583 0 Pytorch before version v2.2.0 was discovered to contain a use-after-free vulnerability in torch/csrc/jit/mobile/interpreter.cpp. MindIE 3.0.0
filelock 3.13.2 CVE-2025-68146 6.3 filelock is a platform-independent file lock for Python. In versions prior to 3.20.1, a Time-of-Check-Time-of-Use (TOCTOU) race condition allows local attackers to corrupt or truncate arbitrary user files through symlink attacks. The vulnerability exists in both Unix and Windows lock file creation where filelock checks if a file exists before opening it with O_TRUNC. An attacker can create a symlink pointing to a victim file in the time gap between the check and open, causing os.open() to follow the symlink and truncate the target file. All users of filelock on Unix, Linux, macOS, and Windows systems are impacted. The vulnerability cascades to dependent libraries. The attack requires local filesystem access and ability to create symlinks (standard user permissions on Unix; Developer Mode on Windows 10+). Exploitation succeeds within 1-3 attempts when lock file paths are predictable. The issue is fixed in version 3.20.1. If immediate upgrade is not possible, use SoftFileLock instead of UnixFileLock/WindowsFileLock (note: different locking semantics, may not be suitable for all use cases); ensure lock file directories have restrictive permissions (chmod 0700) to prevent untrusted users from creating symlinks; and/or monitor lock file directories for suspicious symlinks before running trusted applications. These workarounds provide only partial mitigation. The race condition remains exploitable. Upgrading to version 3.20.1 is strongly recommended. MindIE 3.0.0
transformers 4.49.0 CVE-2025-3264 0 A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically in the get_imports() function within dynamic_module_utils.py. This vulnerability affects versions 4.49.0 and is fixed in version 4.51.0. The issue arises from a regular expression pattern \stry\s:.?except.?: used to filter out try/except blocks from Python code, which can be exploited to cause excessive CPU consumption through crafted input strings due to catastrophic backtracking. This vulnerability can lead to remote code loading disruption, resource exhaustion in model serving, supply chain attack vectors, and development pipeline disruption. MindIE 3.0.0
ipaddress 1.0.23 CVE-2021-29921 0 In Python before 3,9,5, the ipaddress library mishandles leading zero characters in the octets of an IP address string. This (in some situations) allows attackers to bypass access control that is based on IP addresses. MindIE 3.0.0
ipaddress 1.0.23 CVE-2020-14422 0 Lib/ipaddress.py in Python through 3.8.3 improperly computes hash values in the IPv4Interface and IPv6Interface classes, which might allow a remote attacker to cause a denial of service if an application is affected by the performance of a dictionary containing IPv4Interface or IPv6Interface objects, and this attacker can cause many dictionary entries to be created. This is fixed in: v3.5.10, v3.5.10rc1; v3.6.12; v3.7.9; v3.8.4, v3.8.4rc1, v3.8.5, v3.8.6, v3.8.6rc1; v3.9.0, v3.9.0b4, v3.9.0b5, v3.9.0rc1, v3.9.0rc2. MindIE 3.0.0

注:实际CVSS得分为0,即产品无实际漏洞攻击场景,不受漏洞影响(代码未编译、代码无调用、编译选项保护等)。