
Tool Call (Function Call)

Note

Native Meta-Llama models do not support the AFS Tool Call described below. To use tool calls with a native Meta-Llama model, refer to Meta Llama tool use support.

Info

Models that support Tool call (new format)

  • Llama3.3-FFM-70B
  • Llama3.1-FFM (8B, 70B)
  • Llama3-FFM (8B, 70B)
  • FFM-Llama2-v2 (7B, 13B, 70B)

Models that support Function call (old format)

  • FFM-Mixtral (8x7B)
  • FFM-Mistral (7B)
  • FFM-Llama2-v2 (7B, 13B, 70B)

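Meta has released Llama 3.3, its new-generation large language model. After tuning by the TWSC technical team, the 70-billion-parameter version can be used together with functions, with the model deciding whether to call one. If a request includes one or more functions, the model decides from the context of the prompt whether a function needs to be called. When the model determines that a function should be used, it outputs the arguments for that function as structured data (JSON).

Based on the functions provided, the model parses the user's intent and outputs the matching API and structured data. Note that the model only selects the applicable function; it does not execute it. The actual function call is controlled by business logic implemented on the application side.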

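Using functions involves three steps:

  1. Call the FFM Conversation API with the function definitions and the user's question to obtain the function call information.
  2. Use the function information output by the model to call the corresponding API or function and obtain the execution result.
  3. Call the FFM Conversation API again, passing the execution result from step 2 to the model inference service to obtain a summary.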
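
The three steps above can be sketched as a single function. This is a minimal illustration, not the service's client library: run_function_flow is a hypothetical helper, and call_model stands for whatever code sends the message list to the Conversation API and returns the parsed JSON response. The response field names (tool_calls, generated_text) follow the response examples later on this page.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_function_flow(
    call_model: Callable[[List[Message]], Dict[str, Any]],
    functions: Dict[str, Callable[..., Any]],
    question: str,
) -> str:
    """Run the three steps once and return the model's final text."""
    messages: List[Message] = [{"role": "user", "content": question}]

    # Step 1: ask the model; it may answer with tool_calls instead of text.
    first = call_model(messages)
    tool_calls = first.get("tool_calls") or []
    if not tool_calls:
        return first.get("generated_text", "")

    # Step 2: the application, not the model, executes each selected function.
    messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
    for call in tool_calls:
        name = call["function"]["name"]
        # In the new format, arguments is a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        result = functions[name](**args)
        messages.append(
            {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
        )

    # Step 3: send the results back so the model can summarize them.
    return call_model(messages).get("generated_text", "")
```

Passing call_model in as a parameter keeps the control flow separate from the HTTP details, so the same loop can be exercised with a stub before it is pointed at a real endpoint.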
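Info

Parallel function calling (currently supported by Llama3.3-FFM-70B and Llama3.1-FFM-70B/8B): parallel function calls let the model output multiple function calls in one response, so they can be executed and their results retrieved in parallel. This reduces the number of API calls and improves overall performance.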
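
When a response carries several function calls, the application can run them concurrently. The sketch below is one possible approach using a thread pool from the Python standard library. execute_tool_calls is a hypothetical helper, and it assumes each entry of tool_calls has its own id and a JSON-encoded arguments string, as in the response examples on this page.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def execute_tool_calls(
    tool_calls: List[Dict[str, Any]],
    functions: Dict[str, Callable[..., Any]],
    max_workers: int = 4,
) -> List[Dict[str, str]]:
    """Run every tool call concurrently; return role=tool messages in call order."""

    def run_one(call: Dict[str, Any]) -> Dict[str, str]:
        func = functions[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        return {
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args)),
        }

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results line up with tool_calls.
        return list(pool.map(run_one, tool_calls))
```

Threads suit functions that wait on network or disk I/O; for CPU-bound functions a process pool may be a better fit.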


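Usage

The Conversation API provides a more complete format for Function Calling. The functions field in the old-format request body and the function_call field in the response will be deprecated in the future; the following sections describe how to use the new format.

In an API call, you can describe multiple functions for the model to choose from. The model outputs a JSON object containing the name and arguments of the selected function, which your application or agent program can use to invoke that function. The Conversation API does not invoke the function itself; it generates JSON that you can use to call the function in your own code.

Info

LLMBackend is currently backward compatible. With either FFM-Llama2 or Llama3-FFM models, a request in the old function call format returns an old-format function call response, and a request in the new format returns a new-format response.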


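Step 1: Pass function calling information through the tools parameter of the Conversation API

The model uses the tools parameter of the Conversation API to select the appropriate function and parse the corresponding arguments.

  • The tools parameter supports only the new Tool Call format.
  • The tools parameter is an array whose items are mainly JSON Schema descriptions of functions. Each item has two required fields:
    • type
    • function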
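| Field | Type | Required | Description |
|---|---|---|---|
| tools | array | Optional | List of functions in JSON format. |

Properties of each item in tools

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Required | Only function is currently supported. |
| function | object | Required | Function properties. |
| function.name | string | Required | Function name. May contain only a-z, A-Z, 0-9, underscores (_), and hyphens (-). |
| function.description | string | Optional | Description of what the function does. The model uses it to decide when to call the function. |
| function.parameters | object | Optional | Input parameters of the function, described with JSON Schema. See the JSON Schema reference for usage. |
| function.parameters.required | array of strings | Optional | Names of the parameters that must be provided when the function is called. |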
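
To make the field layout concrete, here is a small sketch that assembles one tools item and checks the function name against the allowed characters listed above. make_tool is a hypothetical helper written for illustration; the checks it performs are a client-side convenience and are not claimed to match the service's own validation.

```python
import re
from typing import Any, Dict, List, Optional

# Allowed characters for a function name: a-z, A-Z, 0-9, underscore, hyphen.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def make_tool(
    name: str,
    description: str = "",
    properties: Optional[Dict[str, Any]] = None,
    required: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build one item of the tools array."""
    if not _NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    properties = properties or {}
    required = required or []
    # Every required parameter should also be declared under properties.
    missing = [p for p in required if p not in properties]
    if missing:
        raise ValueError(f"required parameters not declared: {missing}")
    return {
        "type": "function",  # only "function" is currently supported
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }
```

For example, make_tool("get_current_weather", "Get the current weather in a given location", {"location": {"type": "string"}}, ["location"]) produces a dictionary that can be placed directly in the tools array of a request body.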

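  • The tool_choice parameter is a string or an object. It is optional and controls how function calling is applied. When functions are provided, it defaults to "auto"; when no functions are provided, it defaults to "none".

    • "none": no function call is made; the model generates text.

    • "auto": the model decides whether to output a function call or generate text.

      • In this mode, use the finish_reason field of the response to tell the two apart: "finish_reason": "tool_calls" indicates a function call, and any other value indicates text generation.
    • {"type": "function", "function": {"name": "my_function"}}: forces a call to the specified function.

      • In this mode, because a function call has been explicitly requested, finish_reason is an ordinary value such as eos_token rather than "tool_calls". The application must parse the response content itself to identify the call.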


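| Field | Type | Required | Description |
|---|---|---|---|
| tool_choice | string or object | Optional | Controls how function calling is applied. |

Possible types

string

- none: no function call is made; the output is ordinary generated text.
- auto: the model decides whether to output a function call or generate text.

object

- {"type": "function", "function": {"name": "my_function"}}: specifies a function and forces the model to output a call to it.

Properties

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Required | Only function is currently supported. |
| function | object | Required | Function properties. |
| function.name | string | Required | Function name. |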
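
Because finish_reason behaves differently depending on tool_choice, an application needs a small amount of branching to decide how to treat a response. The following is a sketch of that decision under the rules described above; is_tool_call is a hypothetical helper, and checking for a non-empty tool_calls field in forced mode is an assumption based on the response examples on this page.

```python
from typing import Any, Dict, Union

ToolChoice = Union[str, Dict[str, Any], None]


def is_tool_call(response: Dict[str, Any], tool_choice: ToolChoice = "auto") -> bool:
    """Return True when the response should be handled as a function call."""
    if tool_choice == "none":
        # Function calling is disabled; the output is generated text.
        return False
    if isinstance(tool_choice, dict):
        # Forced mode: finish_reason is e.g. "eos_token", so inspect the content.
        return bool(response.get("tool_calls"))
    # "auto" (or unset): finish_reason is the documented signal.
    return response.get("finish_reason") == "tool_calls"
```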


Request examples (without tool_choice)

Non-Streaming
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93
},
"stream": false
}'
Streaming
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93
},
"stream": true
}'
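
For readers who prefer Python to curl, the same request can be assembled with the standard library. This is a sketch rather than an official client: build_conversation_request is a hypothetical helper, the endpoint path, header names, and parameter names (including the frequence_penalty spelling) are copied from the curl examples above, and the function only builds the request without sending it.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_conversation_request(
    api_url: str,
    api_key: str,
    model: str,
    messages: List[Dict[str, Any]],
    tools: List[Dict[str, Any]],
    stream: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl examples."""
    payload = {
        "model": model,
        "messages": messages,
        "tools": tools,
        "parameters": {
            "max_new_tokens": 350,
            "frequence_penalty": 1,
            "temperature": 0.01,
            "top_k": 100,
            "top_p": 0.93,
        },
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"{api_url}/models/conversation",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "X-API-KEY": api_key,
            "content-type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the body with json.load. Building the payload with json.dumps also avoids the shell quoting needed to splice MODEL_NAME into the curl body.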

Request examples using tool_choice

Use auto
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93
},
"tool_choice": "auto",
"stream": false
}'
Use none
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93
},
"tool_choice": "none",
"stream": false
}'
Specify a function
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93
},
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather"
}
},
"stream": false
}'

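Step 2: Response to the function calling information passed through the tools parameter

The large language model returns the function call result.

| Field | Type |
|---|---|
| tool_calls | array |

Properties of each item in tool_calls

| Field | Type | Description |
|---|---|---|
| type | string | Only function is currently supported. |
| id | string | Identifier of the function call. |
| function | object | The function call content, including the function name and argument values. |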


Response examples

Response with Non-Streaming
{
"tool_calls": [
{
"type": "function",
"id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "1.17 sec",
"prompt_tokens": 141,
"generated_tokens": 43,
"total_tokens": 184,
"finish_reason": "tool_calls"
}
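
In the response above, arguments is a JSON-encoded string produced by the model, so an application may want to check it before executing anything. The sketch below assumes the tool definition sent in the request is still at hand. check_arguments is a hypothetical helper, and it covers only the required and enum keywords, not full JSON Schema validation.

```python
import json
from typing import Any, Dict, List


def check_arguments(call: Dict[str, Any], tool: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a tool call's arguments (empty if none)."""
    schema = tool["function"].get("parameters", {})
    try:
        args = json.loads(call["function"]["arguments"])
    except json.JSONDecodeError as exc:
        return [f"arguments is not valid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments is not a JSON object"]

    problems: List[str] = []
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            problems.append(f"unknown parameter: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems
```

An empty list means the call passed these checks; otherwise the application can decide whether to reject the call or ask the model again.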
Response with Streaming
data: {"generated_text": "", "tool_calls": [{"index": 0, "type": "function", "id": "call_afc9227158e6458798d789ab1f84c920", "function": {"name": "get_current_weather", "arguments": ""}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "{\""}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "location"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "Boston, MA"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "\","}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "unit"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "c"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "elsius"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": "\"}"}}], "details": null, "finish_reason": null}

data: {"generated_text": "", "tool_calls": [{"index": 0, "function": {"arguments": ""}}], "details": null, "total_time_taken": "1.17 sec", "prompt_tokens": 141, "generated_tokens": 43, "total_tokens": 184, "finish_reason": "tool_calls"}
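
In streaming mode the arguments string arrives in fragments, so the client has to reassemble it. The sketch below shows one way to do that, assuming each event is a line starting with data: followed by a JSON object, and that fragments for the same call share an index, as in the example above. collect_stream_tool_calls is a hypothetical helper; skipping payloads that are not valid JSON is a defensive assumption, not documented behavior.

```python
import json
from typing import Any, Dict, Iterable, List


def collect_stream_tool_calls(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Merge streamed tool_calls fragments into complete calls, keyed by index."""
    calls: Dict[int, Dict[str, Any]] = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators between events
        try:
            chunk = json.loads(line[len("data:"):])
        except json.JSONDecodeError:
            continue  # ignore any payload that is not a JSON object
        for part in chunk.get("tool_calls") or []:
            call = calls.setdefault(
                part["index"],
                {"type": "function", "id": None, "function": {"name": None, "arguments": ""}},
            )
            if "id" in part:
                call["id"] = part["id"]
            func = part.get("function", {})
            if "name" in func:
                call["function"]["name"] = func["name"]
            # Argument fragments are concatenated in arrival order.
            call["function"]["arguments"] += func.get("arguments", "")
    return [calls[i] for i in sorted(calls)]
```

Once the stream ends, each returned entry has the same shape as an item of tool_calls in the non-streaming response, so the rest of the application can treat both modes alike.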

Response with tool_choice by auto
{
"generated_text": "",
"tool_calls": [
{
"type": "function",
"id": "call_fe97cf6c20ae4b00b88b660b853d93d9",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "1.16 sec",
"prompt_tokens": 135,
"generated_tokens": 43,
"total_tokens": 178,
"finish_reason": "tool_calls"
}

Response with tool_choice by none
{
"generated_text": "As of my last update, the weather in Boston was quite chilly with temperatures around 40°F (4°C) and some light rain. However, it's always a good idea to check the latest weather forecast before heading out, as conditions can change quickly.",
"details": null,
"total_time_taken": "1.41 sec",
"prompt_tokens": 18,
"generated_tokens": 53,
"total_tokens": 71,
"finish_reason": "stop_sequence"
}
Response with tool_choice specifying a function
{
"tool_calls": [
{
"type": "function",
"id": "call_7JK8LIPTho7DffbvceTV5Oey",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "0.82 sec",
"prompt_tokens": 159,
"generated_tokens": 18,
"total_tokens": 177,
"finish_reason": "eos_token"
}

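Step 3: Call the FFM Conversation API again to summarize the execution result from step 2

The large language model presents the result of the function execution in an easy-to-understand form. This step is a multi-turn conversation: in addition to the previous conversation history, you must put the function's execution result in the content field of a message whose role is tool.

| Field | Value |
|---|---|
| role | tool |
| tool_call_id | The function call identifier referenced from tool_calls |
| content | The execution result of the function call |

Call the Weather API based on the step 2 result

Execute the function by calling the weather API with "Boston, MA" as the location argument and "celsius" as the unit argument.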

  • Example of Weather API Response
{  
"temperature": "22",
"unit": "celsius",
"description": "Sunny"
}

Usage example

Put the result obtained from the weather API call into the content field of the message whose role is tool.

Request
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\":\"Boston, MA\", \"unit\": \"celsius\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"content": "{\"location\": \"Boston, MA\", \"temperature\": \"22\", \"unit\": \"celsius\"}"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.5,
"top_k": 100,
"top_p": 0.93
}
}'
Response
{
"generated_text": "The current temperature in Boston, MA is 22 degrees Celsius.",
"details": null,
"total_time_taken": "0.43 sec",
"prompt_tokens": 250,
"generated_tokens": 14,
"total_tokens": 264,
"finish_reason": "stop_sequence"
}

Extra Parameters

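Note

This feature is supported only on the OpenAI-compatible Chat Completion API. The OpenAI Completions API and the AFS Conversation API do not support it yet.

enable_grammar: ensures the JSON format returned by a tool call.

| Field | Type | Required | Description |
|---|---|---|---|
| enable_grammar | boolean | Optional | Supported only on the OpenAI-compatible API and used together with tool call. Ensures that the JSON returned by a function call conforms to the JSON Schema. false: disabled (default). true: enabled. |

Usage examples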

Setting enable_grammar to true by curl
export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/chat/completions" \
-H "accept: application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "llama3.1-ffm-8b-32k-chat",
"messages": [
{
"role": "user",
"content": "Calculate the area of a circle with a radius of 5 units."
}
],
"tools": [
{
"type": "function",
"function": {
"name": "geometry_calculate_area_circle",
"description": "Calculate the area of a circle given its radius",
"parameters": {
"type": "object",
"properties": {
"radius": {
"type": "number",
"description": "The radius of the circle."
},
"unit": {
"type": "string",
"description": "The measurement unit of the radius (optional parameter, default is 'units')."
}
},
"required": [
"radius"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"stream": false,
"enable_grammar": true
}'
Setting enable_grammar to true by openai SDK
from openai import OpenAI

API_KEY = "{API_KEY}"
API_URL = "{API_URL}"

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    api_key=API_KEY,
    base_url=API_URL + "/models",
)

response = client.chat.completions.create(
    model="llama3.1-ffm-8b-32k-chat",
    messages=[{"role": "user", "content": "What is the weather like in Boston?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                        },
                    },
                    "required": ["location"],
                },
            },
        }
    ],
    temperature=0.01,
    max_tokens=512,
    # enable_grammar is not a standard SDK argument, so it goes through extra_body.
    extra_body={"enable_grammar": True},
)
data = response.model_dump_json()
print(data)
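
Rather than printing the whole response, an application will usually read the tool calls from it. The sketch below relies on the attribute layout of the OpenAI SDK's chat completion object (choices, message, tool_calls, function.name, function.arguments). extract_tool_calls is a hypothetical helper, and it is an assumption that the OpenAI-compatible endpoint fills these fields the same way. A stand-in object is used so the example runs offline.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict, List


def extract_tool_calls(response: Any) -> List[Dict[str, Any]]:
    """Return the tool calls of the first choice as plain dicts with parsed arguments."""
    message = response.choices[0].message
    calls = []
    for call in message.tool_calls or []:
        calls.append(
            {
                "id": call.id,
                "name": call.function.name,
                # arguments is a JSON-encoded string in the SDK object.
                "arguments": json.loads(call.function.arguments),
            }
        )
    return calls


# A stand-in with the same attribute layout as the SDK response object.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Boston, MA"}',
    ),
)
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(tool_calls=[fake_call]))]
)
print(extract_tool_calls(fake_response))
```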

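Old-format function call request (will be deprecated)

The old function call format will be deprecated in the future; use the new format instead.

The developer provides a list of functions and sends a question to the large language model.

| Field | Type | Required | Description |
|---|---|---|---|
| functions | array | Optional | A list of functions the model may generate JSON inputs for. |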
Example of RESTful HTTP Request
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}],
"functions": [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}],
"parameters": {
"max_new_tokens": 500,
"frequence_penalty": 1,
"temperature": 0.5,
"top_k": 100,
"top_p": 0.93
},
"stream": false
}'
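
Since the old format is slated for deprecation, existing request bodies can be converted mechanically: each functions entry becomes the function field of a tools item. The sketch below illustrates this for the request body only. functions_to_tools and migrate_request are hypothetical helpers, and they do not convert old-format messages in the history (role function, function_call), which would also need updating.

```python
from typing import Any, Dict, List


def functions_to_tools(functions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Wrap each old-format functions entry as a new-format tools entry."""
    return [{"type": "function", "function": dict(fn)} for fn in functions]


def migrate_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an old-format request body that uses tools, not functions."""
    migrated = {key: value for key, value in body.items() if key != "functions"}
    if "functions" in body:
        migrated["tools"] = functions_to_tools(body["functions"])
    return migrated
```

The original body is left untouched, so both versions can be compared or sent side by side while migrating.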

Old-format response example

| Field | Type | Required | Description |
|---|---|---|---|
| function_call | string or object | Optional | JSON format that adheres to the function signature |
Example of RESTful HTTP Response
{
"generated_text": "",
"function_call": {
"name": "get_current_weather",
"arguments": {
"location": "Boston, MA"
}
},
"details":null,
"total_time_taken": "1.18 sec",
"prompt_tokens": 181,
"generated_tokens": 45,
"total_tokens": 226,
"finish_reason": "function_call"
}


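Old-format summarization

Put the result returned by the API into the conversation and send it to the large language model to summarize.

| Field | Value |
|---|---|
| role | function |
| name | The function name to call |
| content | The response message from the API |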
Example of RESTful HTTP Request
export API_KEY={API_KEY}
export API_URL={API_URL}
export MODEL_NAME={MODEL_NAME}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "content-type: application/json" \
-d '{
"model": "'${MODEL_NAME}'",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
},
{
"role": "assistant",
"content": null,
"function_call":
{
"name": "get_current_weather",
"arguments": {"location": "Boston, MA"}
}
},
{
"role": "function",
"name": "get_current_weather",
"content": "{\"temperature\": \"22\", \"unit\": \"celsius\", \"description\": \"Sunny\"}"
}
],
"functions": [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}],
"parameters": {
"max_new_tokens": 500,
"frequence_penalty": 1,
"temperature": 0.5,
"top_k": 100,
"top_p": 0.93
},
"stream": false
}'
Example of RESTful HTTP Response
{
"generated_text": " The current weather in Boston is sunny with a temperature of 22 degrees Celsius. ",
"details": null,
"total_time_taken": "0.64 sec",
"prompt_tokens": 230,
"generated_tokens": 23,
"total_tokens": 253,
"finish_reason": "eos_token"
}