FFM Design Guide
A Traditional Chinese version of this document is in preparation; stay tuned.
1. Introduction
This model design guide will help you build an FFM (Formosa Foundation Model) that meets your needs.
The FFM (Formosa Foundation Model)
FFM is a fine-tuned LLM (Large Language Model) with the following features:
- Based on the Bloom and Llama2 models, which consist of 176B and 70B parameters respectively, for a wide range of applications and cross-language understanding.
- LLM architecture adjusted and training procedures optimized for the Taiwania 2 supercomputer.
- Traditional Chinese and Southeast Asian corpus increased to 30% of the training data, compared with the original 0.3%, to better capture local culture and knowledge.
- Responses tuned for high relevance based on human feedback.
- The lowest carbon footprint among comparable LLM applications.1
- Support for on-premise deployment while meeting local compliance requirements.
As a result, FFM can be used as a foundation model for continuous training to meet your business needs.
Currently, AFS (AI Foundry Service) provides two pre-trained models for you to choose from:
| Release | Model Name | Description |
| --- | --- | --- |
| 2023 Sep | FFM-Llama2-70B | General Domain |
| 2023 May | FFM-Bloom-176B | General Domain |
You can leverage FFM to build applications within your business, such as (a minimal integration sketch follows this list):
- Draft documents that follow established routine procedures
- Write computer code based on an existing code base
- Generate answers from a company knowledge base
- Analyze large collections of confidential documents
- Tutor employees on business-process topics
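To make the knowledge-base scenario concrete, here is a minimal sketch of how an application might query an FFM model, assuming the deployment exposes an OpenAI-style chat-completions HTTP endpoint. The endpoint URL, API key, model name string, and response schema below are illustrative assumptions, not the documented AFS interface; consult the AFS API reference for the actual contract.

```python
import requests

# Hypothetical on-premise FFM endpoint and credential; replace with
# the values from your actual AFS deployment (assumptions for illustration).
ENDPOINT = "https://your-afs-host/api/v1/chat/completions"
API_KEY = "your-api-key"

def ask_ffm(question: str, model: str = "FFM-Llama2-70B") -> str:
    """Send a single-turn question to an FFM model and return its reply."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer using the company knowledge base."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # low temperature keeps answers close to the source material
        "max_tokens": 512,
    }
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=payload,
        timeout=60,
    )
    resp.raise_for_status()
    # Assumes an OpenAI-style response body; adjust to the real AFS schema.
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_ffm("Summarize our standard onboarding procedure."))
```

In a production knowledge-base application, the system prompt would typically be augmented with retrieved documents rather than relying on the model's weights alone.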
For other scenarios, please contact sales@twsc.io for further information.
In this guide, we will assist you in preparing a domain-specific language model (see Data Preparation), explain each AFS (AI Foundry Service) step in Model Building, and guide you through evaluating your own FFM-based model in Inferencing.
1 Training a Bloom-based model consumed 433 MWh of electricity (compared with GPT-3's 1,287 MWh) (BigScience Workshop, 2023)2, and training continues on Taiwania 2 (power usage effectiveness: 1.2; see the note below).
2 Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., ... & Manica, M. (2022). BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100.
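Note on the PUE figure in footnote 1: power usage effectiveness is the standard data-center metric relating total facility energy to the energy delivered to the IT equipment,

$$\mathrm{PUE} = \frac{E_{\text{facility}}}{E_{\text{IT}}},$$

so at a PUE of 1.2, each megawatt-hour consumed by the compute hardware on Taiwania 2 corresponds to roughly 1.2 MWh drawn by the facility overall. This is background on the metric itself, not an AFS-specific claim.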