Formosa Foundation Model Licensing List
Notes
- Inference settings: inference quantization = FP16, KV cache quantization = FP16/BF16, batch size = 1
- GPU memory includes model weights, activations, KV cache, and framework overhead.
| FFM Model Series | Model | Sequence Length | Min. GPU Memory (single-model deployment) |
|---|---|---|---|
| Llama3.3-FFM | Llama3.3-FFM-70B | 32K\* | 198 GB |
| Llama3.2-FFM | Llama3.2-FFM-11B-V | 32K\* | 52 GB |
| Llama3.1-FFM | Llama3.1-FFM-8B | 32K\* | 35 GB |
| | Llama3.1-FFM-70B | 32K\* | 185 GB |
| | Llama3.1-FFM-405B | 32K\* | 915 GB |
| Llama3-FFM | Llama3-FFM-8B | 8K | 27 GB |
| | Llama3-FFM-70B | 8K | 165 GB |
| FFM-Mistral | FFM-Mistral-7B | 32K | 34 GB |
| | FFM-Mixtral-8x7B | 32K | 48 GB |
| FFM-Llama2-v2 | FFM-Llama2-v2-7B | 4K | 17 GB |
| | FFM-Llama2-v2-13B | 4K | 30 GB |
| | FFM-Llama2-v2-70B | 4K | 154 GB |
| FFM-Llama2 | FFM-Llama2-7B | 4K | 17 GB |
| | FFM-Llama2-13B | 4K | 30 GB |
| | FFM-Llama2-70B | 4K | 154 GB |
| FFM-embedding | FFM-embedding-v2.1 | 8K | 2 GB |
| | FFM-embedding-v2 | 8K | 2 GB |
| | FFM-embedding | 2K | 2 GB |
| FFM-BLOOMZ | FFM-BLOOMZ-7B | 4K | 20 GB |
| | FFM-BLOOMZ-176B | 4K | 389 GB |
\* Sequence length can be extended to 128K, but the GPU memory requirement must then be re-estimated.
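The notes above state that GPU memory covers model weights, activations, KV cache, and framework overhead, which is why the requirement grows with sequence length. The sketch below is a hypothetical, illustrative lower-bound estimator (not the method used to produce the table): it counts only FP16 weights plus an FP16 KV cache, so real deployments need additional headroom for activations and framework overhead. The example model dimensions are assumptions, not FFM's published configuration.

```python
def estimate_gpu_memory_gb(params_b, n_layers, n_kv_heads, head_dim,
                           seq_len, batch_size=1, bytes_per_elem=2):
    """Rough lower bound in GiB: FP16 weights + FP16 KV cache only.

    Excludes activations and framework overhead, so the table's
    figures will be higher than this estimate.
    """
    # Weights: one FP16 value (2 bytes) per parameter.
    weights = params_b * 1e9 * bytes_per_elem
    # KV cache: 2 tensors (K and V) per layer, per cached token,
    # per KV head, each of size head_dim.
    kv_cache = (2 * n_layers * n_kv_heads * head_dim
                * seq_len * batch_size * bytes_per_elem)
    return (weights + kv_cache) / 1024**3

# Example with assumed 8B-class dimensions (32 layers, 8 KV heads,
# head_dim 128) at a 32K context and batch size 1.
print(round(estimate_gpu_memory_gb(8, 32, 8, 128, 32768), 1))  # → 18.9
```

The gap between such a lower bound (about 19 GiB here) and the table's 35 GB for Llama3.1-FFM-8B illustrates how much activations and framework overhead contribute on top of weights and KV cache.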