FFM-embedding-v2 (incl. v2.1), FFM-embedding

Embedding models transform complex text into vectors, a more compact and machine-readable numerical form that retains the key information. This supports tasks such as text analysis, keyword analysis, and simple text classification.

FFM-embedding builds a vector dataset for a knowledge base by converting each piece of text into a sequence of numbers, with each dimension of the vector encoding part of its meaning. The distance between two vectors measures how related the corresponding texts are: a shorter distance indicates high correlation, while a longer distance indicates low correlation. This lets the computer gauge the degree of relevance between words, aiding model training.
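The distance-based comparison described above can be sketched in a few lines. This is a toy illustration, not FFM-embedding's internal logic: the three-dimensional vectors below are made-up stand-ins for real embedding output, which typically has hundreds of dimensions.

```python
import math

def euclidean_distance(a, b):
    # Shorter distance => the two embeddings (and their source texts)
    # are more closely related.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional vectors standing in for real embedding output.
v_cat = [0.90, 0.10, 0.20]
v_kitten = [0.85, 0.15, 0.25]
v_car = [0.10, 0.90, 0.40]

# "cat" should sit closer to "kitten" than to "car" in the vector space.
print(euclidean_distance(v_cat, v_kitten))
print(euclidean_distance(v_cat, v_car))
```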

FFM-embedding-v2 improves compatibility with the OpenAI API, extends the context length, and offers stronger semantic processing for Traditional Chinese, enabling more accurate retrieval of sentences within the Traditional Chinese semantic space. Users can flexibly configure parameters through the API to improve embedding accuracy, and adjust the output dimension setting to optimize vector storage space.
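Because v2 follows the OpenAI API convention, a request can be built like an OpenAI embeddings call. This is a minimal sketch: the base URL, model name, and the availability of a `dimensions` field for your deployment are assumptions to verify against the AFS ModelSpace documentation; the actual values come from your own deployment.

```python
import json

# Hypothetical values for illustration only; substitute your
# deployment's endpoint and model identifier.
MODEL = "ffm-embedding-v2"

def build_embeddings_request(texts, dimensions=None):
    """Build the JSON body of an OpenAI-compatible POST /v1/embeddings call."""
    payload = {"model": MODEL, "input": texts}
    if dimensions is not None:
        # Optional: request shorter output vectors to save storage space.
        payload["dimensions"] = dimensions
    return payload

body = build_embeddings_request(["繁體中文語意檢索測試"], dimensions=512)
print(json.dumps(body, ensure_ascii=False))
```

The body would then be POSTed to the deployment's `/v1/embeddings` endpoint with the usual `Authorization: Bearer <key>` header, as with any OpenAI-compatible service.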

FFM-embedding-v2.1 is an enhanced version of v2, further trained on Traditional Chinese legal texts. It delivers more accurate semantic judgment in legal Q&A tasks and outperforms v2 on both the MTEB and DRCD evaluations for Traditional Chinese and English.

All FFM-embedding models can be deployed in both AFS ModelSpace public mode and private mode.

  • For instructions on AFS ModelSpace public mode, please refer to this document.
  • For instructions on AFS ModelSpace private mode, please refer to this document.