€105 Million Raised Four Weeks After Founding: The Full Text of European Large-Language-Model Startup mistral.ai's Fundraising Memo
Source: Empower Labs
A team founded only a few weeks ago, with no product, no users, and no operating track record, has completed a €105 million financing round. This memo helped it win over Lightspeed, former Google CEO Eric Schmidt, and others. The memo emphasizes the European market, AI safety, compliance, and related themes. Mistral believes that its open-source route, completely different from OpenAI's, will eventually let it build an advantage and overtake the incumbents. Reading it, the memo is clearly written with great skill, and it contains a degree of bluffing as well. It made good use of European society's current FOMO around large language models to complete the financing.
Mistral refers to a dry, strong, cold northwesterly wind in southern France; it is also the name of a class of French-built amphibious assault ships, among the world's leading designs. The name embodies French pride. All six members of the founding team are French. Rather than a European large-language-model company, I think of it more as a French large-language-model company. It tells a good European story, but it will not be the only one in Europe.
mistral.ai Strategic Memo
Author: mistral.ai
Translation: ChatGPT, Wang Chao
Generative AI is a transformative technology
In the last year, we've seen a phenomenal acceleration in generative AI (systems capable of generating text/images from text and images). These systems can help humans:
● Produce excellent and innovative content (text, code, graphics)
● Read, process and summarize unstructured streams of content thousands of times faster than humans
● Interact with the world through natural language or APIs to execute workflows faster than ever.
The powerful capabilities of generative AI were suddenly revealed to the public after the release of ChatGPT. Such products are being produced by only a few small teams around the world, and the limited number of researchers in these teams has become a bottleneck preventing the creation of a new economy in this field.
Generative AI is about to increase productivity across all industries and create a new industry by seamlessly augmenting the machine capabilities of the human mind ($10 billion market in 2022, projected to reach $110 billion by 2030, projected annual growth rate of 35%). It is a transformative technology for the world economy that will change the nature of work and bring about positive social change.
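As a quick sanity check on the figures above, a $10B market growing at 35% per year over the eight years from 2022 to 2030 does indeed land near $110B:

```python
# Sanity check of the market projection quoted in the memo:
# $10B in 2022, compounding at ~35% per year through 2030.
base_2022 = 10e9          # market size in 2022, in dollars
cagr = 0.35               # projected annual growth rate
years = 2030 - 2022       # 8 years of compounding

projected_2030 = base_2022 * (1 + cagr) ** years
print(f"${projected_2030 / 1e9:.0f}B")  # ~$110B, matching the memo's projection
```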
Oligopoly in the making
Generative AI techniques stand on the basis of years of research in industry and academia. By scaling up the training to Internet-scale data and correcting the model with human feedback, the breakthroughs that made the technology accessible to the masses were achieved by a handful of industry players, the largest of which (OpenAI) seems to have hegemonic intent on the market.
These few players train generative models and use them as assets; they serve thousands of third parties who create products for productivity improvements, as well as the general public through their own products like chatbots. A large number of third-party startups are still being formed to build various services based on these generative models.
**We believe that most of the value in the emerging generative AI market lies in the hard-to-make technology: the generative models themselves.** These models need to be trained on thousands of powerful machines, processing trillions of tokens from high-quality sources, which constitutes the first high barrier. The second important barrier is the difficulty of assembling an experienced team, and mistral.ai is well positioned on this front.
Currently, all the major players are located in the US; there is not yet a serious competitor in Europe. Given how powerful (and dangerous) this new technology is, this is a major geopolitical question. mistral.ai will be the European leader in AI that increases productivity and creativity and guides the coming new industrial revolution.
Current generative AI does not meet market needs
OpenAI and its current competitors have chosen a closed technology route, which will significantly limit their market coverage. In this approach, the model is kept private and only served through a text-to-text API. This raises the following important questions for business:
● Organizations wishing to use generative AI techniques are forced to provide their valuable business data and sensitive user data to a black box model, often deployed in the public cloud. This poses a security problem: a model that is kept secret cannot be checked to ensure that its output is safe, and such a model cannot be deployed in a security-critical application. This situation also raises legal issues, especially when a company transfers personal data outside its legal borders and may be subject to extraterritorial laws.
● Exposing only the output of the model, rather than the full model, makes it harder to interface with other components (retrieval databases, structured input, images and sound). There are currently hundreds of products that create composite capabilities (e.g., memory, vision) by interconnecting the outputs and inputs of models. These products would work better and faster if the model could be provided as a white box (a transparent model); for example, Flamingo combines white-box vision and text models into a text+vision model.
● The data used to train the model is confidential, which means we rely on systems of uncertain origin and that may produce uncontrollable output. Filtering efforts to address this issue provide only weak and fragile guarantees that the model will not output sensitive content that it may have been trained to do. This issue led to ChatGPT being banned in Italy in April 2023.
Break the market pattern from Europe
By founding mistral.ai, we plan to take a stance completely opposite to the current closed-model approach in order to train advanced models. **Our vision is to become a leading player in the field while integrating these models across Europe and the wider industry to build a high-value business.**
**mistral.ai will be a research leader in generative AI and, within four years, the leading provider of AI technology on the market.** To achieve this goal, we will first focus on a few key differentiating characteristics, then conduct a comprehensive R&D effort to select the most effective strategies toward artificial intelligence of practical value to humans.
Focusing on the European market first will give us a defensive advantage, and our open stance on the technology route will further enhance our attractiveness. Many of the brightest minds in the field of Large Language Modeling (LLM) are European; our extensive experience shows that many of them would like to join our project.
Opposite Technical Positioning
Our early differentiators, the blind spots in our competitors' strategies, are the following:
● **Take a more open approach to model development.** We will release models that substantially outperform the competition under a permissive open-source software licence. We will release tools to harness the power of these white-box models and build a developer community around our brand. This approach is ideologically very different from OpenAI's; it will better attract top researchers, and it will powerfully accelerate the project by opening the door to many enthusiastic downstream developers. It will also widen the scope of our business development. We will balance our open-source strategy against financial interests by reserving the most powerful and specialized models for paying users.
○ We will dedicate 1% of the funds to non-profit foundations responsible for open source community development.
● **Whether open source or licensed, the internals (architecture and trained weights) of our models will always be open to our customers.** This will allow tighter integration with clients' workflows: their content can be fed into different parts of the deep model rather than everything being serialized as input text to a black-box API.
● **Increased focus on data provenance and data control.** Our models will be trained on high-quality data (beyond scraped content) for which we will negotiate licence agreements. This will allow us to train better models than what is currently available, such as Llama. Using deep integration techniques (mixture-of-experts and retrieval-augmented models), we will offer models with optional data-source access: for paying premium users, specific models can be dedicated to finance, legal, and similar domains (this provides a considerable performance boost). Using similar techniques, our models will be able to provide differentiated, on-the-fly data access for employees with different levels of access to corporate intellectual property.
● **Provide unrivaled security and privacy guarantees.** Our models will be deployable in a private cloud, and optionally directly on-device, effectively minimizing privacy concerns by eliminating potentially problematic data flows. To this end, we will direct our R&D efforts toward training small yet highly efficient models, effectively offering the models with the best quality/cost ratio on the market. Our open-source strategy will also ensure the auditability of our models when they are deployed in critical industries, especially dual-use and health sectors.
Business Development
In terms of business, we will provide the most valuable technical building blocks for the emerging AI-as-a-service industry and use generative AI to fundamentally change business workflows. We will co-build integrated solutions with European integrators and industrial customers, obtaining extremely valuable feedback from them, to become the tool of choice for every company looking to leverage AI in Europe.
Integration with verticals can take different commercial forms, including full-access licences to models (including trained weights), specialization of models on demand, and commercial contracts with integrators/consulting firms to build fully integrated solutions. As detailed in our roadmap, we will explore and identify the best approaches as the technology evolves.
How to become a leader in the field of AI
Top Team
The founding team consists of top researchers in the field who have worked at DeepMind and Meta, as well as experienced French serial entrepreneurs and influential public leaders.
● Arthur Mensch — CEO — Former Chief Research Scientist at DeepMind, lead author of several major contributions to LLM: Chinchilla, Retro, Flamingo
● Guillaume Lample — Chief Scientist — Former Meta Senior Research Scientist. Led the Llama project, Meta's major contribution to the field of large language models
● Timothée Lacroix — CTO — Former software engineer at Meta, technical lead at Llama
● Jean-Charles Samuelian — CEO of Alan
● Charles Gorintin — CTO of Alan
● Cédric O, former French Secretary of State for Digital Affairs
The first five employees, already identified, will be experienced researchers from large tech companies. Their enthusiasm for Europe and for open source, together with the continuous reorganizations some of those companies are undergoing amid the rapid development of generative AI, makes this an opportune moment for them to leave.
Infrastructure and Data Sources
To train a competitive model, an exa-scale cluster needs to be used for at least several months. We intend to rent such computing resources for a full year, thereby developing open source and commercial models of different capabilities.
We are already conducting competitive negotiations with top cloud service providers on renting computing resources (we plan to start this summer and build up a reserve of 1,536 H100 GPUs by September). Since mistral.ai has a strong European base, we will also cooperate with emerging European cloud service providers who are actively expanding their deep-learning compute offerings.
We have trained large-scale models before, which gives us the expertise to train 10-100x faster than publicly available methods; our founders and early employees know exactly how to train the strongest model within a given computational budget.
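As a rough illustration of what "the strongest model within a given computational budget" means, the Chinchilla scaling analysis (whose lead author, Arthur Mensch, is mistral.ai's CEO) suggests a compute-optimal ratio of roughly 20 training tokens per parameter, with training FLOPs C ≈ 6·N·D. The budget below is purely illustrative, not a figure from the memo:

```python
import math

# Compute-optimal sizing in the spirit of the Chinchilla scaling laws:
# training FLOPs C ≈ 6 * N * D, with optimal data D ≈ 20 * N tokens.
TOKENS_PER_PARAM = 20

def compute_optimal(flops_budget: float) -> tuple[float, float]:
    """Return (parameters, training tokens) that spend the budget optimally."""
    n_params = math.sqrt(flops_budget / (6 * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens

# Example: a ~1e24 FLOPs budget (illustrative, not from the memo)
n, d = compute_optimal(1e24)
print(f"{n / 1e9:.0f}B parameters, {d / 1e12:.1f}T tokens")
```

Spending the same budget on a much larger model trained on too few tokens (or vice versa) yields a weaker model, which is the capital-efficiency edge the memo is pointing at.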
Our early investors also include content providers in Europe, who will open all the necessary doors for us to acquire high-quality datasets on which to train and fine-tune our models.
Explore scenarios together with key customers
The founding team is already organizing commercial explorations with major French and European commercial institutions. A small product-oriented team (6 people by the end of the year) will start growing the business while the technical team trains valuable technical modules.
The modeling team will remain 100% focused on technology development to avoid distractions.
Business development will begin concurrently with the development of the first generation model family, using the following strategies:
● Focused exploration of the needs of large industrial players, facilitated by third-party integrators who will be given full access to our best (non-open-source) models
● Co-design products with some small emerging partners focused on generative AI products.
Business-based exploration will be used to drive the design of the second generation model.
Roadmap
The First Year
We will train two generations of models, and the model development and commercial integration will advance simultaneously. The first generation will be partially open source, relying on the technology the team has mastered. It will validate our ability to meet the needs of our clients, investors and institutions. The second-generation model will address significant shortcomings of the current model, allowing it to be safely and economically used by businesses.
Train the best open source standard model
By the end of 2023, we will train a series of text generation models that can significantly outperform ChatGPT 3.5 and the March 2023 version of Bard, as well as all open source solutions.
This series will be open source; we will participate in the community to build on it, making it an open standard.
We will provide the same serving interface as our competitors for a fee, collecting third-party usage data, and we will create free consumer applications to expand brand influence and capture first-party user data.
Customized and differentiated for business needs
Over the next six months, these models will be equipped with semantic embedding models for content search, and multimodal plugins for handling visual input. Ad hoc models retrained using commercially available high-quality data sources will also be prepared.
Commercial development will start concurrently with the development of the first-generation model series: we intend to have a proof-of-concept integration by the end of the first quarter of 2024.
In terms of technology, in the first and second quarters of 2024, we will focus on two main areas that are undervalued by incumbent companies:
● Train a model small enough to run on a 16GB laptop while serving as a useful AI assistant
● Train models with hot-swappable extra contexts, allowing up to millions of tokens of extra context, effectively merging language models and retrieval systems.
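To see why the 16GB-laptop target above is plausible, a back-of-the-envelope weight-memory calculation helps (model sizes and precisions here are illustrative, not mistral.ai's actual architecture):

```python
# Weight memory for a dense model: parameters x bytes per parameter.
# Activations and KV cache add overhead on top, so weights must fit
# with room to spare.

def model_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# An illustrative 7B-parameter model:
print(model_memory_gib(7e9, 2.0))   # fp16  -> ~13.0 GiB: barely fits in 16 GB
print(model_memory_gib(7e9, 0.5))   # 4-bit -> ~3.3 GiB: comfortable fit
```

This is why small-but-efficient models (and aggressive quantization) are the natural route to useful on-device assistants.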
At the same time, training and fine-tuning datasets will continue to be enriched through partnerships and data acquisition.
By the end of Q2 2024, we intend to:
● Distribute the best open-source text-generation models, with text and visual output
● Offer generic and expert models with one of the highest value/cost ratios
● Provide model capabilities to third-party integrators through scalable and diverse APIs
● Establish licensed commercial relationships with one or two large industrial players committed to using our technology
Next Stage
Competing with and surpassing players such as OpenAI will require substantial investment in later stages (GPT-4 cost several hundred million dollars). Our goal for the first year is to demonstrate that we are one of the strongest teams in the global AI competition, able to develop and launch models that can compete with the biggest players. Our experience as large-scale language model (LLM) researchers will allow us to be more capital efficient at an early stage than companies that are discovering or moving into this field.
One north star for mistral.ai will be safety: we will release models in a well-staged manner, making sure they are only used for purposes consistent with our values, and to that end we will give a "red team" beta access to spot inappropriate behavior and correct it.
In doing so, we will convince key public and private institutions that we can build safe, controllable, and efficient technology that allows humanity to benefit from this scientific breakthrough. This will attract institutions and countries to participate in our Series A financing. In the Series A round (Q3 2024), we expect to raise $200 million to train models beyond GPT-4's capabilities.
Strong financial backing will allow us to train models on a much larger infrastructure, strengthening our position as a leader in AI research and the supplier of choice in the European industry sector.
(End of full text)