
An AI That's Well-Connected

Microsoft's TaskMatrix.AI framework gives foundation models virtually any capability by coordinating interactions with external services.

Nick Bild
2 years ago · Machine Learning & AI
High-level overview of the TaskMatrix.AI framework (📷: Y. Liang et al.)

Thanks to recent algorithmic advances, there has been an explosion of interest in a new type of artificial intelligence: large language models. These models, a class of what are often called foundation models, are capable of performing an astonishing range of language-based tasks, from translating between languages to answering complex questions, often with impressive accuracy.

One of the most notable features of large language models is their ability to understand and make sense of vast amounts of general knowledge about the world. By training on enormous datasets, these models are able to build a deep understanding of a wide range of topics, from science and history to pop culture and current events. This general knowledge allows them to perform tasks that would be impossible for earlier generations of AI, such as summarizing lengthy articles or answering complex questions that require context and nuance.

However, large language models are not without their limitations. While they excel at tasks that involve language and general knowledge, they often struggle with domain-specific tasks that require specialized knowledge or expertise. For example, while a large language model may be able to answer questions about the history of a particular country, it may not be able to perform accurate mathematical calculations. And when it comes to completing physical tasks in the real world, these models are helpless.

Researchers at Microsoft are seeking to make foundation models more capable with their recently announced TaskMatrix.AI framework. TaskMatrix.AI makes it possible to connect a foundation model to millions of external APIs for the completion of specific tasks. The general knowledge about the world contained in the primary model is meant to serve as a brain-like central system for user interaction. This new framework then calls an external machine learning model or other service where domain-specific assistance is needed.

Since there are many existing models and systems available that have been finely tuned to carry out specific functions, this may seem like an obvious next step to take. However, each of these systems has a different implementation or mechanism of operation, so gluing everything together is much easier said than done, especially when considering the vast number of tools that are presently available.

There are several pieces to the TaskMatrix.AI approach. The first component is the Multimodal Conversational Foundation Model (MCFM), which communicates with users and has a wide base of general knowledge. This could be GPT-4, LaMDA, PaLM, or any of a number of foundation models presently in use. Another critical component is the API Platform, which provides the documentation and specifications for the external services that the MCFM can interact with. This provides a consistent, centralized means to store metadata about services, and is intended to be updated by API developers or owners.
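To make the idea of a centralized documentation store more concrete, here is a minimal Python sketch of what an API Platform entry and registry might look like. Everything in it is an assumption made for illustration: the `ApiDoc` and `ApiPlatform` names, the fields, and the two example services are hypothetical and are not taken from the TaskMatrix.AI framework itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ApiDoc:
    """Hypothetical metadata record describing one external service."""
    name: str
    description: str
    # Parameter name -> short explanation of what it expects (assumed layout).
    parameters: dict[str, str] = field(default_factory=dict)


class ApiPlatform:
    """Hypothetical in-memory stand-in for the centralized API Platform."""

    def __init__(self) -> None:
        self._docs: dict[str, ApiDoc] = {}

    def register(self, doc: ApiDoc) -> None:
        # API developers or owners add or update their own entry;
        # a newer entry with the same name replaces the older one.
        self._docs[doc.name] = doc

    def lookup(self, name: str) -> ApiDoc:
        return self._docs[name]

    def all_docs(self) -> list[ApiDoc]:
        return list(self._docs.values())


platform = ApiPlatform()
platform.register(ApiDoc(
    name="image_caption",
    description="Describe the contents of an image in one sentence.",
    parameters={"image_path": "path to a local image file"},
))
platform.register(ApiDoc(
    name="calculator",
    description="Evaluate an arithmetic expression.",
    parameters={"expression": "arithmetic expression as text"},
))
```

Keeping every service's description in one consistent format is what would let a single foundation model reason about many otherwise unrelated tools.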

With these pieces in place, an API Selector is then needed to help the MCFM choose an appropriate service to contact when its own knowledge or capabilities cannot complete the requested task. Finally, after an API has been selected, the API Executor will send a request to that API using the specification defined in the API Platform. Taken together, these four components would allow a foundation model to interact with arbitrary external systems.
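The flow from request to selected API to executed call can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the framework's implementation: the selector here is naive keyword matching, whereas the real system is described as relying on the foundation model itself, and the names `API_PLATFORM`, `select_api`, `execute_api`, and `handle`, along with both toy services, are hypothetical.

```python
import re
from typing import Callable, Optional


def add_numbers(request: str) -> str:
    """Toy 'calculator' service: sums every integer found in the request."""
    numbers = [int(n) for n in re.findall(r"-?\d+", request)]
    return str(sum(numbers))


def switch_light(request: str) -> str:
    """Toy 'smart home' service: reports the requested light state."""
    state = "off" if "off" in request.lower().split() else "on"
    return f"light is now {state}"


# Hypothetical API Platform contents: name -> (descriptive keywords, callable).
API_PLATFORM: dict[str, tuple[set[str], Callable[[str], str]]] = {
    "calculator": ({"add", "sum", "total", "plus"}, add_numbers),
    "light_switch": ({"light", "lamp", "turn", "switch"}, switch_light),
}


def select_api(request: str) -> Optional[str]:
    """Naive API Selector: pick the entry sharing the most words with the request."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    best_name, best_score = None, 0
    for name, (keywords, _) in API_PLATFORM.items():
        score = len(words & keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name  # None means no external service looks relevant


def execute_api(name: str, request: str) -> str:
    """API Executor: call the selected service as recorded in the platform."""
    _, service = API_PLATFORM[name]
    return service(request)


def handle(request: str) -> str:
    """Stand-in for the MCFM: delegate when a service matches, else answer itself."""
    name = select_api(request)
    if name is None:
        return f"[foundation model answers directly] {request}"
    return execute_api(name, request)


print(handle("Please add 2 and 40"))
print(handle("Turn the light off"))
print(handle("Who wrote Hamlet?"))
```

The point of the separation is that adding a new capability only means adding an entry to the platform; the selector and executor code does not change.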

The potential applications for TaskMatrix.AI are virtually endless. From manipulating images to controlling a smart home or automating the office, the system is limited only by the capabilities of the APIs it can access. However, in order to access those APIs, the API Platform must be populated and maintained manually, which could prove to be the Achilles’ heel of the framework. Unless a large number of developers choose to participate in the project, the MCFM will not gain many new capabilities.
