Activeloop, a California-based startup, has announced that it has secured $11 million in Series A funding to help enterprises make better use of multimodal data for AI. The funding comes from investors including Streamlined Ventures, Y Combinator, and Samsung Next, among others. Founded by Davit Buniatyan, Activeloop aims to streamline AI projects by offering a dedicated database called Deep Lake.
One of the biggest challenges enterprises face today is leveraging unstructured multimodal data for training AI models. Activeloop’s Deep Lake technology addresses this challenge by allowing teams to create AI applications at a lower cost than competing market offerings while increasing engineering teams’ productivity by up to fivefold.
The importance of this work is evident as more and more enterprises seek ways to tap into their complex datasets for AI applications in various use cases. According to McKinsey research, generative AI has the potential to generate trillions of dollars in global corporate profits annually across different areas such as customer support interactions, creative content generation, and software code drafting.
Activeloop’s Deep Lake helps enterprises deal with petabyte-scale unstructured data that covers various modalities such as text, audio, and video. Traditionally, this task requires teams to identify relevant datasets from disorganized silos and integrate them with different storage and retrieval technologies. This process involves a lot of coding and increases the project’s cost. Activeloop standardizes this approach with Deep Lake by storing complex data in the form of machine learning-native mathematical representations (tensors). It also facilitates the streaming of these tensors to visualization engines or deep learning frameworks.
Deep Lake offers all the benefits of a data lake but stands out by converting data into the tensor format that deep learning algorithms expect as inputs. The tensors are stored in cloud-based object storage or local storage and seamlessly streamed to graphics processing units (GPUs) for training. This approach ensures that GPUs are fully utilized and eliminates the need for copying data in batches.
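To make the idea of streaming stored tensors more concrete, here is a minimal sketch in Python using only NumPy. It is not Deep Lake's API: the function names `write_chunks` and `stream_batches`, the chunk file layout, and the use of a local directory as a stand-in for object storage are all assumptions made for illustration. The sketch stores an array as fixed-size chunk files and then yields training batches while keeping roughly one chunk in memory at a time, rather than copying the whole dataset up front.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_chunks(samples, root, chunk_size):
    """Store an (N, ...) array as fixed-size chunk files under root.

    Hypothetical layout: a local directory stands in for object storage.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, start in enumerate(range(0, len(samples), chunk_size)):
        path = root / f"chunk_{i:05d}.npy"
        np.save(path, samples[start:start + chunk_size])
        paths.append(path)
    return paths


def stream_batches(paths, batch_size):
    """Yield batches in order, loading one chunk file at a time."""
    leftover = None
    for path in paths:
        chunk = np.load(path)
        # Prepend samples that did not fill a batch in the previous chunk.
        if leftover is not None and len(leftover):
            chunk = np.concatenate([leftover, chunk])
        n_full = len(chunk) // batch_size * batch_size
        for start in range(0, n_full, batch_size):
            yield chunk[start:start + batch_size]
        leftover = chunk[n_full:]
    # Emit the final partial batch, if any.
    if leftover is not None and len(leftover):
        yield leftover


if __name__ == "__main__":
    # Ten fake 4x4 "images" stored in chunks of four, read in batches of three.
    data = np.arange(10 * 4 * 4, dtype=np.float32).reshape(10, 4, 4)
    with tempfile.TemporaryDirectory() as tmp:
        chunk_paths = write_chunks(data, tmp, chunk_size=4)
        for batch in stream_batches(chunk_paths, batch_size=3):
            print(batch.shape)
```

In a real training pipeline, each yielded batch would be handed to a deep learning framework and moved to the GPU; a production loader would also prefetch chunks in the background so that the GPU does not wait on storage.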
Activeloop has gained traction in the enterprise segment, and its open-source project has been downloaded more than one million times. Fortune 500 companies in highly regulated industries such as biopharma, life sciences, medtech, automotive, and legal are using Activeloop’s enterprise-centric offering. One customer, Bayer Radiology, used Deep Lake to unify different data modalities and cut data pre-processing time, enabling data scientists to query scans in natural language.
With the new funding, Activeloop plans to further develop its enterprise offering, attract more customers to its AI database, and scale up its engineering team. The upcoming release of Deep Lake v4 will include faster concurrent IO, the fastest streaming data loader for training models, complete reproducible data lineage, and external data source integrations.
Ultimately, Activeloop aims to save enterprises from spending millions on in-house solutions for data organization and retrieval while increasing engineers’ productivity by reducing manual work and coding. With the increasing demand for AI applications in enterprises, Activeloop’s innovative approach to handling multimodal data could have a significant impact on the industry.