Today, when enterprises want to use machine learning to predict future outcomes (What will a customer buy next? Will sales increase next month? Is a piece of manufacturing equipment likely to fail soon?), it takes a village to develop a machine learning-based solution. As the demand from businesses to leverage machine learning continues to grow rapidly, the current time-intensive process, which relies heavily on highly skilled ML experts, won't suffice.
To make matters worse, the solution usually only addresses the problem immediately at hand. This means that when a new problem or slight variant of the current one arises, the lengthy process must be repeated all over again.
In reality, this way of working is not sustainable or scalable. There will always be new problems to solve. But current methodologies and a shortage of ML experts do not allow us to tackle them at an efficient rate.
So we decided to create and launch a new paradigm for building ML solutions. Our mission is to enable data scientists and business analysts to build and deploy trusted machine learning models for any prediction challenge.
Feature Labs introduces ML 2.0
Machine Learning 2.0 (ML 2.0) defines practical steps for creating ML products and services that any team can follow. It enables companies to tackle projects in a systematic way using automated data science tools. The end result is not only a much faster route to deployment, but also more involvement from subject matter experts across multiple layers of an enterprise.
Enterprises are already using the principles of ML 2.0 to smash through the obstacles stifling the potential of ML. BBVA used Feature Labs to improve their ability to detect credit card fraud by 2x, while Accenture is using Feature Labs to anticipate critical maintenance events for more than 500 software applications a month. These are just two examples of companies building accurate ML models for new problems in a fraction of the time usually required.
Machine Learning, Streamlined
In our recent paper, “Machine Learning 2.0: Engineering Data Driven AI Products,” we lay out all of the details that a team needs to turn raw data into a trustworthy, deployable model by following the seven key steps described below.
Step 1: Data Organization
Start by organizing the raw data as well as adding metadata and annotations. This enables the same data representation to be used everywhere. Consistency is key. If the underlying raw data changes, the effects will be known immediately.
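As a rough sketch of this step, the open-source Featuretools library from Feature Labs organizes raw tables into a single annotated EntitySet with explicit indexes, time indexes, and relationships. The table and column names below are illustrative assumptions, and the API shown follows the library's 0.x releases.

```python
import featuretools as ft
import pandas as pd

# Hypothetical raw tables; the column names are illustrative assumptions
customers_df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "join_date": pd.to_datetime(["2017-01-05", "2017-02-11", "2017-03-20"]),
})
transactions_df = pd.DataFrame({
    "transaction_id": [10, 11, 12, 13],
    "customer_id": [1, 1, 2, 3],
    "amount": [25.0, 40.0, 15.5, 99.9],
    "transaction_time": pd.to_datetime(
        ["2017-04-01", "2017-04-03", "2017-04-02", "2017-04-05"]),
})

# Organize the raw data into one consistent, annotated representation
es = ft.EntitySet(id="retail")
es = es.entity_from_dataframe(entity_id="customers", dataframe=customers_df,
                              index="customer_id", time_index="join_date")
es = es.entity_from_dataframe(entity_id="transactions", dataframe=transactions_df,
                              index="transaction_id", time_index="transaction_time")
es = es.add_relationship(ft.Relationship(es["customers"]["customer_id"],
                                         es["transactions"]["customer_id"]))
```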
Step 2: Prediction Engineering
Prediction engineering is the process of specifying a prediction problem so it can be used to programmatically label data for ML. Prediction engineering enables you to systematically use ML to solve problems that matter for your business.
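For example, a prediction problem such as "will a customer spend at least $50 in the 30 days after a cutoff time?" can be written as a small labeling function and applied programmatically to every customer. Below is a minimal sketch in plain pandas, reusing the transactions table from the previous step; the window, threshold, and column names are illustrative assumptions.

```python
import pandas as pd

def label_future_spend(transactions, cutoff_time, window="30D", min_spend=50.0):
    """Label one customer: did they spend at least `min_spend` in the
    `window` following `cutoff_time`? All thresholds are illustrative."""
    window_end = cutoff_time + pd.Timedelta(window)
    in_window = ((transactions["transaction_time"] > cutoff_time) &
                 (transactions["transaction_time"] <= window_end))
    return transactions.loc[in_window, "amount"].sum() >= min_spend

# Apply the same definition to every customer to produce training labels
def make_labels(transactions_df, cutoff_time):
    labels = (transactions_df.groupby("customer_id")
              .apply(lambda group: label_future_spend(group, cutoff_time)))
    return labels.rename("will_spend_next_30_days")

labels = make_labels(transactions_df, cutoff_time=pd.Timestamp("2017-04-02"))
```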
Step 3: Automated Feature Engineering
Feature engineering is the process of using domain knowledge and human intuition to extract new variables from raw data that make ML algorithms work. It is a necessary step for any ML product, but it is also tedious, time-consuming, and error-prone. It is now possible to automate this process using Deep Feature Synthesis, as sketched below.
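Deep Feature Synthesis is implemented in the open-source Featuretools library, so a minimal sketch looks like the following, continuing with the EntitySet from Step 1 and the cutoff time from Step 2 (both illustrative).

```python
import featuretools as ft
import pandas as pd

# Automatically synthesize candidate features for each customer, using only
# data observed before the cutoff time so the labels are not leaked
feature_matrix, feature_defs = ft.dfs(
    entityset=es,                             # EntitySet built in Step 1
    target_entity="customers",
    cutoff_time=pd.Timestamp("2017-04-02"),
    max_depth=2,   # stack primitives, e.g. MEAN(transactions.amount)
)
print(feature_defs[:5])
```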
Step 4: Modeling and Optimization
Tools for building ML models have been around for a decade or more. Once an initial end-to-end solution is in place, automated search algorithms can tune the model and its hyperparameters instead of relying on manual trial and error across alternative ML algorithms. Important note: be cautious about over-optimizing too early in the process; engineering the right prediction problem and set of features generally has a much greater impact on the end result.
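As one sketch of automated optimization (not Feature Labs' own tooling), scikit-learn's RandomizedSearchCV can tune a baseline model over the feature matrix and labels from the previous steps. The model choice, parameter grid, and metric are assumptions, and a realistically sized dataset is assumed.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Assumes `feature_matrix` (features per customer) and `labels` (booleans)
# from the previous steps, computed on a realistically sized dataset
X = feature_matrix.reindex(labels.index).fillna(0)
y = labels.astype(int).values

# Let an automated search explore model settings instead of manual tuning
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [100, 200, 400],
        "max_depth": [None, 5, 10],
        "min_samples_leaf": [1, 5, 10],
    },
    n_iter=10,
    cv=3,
    scoring="roc_auc",
)
search.fit(X, y)
model = search.best_estimator_
```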
Step 5: Integration Into Production
The ML 2.0 APIs allow for easy integration into production. As new raw data arrives in production, it propagates through to new predictions in exactly the same way it did during development.
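With Featuretools, for instance, the feature definitions from development can be saved and reapplied to new raw data, so the same transformations run in production. The file name and the `new_es` EntitySet (built from fresh data exactly as in Step 1) are assumptions.

```python
import featuretools as ft

# Save the feature definitions produced during development...
ft.save_features(feature_defs, "feature_definitions.json")

# ...and, in production, reapply them to an EntitySet built from the latest
# raw data, so new data propagates to predictions the same way it did in
# development. `new_es` is assumed to be constructed as in Step 1.
saved_features = ft.load_features("feature_definitions.json")
new_feature_matrix = ft.calculate_feature_matrix(features=saved_features,
                                                 entityset=new_es)
predictions = model.predict_proba(new_feature_matrix.fillna(0))[:, 1]
```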
Step 6: Validate in Production
No matter how meticulously the model is evaluated during the training process, a truly robust practice also involves evaluating a model in the production environment. This typically translates to doing a “stealth deployment,” where the predictions are compared against live outcomes but not used to impact decision-making.
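In practice, a stealth deployment simply means logging the model's predictions, waiting for the real outcomes, and comparing the two before the predictions influence any decisions. A minimal sketch, with hypothetical logged data:

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical log of predictions made (but not acted on) in production
logged_predictions = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "predicted_prob": [0.82, 0.10, 0.55, 0.30],
    "prediction_time": pd.to_datetime(["2017-05-01"] * 4),
})

# Outcomes that actually occurred once the prediction window elapsed
observed_outcomes = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "actual_outcome": [1, 0, 1, 0],
})

# Compare predictions against live outcomes before letting them drive decisions
report = logged_predictions.merge(observed_outcomes, on="customer_id")
print("live AUC:", roc_auc_score(report["actual_outcome"], report["predicted_prob"]))
```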
Step 7: Deploy and Operationalize
With absolute confidence in the model, we can now benefit from it by using the resulting predictions to solve complex issues and improve enterprise performance.
Machine Learning That Matters
In the vast amounts of data that every organization collects, there is untapped, transformative potential. Until now, knowing how to leverage it has been difficult. ML 2.0 enables virtually any enterprise in any industry to capitalize on that potential.
As recently described in Harvard Business Review, “Today, businesses don’t just want to have answers to questions like: Did we meet our sales target for this quarter? Did we reach our target audience? Did our advertisement spend meet its objectives? Instead they want to know what is likely to happen in the future. They want to make data-driven predictive decisions, quickly and easily, which is the promise of ML 2.0.”
With this new paradigm in tow, the only question left for companies to answer is, “What do you want to accomplish with your data?”