DataPI Studio
Visual workflow designer that brings data science and machine learning to the entire analytics team, from analysts to experts.
Visual Workflow Designer
Increase productivity across the entire data science team, from analysts to experts
- Accelerate the creation of predictive models using a drag-and-drop visual interface
- Rich library of over 1500 machine learning algorithms and functions to build the best model for any use case
- Pre-built templates for common use cases including customer churn, predictive maintenance, fraud detection, and many more
- Unique Wisdom of Crowds feature provides proactive recommendations at each step of the workflow, including the population of parameters


Data Sources
Connect to all of your data, no matter where it lives
- Create point-and-click connections to databases, warehouses, cloud sources, documents, social media, and business applications
- Connect to new data sources by downloading extensions from DataPI
In-database Processing
Run data prep and ETL processes directly inside databases
- Query and retrieve data without writing complex SQL
- Harness the power of highly scalable database clusters
- Supports MySQL, PostgreSQL, and Google BigQuery
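
As a rough illustration of what in-database processing means, the sketch below pushes grouping and filtering into the database so that only summary rows come back to the client. It is not DataPI's implementation: the in-memory SQLite database, the `orders` table, and the `spend_per_customer` helper are all assumptions made for the example, with SQLite standing in for a server such as MySQL or PostgreSQL.

```python
import sqlite3

# In-memory SQLite stands in for a production database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 120.0), ("bob", 35.5), ("alice", 80.0), ("carol", 10.0), ("bob", 64.5)],
)


def spend_per_customer(connection, min_total):
    """Aggregate and filter inside the database; only summary rows are returned."""
    query = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= ?
        ORDER BY customer
    """
    return connection.execute(query, (min_total,)).fetchall()


result = spend_per_customer(conn, 50.0)
print(result)  # [('alice', 200.0), ('bob', 100.0)]
```

The design point is that the raw rows never leave the database; the client receives one row per qualifying customer regardless of how large the underlying table is.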


Explore and Visualize Data
Evaluate data health, completeness, and quality
- Explore data using robust statistical overviews and over 30 interactive visualizations
- Develop an understanding of patterns, trends, and distributions in the data with scatter plots, histograms, line charts, parallel coordinates, box plots, and more
- Identify and fix common data quality problems including missing values and outliers
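
For readers who want a concrete picture of the two data quality problems named above, here is a minimal Python sketch of one common way to handle each: filling missing values with the median and flagging outliers with the 1.5 x IQR rule. These are generic techniques chosen for illustration, not a description of how DataPI Studio does it; the function names and sample readings are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first/third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with two gaps and one suspicious spike.
readings = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0, None, 11.0]
filled = impute_median(readings)
outliers = iqr_outliers(filled)
print(filled)
print(outliers)  # [95.0]
```
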
Data Prep and Blending
Eliminate the hassle of preparing data for predictive modeling
- Extract, join, filter, and group data across any number of sources
- Create repeatable data prep and ETL processes that can be scheduled and shared


Machine Learning
Create robust machine learning models without writing code
- Choose from hundreds of supervised and unsupervised machine learning algorithms
- Implement a wide variety of ML techniques including regression, clustering, time-series, text analytics, and deep learning
- Use both automated and manual feature engineering to improve model accuracy
Model Validation
Understand the true performance of a model before deploying to production
- Guard against overfitting with a unique approach that keeps pre-processing applied during model training from leaking information into the application of the model
- Add proven techniques, like cross validation, to a model with just a single mouse click
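
The two ideas above, leak-free pre-processing and cross validation, can be sketched together in plain Python. This is an illustrative sketch under stated assumptions rather than DataPI's actual mechanism: the scaler, the nearest-centroid classifier, the interleaved fold assignment, and the sample data are all hypothetical. The key point is that the scaling statistics are learned from the training fold only and then applied unchanged to the held-out fold.

```python
import statistics


def fit_scaler(rows):
    """Learn per-feature mean and standard deviation from training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # avoid division by zero
    return means, stds


def scale(rows, scaler):
    """Apply previously learned statistics; nothing is re-estimated here."""
    means, stds = scaler
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def fit_centroids(rows, labels):
    """Nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(rows, labels, k=3):
    """Return per-fold accuracy; scaler and model are refit inside every fold."""
    scores = []
    for fold in range(k):
        test_idx = [i for i in range(len(rows)) if i % k == fold]
        train_idx = [i for i in range(len(rows)) if i % k != fold]
        train_x = [rows[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        scaler = fit_scaler(train_x)  # fitted on the training fold only
        model = fit_centroids(scale(train_x, scaler), train_y)
        test_x = scale([rows[i] for i in test_idx], scaler)
        hits = sum(predict(model, x) == labels[i] for x, i in zip(test_x, test_idx))
        scores.append(hits / len(test_idx))
    return scores


# Hypothetical two-feature data with two well-separated classes.
rows = [[1.0, 200.0], [9.0, 900.0], [1.5, 220.0], [8.5, 880.0], [0.8, 180.0],
        [9.5, 950.0], [1.2, 210.0], [8.8, 910.0], [1.1, 190.0]]
labels = ["a", "b", "a", "b", "a", "b", "a", "b", "a"]
scores = cross_validate(rows, labels, k=3)
print(scores)
```

Had `fit_scaler` been called once on all rows before splitting, the held-out rows would have influenced the means and standard deviations used in training, which is the kind of leakage that inflates validation scores.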


Explainable Models, Not Black Boxes
Create visual data science workflows that are easy to explain and easy to understand
- Each step in the data prep, modeling, and validation process is documented for complete transparency
- Visual workflow is easy to explain to others in the organization
- Supports the Local Interpretable Model-Agnostic Explanations (LIME) framework
Flexible Scoring and Deployment
Turn predictive insights into prescriptive actions
- Quickly deploy scored data to spreadsheets and data visualization tools
- Turn models into production web services with DataPI Server
- Add DataPI Real-Time Scoring for demanding high-transaction, low-latency use cases


Automation and Process Control
Build sophisticated visual workflows and automate important tasks
- Use process control operators to create workflows that repeat and loop over tasks, branch flows, and access system resources
- Supports a variety of scripting languages for custom integrations and automations
- Schedule processes
Open and Extensible
Integrate with existing applications and code
- Use existing R and Python code and libraries to extend DataPI
- Download new functionality through the DataPI Marketplace
- The DataPI Studio open core is available under an AGPL license
