Mosaic AI — Answer to full Machine Learning Lifecycle

sarvesh chand
5 min read · Jul 14, 2021


In this post, we walk through the entire machine learning (ML) lifecycle and show how to architect and build an ML use case end to end using LTI Mosaic AI. The platform gives data scientists, machine learning engineers, and developers the resources to prepare, build, train, and deploy ML models quickly and easily. For our use case, we look at the Synthetic Financial Datasets For Fraud Detection dataset on Kaggle: https://www.kaggle.com/ealaxi/paysim1

Technical Solution overview:

  • Data Collection via Connectors
  • Intelligent Assisted Wrangling
  • Notebook Offerings
  • Model Development
  • Model Deployment
  • Model Monitoring
  • Explainable AI
Mosaic AI Overview

Data Collection Process:

The platform offers connectors to various enterprise storage systems that scan and publish metadata to the catalog layer. Data from different sources can then be published to the catalog. The metadata catalog helps in identifying and locating data useful for the current project. A dataset published from the catalog is added to the project and displayed in the Dataset tab.

List of enterprise storage offerings.

For our use case, since we are working with the Kaggle dataset, we can download it and upload it to the platform.

Onboarding data into the project

Assisted Wrangling:

Working with a dataset is straightforward on Mosaic AI. More than 300 functions assist with preprocessing and feature engineering, so the user can get familiar with the data without much hassle.

Data Prep — Lens

Notebook Templates:

Mosaic AI provides a number of development environments: Jupyter, Hub, RStudio, Zeppelin and many more. Users can create their own template by choosing the tools and the instances based on the size and volume of the dataset. A template can be linked to supported repositories.

For our use case, we spun up a Jupyter notebook instance with Python 3.6 and made a few observations:

  • Fraudulent transactions tend to appear when the amount transferred from the source does not match the amount received at the destination, mostly in Cash Out transactions.
  • The destination amount shows up as 0 for Transfer transactions, but not for Cash Out transactions.
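
The two observations above can be checked with a few lines of Python. This is a minimal sketch, not the notebook code used for this post. It assumes the column names of the public PaySim CSV (`type`, `amount`, `oldbalanceDest`, `newbalanceDest`, `isFraud`), it treats "destination amount is 0" as the destination balance being 0 both before and after the transaction, and the three sample rows are made up for illustration.

```python
from collections import defaultdict

# Made-up rows that mimic the PaySim columns (assumed names); values are illustrative only.
SAMPLE_ROWS = [
    {"type": "CASH_OUT", "amount": 100.0, "oldbalanceDest": 50.0, "newbalanceDest": 150.0, "isFraud": 0},
    {"type": "CASH_OUT", "amount": 200.0, "oldbalanceDest": 10.0, "newbalanceDest": 10.0, "isFraud": 1},
    {"type": "TRANSFER", "amount": 300.0, "oldbalanceDest": 0.0, "newbalanceDest": 0.0, "isFraud": 1},
]


def amount_mismatch(row, tol=0.01):
    """True when the destination balance did not grow by the transferred amount."""
    credited = row["newbalanceDest"] - row["oldbalanceDest"]
    return abs(credited - row["amount"]) > tol


def dest_is_zero(row):
    """True when the destination balance is 0 both before and after the transaction."""
    return row["oldbalanceDest"] == 0 and row["newbalanceDest"] == 0


def summarize(rows):
    """Count transactions, frauds, mismatches and zero destinations per transaction type."""
    summary = defaultdict(lambda: {"count": 0, "fraud": 0, "mismatch": 0, "dest_zero": 0})
    for row in rows:
        stats = summary[row["type"]]
        stats["count"] += 1
        stats["fraud"] += int(row["isFraud"])
        stats["mismatch"] += int(amount_mismatch(row))
        stats["dest_zero"] += int(dest_is_zero(row))
    return dict(summary)


if __name__ == "__main__":
    for tx_type, stats in summarize(SAMPLE_ROWS).items():
        print(tx_type, stats)
```

On the real dataset, the rows would come from the downloaded CSV (for example via `csv.DictReader`, converting the numeric fields to `float`), and the per-type counts would show how strongly each check lines up with `isFraud`.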

Model Development:

Mosaic AI lets you bring your own model (BYOM) and also provides good support for developing models on the platform. The SDK can be used to connect to the dataset, create dataframes, write score functions and finally register the model.
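
This post does not show the SDK calls themselves, so the following is only a rough sketch of what a score function for this use case could look like. `RuleBasedFraudModel`, `score` and the payload fields are hypothetical names chosen for illustration and are not part of the Mosaic AI SDK; a real project would plug in a trained model and register it through the SDK.

```python
import json


class RuleBasedFraudModel:
    """Stand-in for a trained model (hypothetical); flags balance mismatches."""

    def predict(self, record):
        credited = record["newbalanceDest"] - record["oldbalanceDest"]
        risky_type = record["type"] in {"CASH_OUT", "TRANSFER"}
        return int(risky_type and abs(credited - record["amount"]) > 0.01)


def score(model, request_body):
    """Hypothetical score function: JSON request in, JSON response out."""
    payload = json.loads(request_body)
    predictions = [model.predict(record) for record in payload["records"]]
    return json.dumps({"predictions": predictions})


if __name__ == "__main__":
    body = json.dumps({"records": [
        {"type": "TRANSFER", "amount": 300.0, "oldbalanceDest": 0.0, "newbalanceDest": 0.0},
        {"type": "PAYMENT", "amount": 20.0, "oldbalanceDest": 5.0, "newbalanceDest": 25.0},
    ]})
    print(score(RuleBasedFraudModel(), body))
```

Keeping the score function as a plain "request in, response out" wrapper around the model makes it easy to test locally before registering anything.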

Model Deployment:

Once the model is registered, we can see it in the Model section. Multiple runs of the model create multiple versions, and the platform lets us compare one version against another. This is what the deployed model looks like:

Mosaic AI provides several deployment strategies such as A/B deployment, Canary, and many more. The deployed model exposes a REST endpoint that can be used for request-response calls.
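
As a rough illustration of a request-response call, the sketch below builds a JSON POST request with Python's standard library. The URL, the `records` payload shape and the bearer-token header are all assumptions made for this example; the actual endpoint, schema and authentication are whatever the platform shows for the deployed model.

```python
import json
import urllib.request

# Placeholder URL (assumption); replace with the endpoint shown for the deployed model.
ENDPOINT_URL = "https://example.com/models/fraud/score"


def build_scoring_request(url, records, token=None):
    """Build (but do not send) a JSON POST request for a scoring endpoint."""
    body = json.dumps({"records": records}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer auth is an assumption; use whatever scheme the endpoint requires.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call_endpoint(request, timeout=10):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_scoring_request(ENDPOINT_URL, [{"type": "TRANSFER", "amount": 300.0}])
    print(req.get_method(), req.full_url)
```

Separating request building from sending keeps the payload easy to inspect and test without a live endpoint; `call_endpoint` is only needed once a real URL is in place.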

Model Monitoring:

The model can be monitored for drift caused by changes in the data or in how the data is used. Each request to the model can be tracked with its status, success or failure. The platform provides logging at multiple levels to debug failures down to the pod level, along with easy tracking of CPU and memory utilization.
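
To make the idea of data drift concrete, here is a small sketch of the Population Stability Index (PSI), one common way to compare a feature's recent distribution against a baseline. It is not necessarily the measure Mosaic AI uses internally; the function name, the bin count and the smoothing constant `eps` are choices made for this example.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a baseline sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to 1.0 if the range is empty.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range fall into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Replace empty bins with a tiny share so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    exp_p, act_p = proportions(expected), proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_p, act_p))


if __name__ == "__main__":
    baseline = [float(i) for i in range(100)]
    shifted = [float(i + 50) for i in range(100)]
    print(psi(baseline, baseline), psi(baseline, shifted))
```

A PSI of 0 means the two samples fall into the bins in identical proportions. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, though the right threshold depends on the feature and the use case.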

Explainable AI (XAI):

This is the most interesting phase of the machine learning lifecycle. Mosaic AI provides this feature in three sections:

  • Overview: Here you get to know the model in terms of performance metrics, feature importance, partial dependence plots (PDPs) and more.
  • Know Your Data: This covers the possibility of data drift and the impact it could have on the model. It can be visualized as line charts, histograms, word clouds and bar charts.
  • What If: This analysis helps us explore risk profiles. For example, changing the weightage of some features could lead to a high-risk or low-risk situation.
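
The idea behind PDPs and What If analysis can be sketched in a few lines: force one feature to a series of values for every row and watch how the average prediction moves. The code below is a simplified illustration, not the platform's implementation; `toy_risk_score` is a hypothetical scoring function standing in for a deployed model.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve


def toy_risk_score(row):
    """Hypothetical scoring function standing in for a deployed model."""
    type_weight = 1.0 if row["type"] == "TRANSFER" else 0.5
    return min(1.0, row["amount"] / 1000.0) * type_weight


if __name__ == "__main__":
    rows = [{"type": "TRANSFER", "amount": 100.0}, {"type": "PAYMENT", "amount": 100.0}]
    print(partial_dependence(toy_risk_score, rows, "amount", [0.0, 500.0, 1000.0]))
```

With these two made-up rows, the average score rises as `amount` grows, which is the kind of pattern a PDP makes visible; a What If view does the same thing for a single record instead of an average.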

Apart from all the above features, Mosaic AI also supports AutoML, scheduling and workflow creation, which are out of scope for this blog.

The team at Mosaic AI is constantly working to keep up with the latest advancements in data science. Continuous improvements can be seen across the platform in the user interface, model monitoring and explainability.

Conclusion:

In this blog, we went over a fraud detection use case and how Mosaic AI helped us across the entire machine learning workflow, right from data collection to Explainable AI.

Link: https://mosaic.lntinfotech.com/mosaic-ai/

Please refer to one of the nice reads from my colleague on Explainable AI:

https://medium.com/@shivanandpawar/explainable-ai-new-jargon-in-the-town-181bac1e8bf
