Automatic model selection and parameter selection with the Trusted Analytics Platform

Proposal
Short Form
Intermediate

Excerpt

Trusted Analytics Platform (TAP) is open source software, optimized for performance and security, that accelerates the creation of cloud-native applications driven by big data analytics. This talk uses TAP to address two common questions in data science: ‘Which model best fits my data?’ and ‘How do I find the optimal parameters for my models?’
http://trustedanalytics.github.io/

Description

Trusted Analytics Platform (TAP) is open source software, optimized for performance and security, that accelerates the creation of cloud-native applications driven by big data analytics.

This talk uses TAP to address two common questions in data science: ‘Which model best fits my data?’ and ‘How do I find the optimal parameters for my models?’ Automating these steps with pipelines saves time compared with manually building, training, and testing each model before choosing the best one. The model pipelines can also be easily replicated and shared by users in the cloud.

This talk aims to demonstrate the following capabilities of TAP:
- Highly extensible, developer-friendly plugin architecture
- Ease of creation and use of model pipelines

The first part of this talk presents a mechanism for automated algorithm selection: given the data, it selects the best algorithm from a specific class of algorithms in TAP. Once the best algorithm is determined, we train the model, run prediction and evaluation on the data, and present the resulting analysis to the user, creating an end-to-end pipeline.
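The selection step can be sketched in plain Python. This is an illustration only and does not use TAP's API: the candidate algorithms (`kmeans`, `median_split`), the scoring function (`wcss`, within-cluster sum of squares), and the toy data are all assumptions chosen to keep the example self-contained.

```python
import math

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=20):
    """Lloyd's k-means with a deterministic start (the first k points)."""
    centers = list(points[:k])
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centers)

def median_split(points, k):
    """Naive baseline: cut the first coordinate into k equal-count bins."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    labels = [0] * len(points)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(points)
    return labels

def wcss(points, labels):
    """Within-cluster sum of squared distances; lower is better for fixed k."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        center = tuple(sum(x) / len(members) for x in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total

def select_algorithm(points, k, candidates):
    """Run every candidate, score each result, and return the winner."""
    results = {name: fit(points, k) for name, fit in candidates.items()}
    scores = {name: wcss(points, labels) for name, labels in results.items()}
    best = min(scores, key=scores.get)
    return best, results[best], scores

# Hypothetical toy data: one group of six points and one group of two.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
          (10, 10), (10, 11)]
candidates = {"kmeans": kmeans, "median_split": median_split}
best, labels, scores = select_algorithm(points, 2, candidates)
print(best, labels, scores)
```

The pipeline shape is the point here: a dictionary of candidates, one shared score, and a single call that returns the chosen algorithm together with its output, so further candidates can be added without touching the selection logic.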

The second part presents a mechanism for selecting the hyperparameters that best fit the data, again using an end-to-end pipeline. The use case for both demonstrations will be clustering.
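For clustering, one common hyperparameter is the number of clusters k. The sketch below shows one way such a search could look. It is not TAP's implementation: the grid of candidate values, the mean silhouette coefficient as the selection criterion, and the helper names (`kmeans`, `silhouette`, `select_k`) are assumptions made for illustration.

```python
import math

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # next center: the point farthest from all centers chosen so far
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centers)

def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def select_k(points, candidate_ks):
    """Grid search: fit once per candidate k and keep the best-scoring one."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Hypothetical toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 0), (10, 1), (11, 0),
          (0, 10), (0, 11), (1, 10)]
best_k, scores = select_k(points, [2, 3, 4])
print(best_k, scores)
```

The silhouette is used instead of within-cluster sum of squares because the latter shrinks as k grows and would always favor the largest candidate.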

http://trustedanalytics.github.io/

Speaking experience

Conducted multiple knowledge-sharing talks internally at Intel Corporation.
Conducted a webinar: https://www.brighttalk.com/webcast/10773/192209
My segment begins at 19:07.

Speaker


    Anahita Bhiwandiwalla

    Intel Corporation

    Biography

    Anahita is a Software Engineer in Intel’s Big Data Solutions group, currently working on the Trusted Analytics Platform. She holds a Master’s degree in Computer Science from Columbia University, with a specialization in Machine Learning. Her main interests are Machine Learning, Natural Language Processing, Speech Recognition, and Data Mining. She is particularly interested in applying machine learning principles to real-world applications and solving the challenges that arise as data scales.
