Surviving survival analysis with Apache Spark

Proposal
Short Form
Intermediate

Excerpt

Learn about survival analysis in Apache Spark and some of the questions it can help answer. For instance: what proportion of individuals will be affected by a phenomenon, at what rate they will be affected, and how certain events change the probability of survival.
http://trustedanalytics.github.io/

Description

Survival analysis is the modeling of time-to-event data. An event in this context can be the occurrence of any phenomenon or experience of interest, with time measured from the start of the observation period.

The Cox Proportional Hazards Model is a very popular survival analysis model, available in common data science tools such as R and SAS, that estimates relative risk rather than absolute risk. The main challenge in implementing this model in a distributed framework is devising an efficient algorithm that minimizes scans over the sorted data.
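To illustrate why the ordering of the data matters, here is a minimal single-machine sketch in plain Python. It is not the Spark implementation covered in the talk. It evaluates the Cox partial log-likelihood by sorting records by descending time and keeping a running sum over the risk set, so the sorted data is scanned only once. The function name, the record layout `(time, event, covariates)`, and the handling of tied times (all tied records share one risk set, as in the Breslow approximation) are assumptions made for this example.

```python
import math
from itertools import groupby


def cox_partial_log_likelihood(records, beta):
    """Cox partial log-likelihood for a coefficient vector `beta`.

    records: iterable of (time, event, covariates), where event is 1 if the
             event was observed and 0 if the record is censored.
             (Hypothetical layout chosen for this sketch.)
    """
    # Sort by descending time so the risk set only ever grows during the scan.
    ordered = sorted(records, key=lambda r: r[0], reverse=True)

    risk_sum = 0.0  # running sum of exp(x_j . beta) over subjects still at risk
    log_lik = 0.0

    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        scores = [sum(b * x for b, x in zip(beta, cov)) for _, _, cov in group]

        # Every record sharing this time joins the risk set before any
        # event at this time is scored.
        risk_sum += sum(math.exp(s) for s in scores)

        for (_, event, _), score in zip(group, scores):
            if event:
                log_lik += score - math.log(risk_sum)

    return log_lik
```

In a distributed setting the running sum would have to be carried across partitions of the sorted data, which is one reason an implementation that limits the number of scans is attractive.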

This talk aims to:
• Give an overview of a distributed, big-data-centric implementation of the Cox Model in Apache Spark, a fast engine for large-scale data processing
• Demonstrate how the Cox Model can be used to generate useful insights from medical data. For example, to determine whether a vaccination program is ‘better’ at treating one demographic of individuals than another, or to map the ‘effect’ of age on reversion to drug use.
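Comparisons like the ones above are usually read off a fitted Cox model as hazard ratios. Because the model assumes a hazard of the form h(t | x) = h0(t) · exp(β · x), the baseline h0(t) cancels when two individuals are compared, leaving exp(β · (x_a − x_b)). The short sketch below shows that calculation; the function name and the coefficient values in the usage note are hypothetical and are not results from the talk.

```python
import math


def hazard_ratio(beta, covariates_a, covariates_b):
    """Relative hazard of individual A versus individual B under a Cox model.

    beta: fitted coefficients (hypothetical values in this sketch).
    covariates_a, covariates_b: covariate vectors in the same order as beta.
    """
    # The baseline hazard cancels, so only the covariate difference matters.
    linear_diff = sum(
        b * (xa - xb) for b, xa, xb in zip(beta, covariates_a, covariates_b)
    )
    return math.exp(linear_diff)
```

For instance, with a made-up coefficient of `math.log(2)` on a binary indicator, `hazard_ratio([math.log(2)], [1], [0])` is 2 (up to floating-point rounding), meaning the indicated group is modeled as having twice the hazard of the reference group at every point in time.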


Speaking experience

Conducted multiple internal knowledge-sharing talks at Intel Corporation.
Conducted a webinar: https://www.brighttalk.com/webcast/10773/192209 (my part of the presentation starts at 19:07).

Speaker


    Anahita Bhiwandiwalla

    Intel Corporation

    Biography

    Anahita is a Software Engineer in Intel’s Big Data Solutions group, currently working on the Trusted Analytics Platform. She holds a Master’s degree in Computer Science from Columbia University, with a specialization in Machine Learning. Her main interests are Machine Learning, Natural Language Processing, Speech Recognition and Data Mining. She is keen on applying machine learning principles to real-world applications and on solving the challenges that arise as data scales.
