Holden Karau

348 Followers

Software Development Engineer with experience in "big data" and search.

Highlights of Achievements:
* Apache Spark Committer
* Received IBM OTAA award and Google Open Source Peer Bonus for work on Apache Spark
* Author of Fast Data Processing With Spark and co-author of Learning Spark and High Performance Spark
* Updated Linux kernel wireless drivers - recipient of Xandros Outstanding evaluation
* Proposed, developed, and implemented a source code search engine, All The Code; the tool was positively reviewed at the Ottawa and Guelph DemoCamps and on Slashdot
* Created a plt-scheme web application featured on Slashdot and other media

Skills and Proficiencies: Progra...

Presentations

A Gentle Introduction to Locality Sensitive Hashing with Apache Spark

A New Year in Data Science: ML Unpaused

Lessons from Running Large Scale Spark Workloads

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015

Why your Spark job is failing

Spark the next top compute model