  • 1. MACHINE LEARNING WITH SCALA Susan Eraly (Skymind), Andy Petrella (Data Fellas)
  • 2. MACHINE LEARNING Not Terminator! (despite our name) Applications are everywhere: OCR, Netflix recommendations.
  • 3. NEURAL NETS & DEEP LEARNING A perceptron (neuron) can be loosely compared to a NAND gate. Nonlinear functions can be constructed by combining them. With more compute and with big data…
  • 4. DEEPLEARNING4J dl4j is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. Skymind is its commercial support arm.
  • 5. SCIENTIFIC COMPUTING & THE JVM Problems when considering HPC: vectorization; array indexing (32-bit address space). Fully native backend: d4j, JavaCPP, OpenMP, CUDA.
  • 6. MICRO-SERVICES + ML? Kinda like micro-services: reduce lock-in. Take math, data cleaning, model training, choosing algorithms ... and separate them.
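  • 7. SGD: SERIAL VS. PARALLEL [Diagram. Serial: one model trained on the training data. Parallel: the training data is divided into Split 1, Split 2, Split 3, …; Worker 1, Worker 2, … Worker N each train a partial model; a master combines the partial models into a global model.]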
  • 8. MAP REDUCE VS. PARALLEL ITERATIVE [Diagram: input, then processors grouped into Superstep 1, Superstep 2, ..., then output.]
  • 9. NLP AND DL Topic modeling/sentiment analysis; machine translation; question answering. NLP is hard: “The best part of the movie is the end credits” “It should have been a great movie…”
  • 10. RECURRENT NN Loops; temporal behavior; used for time series.
  • 11. LSTMS - A widely used remedy for vanishing gradients in recurrent nets; exploding gradients usually also call for gradient clipping.
  • 12. APPLICATIONS - Sequence to Sequence (credit: Andrej Karpathy)
  • 13. WORD2VEC - Word embeddings that represent meaning/context. King - Man + Woman ~ Queen
  • 14. WORD2VEC
  • 15. APPLICATIONS - Sequence to Sequence “Second Lord: They would be ruled after this chamber, and my fair nues begun out of the fact, to be conveyed, Whose noble souls I'll have the heart of the wars. Clown: Come, sir, I will make did behold your worship. VIOLA: I'll drink it.” (credit: Andrej Karpathy)
  • 16. SENTIMENT ANALYSIS [Table with columns: Sentiment, Review]
  • 17. DATA FELLAS Spark-Notebook is the only Scala-based notebook. It is scalable, enables interactive work on Spark, Akka, Cassandra and Kafka, and can render interactive plots from any Scala type. Data Fellas enables data-driven business, bringing productivity to data science in the enterprise.
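
Slide 3's comparison of a perceptron to a NAND gate can be made concrete with a small sketch. The weights (-2, -2) and bias 3 are hand-picked for illustration, not learned, and the names PerceptronNand, perceptron, nand and xor are hypothetical. Because NAND is a universal gate, wiring several such units together yields functions a single perceptron cannot represent, such as XOR.

```scala
object PerceptronNand {
  // Step-activation perceptron: outputs 1 when the weighted sum plus bias is positive.
  def perceptron(weights: Seq[Double], bias: Double, inputs: Seq[Int]): Int = {
    val sum = weights.zip(inputs).map { case (w, x) => w * x }.sum + bias
    if (sum > 0) 1 else 0
  }

  // Assumption: hand-picked parameters that happen to reproduce NAND.
  def nand(a: Int, b: Int): Int = perceptron(Seq(-2.0, -2.0), 3.0, Seq(a, b))

  // XOR is not linearly separable, but four NAND units compose into it.
  def xor(a: Int, b: Int): Int = {
    val n = nand(a, b)
    nand(nand(a, n), nand(b, n))
  }

  def main(args: Array[String]): Unit =
    for (a <- 0 to 1; b <- 0 to 1)
      println(s"a=$a b=$b nand=${nand(a, b)} xor=${xor(a, b)}")
}
```

Running main prints the truth tables for both gates; only the input (1, 1) drives the NAND unit to 0.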
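
The worker/master layout on slide 7 can be sketched in plain Scala. The slide does not say how the master combines partial models; this sketch assumes simple parameter averaging, which is one common choice. The model (a one-dimensional linear fit), the synthetic data, and all names here (ParallelSgdSketch, Model, sgdPass, average, train) are hypothetical, and the workers run sequentially for clarity where a real system would distribute them.

```scala
object ParallelSgdSketch {
  final case class Model(w: Double, b: Double)
  type Example = (Double, Double) // (x, y)

  // Worker step: one serial SGD pass over a split, squared loss on y ~ w * x + b.
  def sgdPass(start: Model, split: Seq[Example], lr: Double): Model =
    split.foldLeft(start) { case (m, (x, y)) =>
      val err = m.w * x + m.b - y
      Model(m.w - lr * err * x, m.b - lr * err)
    }

  // Master step (assumed): average the partial models into a global model.
  def average(partials: Seq[Model]): Model =
    Model(partials.map(_.w).sum / partials.size, partials.map(_.b).sum / partials.size)

  def train(data: Seq[Example], workers: Int, rounds: Int, lr: Double): Model = {
    val splitSize = math.ceil(data.size.toDouble / workers).toInt
    val splits = data.grouped(splitSize).toVector
    (1 to rounds).foldLeft(Model(0.0, 0.0)) { (global, _) =>
      // Each worker starts from the current global model and trains on its own split.
      val partials = splits.map(split => sgdPass(global, split, lr))
      average(partials)
    }
  }

  def main(args: Array[String]): Unit = {
    // Noise-free synthetic data drawn from y = 2x + 1.
    val data = (0 until 40).map { i => val x = i / 40.0; (x, 2.0 * x + 1.0) }
    println(train(data, workers = 4, rounds = 1000, lr = 0.1))
  }
}
```

On this noise-free data the averaged model approaches w = 2, b = 1. Averaging once per pass trades communication cost against how far the partial models drift apart between synchronizations.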
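
The King - Man + Woman ~ Queen arithmetic on slide 13 can be illustrated with vector addition and cosine similarity. The three-dimensional vectors below are made up by hand purely to show the mechanics (real word2vec embeddings are learned from text and have hundreds of dimensions), and the names AnalogySketch, toy, cosine and analogy are hypothetical.

```scala
object AnalogySketch {
  type Vec = Vector[Double]

  // Assumption: toy embeddings with dimensions (royalty, male, female).
  val toy: Map[String, Vec] = Map(
    "king"   -> Vector(0.9, 0.9, 0.1),
    "queen"  -> Vector(0.9, 0.1, 0.9),
    "man"    -> Vector(0.1, 0.9, 0.1),
    "woman"  -> Vector(0.1, 0.1, 0.9),
    "prince" -> Vector(0.8, 0.9, 0.1),
    "apple"  -> Vector(0.0, 0.2, 0.2)
  )

  def add(a: Vec, b: Vec): Vec = a.zip(b).map { case (x, y) => x + y }
  def sub(a: Vec, b: Vec): Vec = a.zip(b).map { case (x, y) => x - y }

  def cosine(a: Vec, b: Vec): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = (v: Vec) => math.sqrt(v.map(x => x * x).sum)
    dot / (norm(a) * norm(b))
  }

  // Returns the word closest to vec(a) - vec(b) + vec(c), excluding the query words.
  def analogy(a: String, b: String, c: String): String = {
    val target = add(sub(toy(a), toy(b)), toy(c))
    toy.view
      .filter { case (word, _) => !Set(a, b, c).contains(word) }
      .maxBy { case (_, vec) => cosine(target, vec) }
      ._1
  }

  def main(args: Array[String]): Unit =
    println(analogy("king", "man", "woman"))
}
```

Excluding the query words from the candidates matters: the result vector usually stays close to one of its own inputs, so without the filter the nearest neighbour is often just the starting word.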
