Python is my main programming language at work. Together with SQL, it is the lingua franca of data engineering. I have always been curious to learn a functional programming language, and Scala was a natural choice.

For data engineers, Python, Scala and Java are the most common programming languages. These three dominate the main tools, components and frameworks used by data professionals, for example:

  • Hadoop (Java)
  • Pandas (Python)
  • Airflow (Python)
  • Kafka (Scala and Java)
  • Spark (Scala)
  • Pulsar (Java)

(…) It’s a long list. If you are still not convinced, take a look at the Apache Projects Directory for Big Data.

Learning Scala

Scala combines object-oriented and functional programming in one concise, high-level language. It runs on top of the JVM and interoperates seamlessly with Java. Because you can use Java libraries directly from Scala, the ecosystem is huge and really powerful.
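To make that combination a bit more concrete, here is a minimal sketch. The `Event` class, the `WeekdayTotals` object and the sample data are all made up for illustration. The only thing taken from outside Scala is `java.time.LocalDate`, a standard Java class used here as if it were a Scala one.

```scala
import java.time.LocalDate // a plain Java class, used directly from Scala

// Object-oriented side: an immutable case class with a method.
// (Event is a hypothetical record type, invented for this example.)
final case class Event(user: String, day: LocalDate, amount: Double) {
  def isWeekend: Boolean = {
    val d = day.getDayOfWeek.getValue // 1 = Monday ... 7 = Sunday
    d >= 6
  }
}

object WeekdayTotals {
  // Functional side: a pure function built from filterNot, groupBy and map.
  // It sums the amounts per user, ignoring weekend events.
  def weekdayTotals(events: List[Event]): Map[String, Double] =
    events
      .filterNot(_.isWeekend)
      .groupBy(_.user)
      .map { case (user, es) => user -> es.map(_.amount).sum }

  // Made-up sample data.
  val sample: List[Event] = List(
    Event("ana", LocalDate.of(2024, 1, 1), 10.0), // Monday
    Event("ana", LocalDate.of(2024, 1, 6), 99.0), // Saturday, filtered out
    Event("bob", LocalDate.of(2024, 1, 2), 5.0)   // Tuesday
  )
}
```

Calling `WeekdayTotals.weekdayTotals(WeekdayTotals.sample)` should give a map with `ana` at 10.0 and `bob` at 5.0, since the Saturday event is dropped.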

However, Scala has a steep learning curve, especially if you want to go deeper into functional programming.

Good learning materials about Scala are not easy to find, but I can recommend the following courses and ebooks:

  1. Scala & Functional Programming Essentials - Udemy - Daniel is an amazing teacher. He will guide you through the main concepts of functional programming, and you will be able to write your first Scala code.
  2. Learn Scala 3 The Fast Way - Ebook - This ebook is fresh (Scala 3) and great. The chapters are short, so you can learn fast, read it easily and get your hands dirty.
  3. Functional Programming, Simplified - Ebook - This ebook is by the same author as Learn Scala 3 The Fast Way. It’s worth reading if you are struggling to understand functional programming. The topic is worth studying, and the skills you learn will make you an even better programmer.

Practice

The best way to learn Scala is to write code. If you are a data engineer like me, you have a great ecosystem in which to try out what you learn. When I was first studying Scala a couple of years ago, I decided to build my own ingestion pipeline, using Scala as much as I could. It worked like a charm. I wrote dozens of Spark applications purely in Scala and deployed a Kafka system with a fake-data generator written entirely in Scala. I was able to watch the data flowing and see some results in a small data lake I created on my local computer. I had a lot of fun and learned a lot! Once again, the best way to learn a new skill is to start a personal project.
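If you want a starting point for a project like this, a fake-data generator is a nice first exercise. The sketch below is not the generator from my pipeline, only an illustration of what a first version could look like. The `Click` record, the page names and the value ranges are all invented, and it uses nothing beyond the Scala standard library (no Kafka client).

```scala
import scala.util.Random

// Hypothetical record shape for a practice pipeline.
final case class Click(userId: Int, page: String, durationMs: Int)

object FakeClicks {
  // Invented page names, just to have something to pick from.
  private val pages = Vector("/home", "/search", "/cart", "/checkout")

  // Generates n fake clicks. A fixed seed makes the output repeatable,
  // which helps when you are debugging a downstream job.
  def generate(n: Int, seed: Long): List[Click] = {
    val rng = new Random(seed)
    List.fill(n) {
      Click(
        userId = rng.nextInt(100) + 1,           // 1 to 100
        page = pages(rng.nextInt(pages.length)), // one of the pages above
        durationMs = rng.nextInt(5000)           // 0 to 4999
      )
    }
  }

  // Renders one record as a CSV line, ready to be written to a file or a topic.
  def toCsv(c: Click): String = s"${c.userId},${c.page},${c.durationMs}"
}
```

From here, `FakeClicks.generate(10, 42L).map(FakeClicks.toCsv)` gives ten lines you could write to a local file first, and later send to Kafka once the rest of the pipeline is in place.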

Happy coding!