2.2 For Programmers

For the second background, the common underlying story changes a little. You are someone who knows how to program and probably does this for a living. You are familiar with one or more languages and can easily switch between them. You’ve heard about this flashy new thing called “data science” and you want to jump on the bandwagon. You begin to learn how to do things in numpy, how to manipulate DataFrames in pandas, and how to plot with matplotlib. Or maybe you’ve learned all that in R by using the tidyverse: tibbles, data.frames, %>% (pipes), and geom_*.

Then, from someone or somewhere, you become aware of this new language called “Julia.” Why bother? You are already proficient in Python or R and you can do everything that you need. Well, let us contemplate some plausible scenarios.

Have you ever in Python or R:

1. Done something and were unable to achieve the performance that you needed? Well, in Julia, minutes in Python or R can turn into seconds [3]. We reserved Section 2.4 for displaying successful Julia use cases in both academia and industry.

2. Tried to do something outside numpy/dplyr conventions, only to discover that your code is slow and that you’ll probably have to learn dark magic [4] to make it faster? In Julia, you can write your own custom code without loss of performance.

3. Had to debug code and somehow found yourself reading Fortran or C/C++ source code with no idea what you are trying to accomplish? In Julia you only read Julia code [5]; there is no need to learn another language to make your original language fast. Needing a second language for speed is called the “two-language problem” (see Section 2.3.2). It also covers the case where “you had an interesting idea and wanted to contribute to an open source package, but gave up because almost everything is written not in Python or R but in C/C++ or Fortran” [6].

4. Wanted to use a data structure defined in another package, found that it doesn’t work, and realized that you’ll probably need to build an interface [7]? Julia allows users to easily share and reuse code from different packages. Most of Julia’s user-defined types and functions work right out of the box [8], and some users have marvelled upon discovering how their packages are being used by other libraries in ways that they could not have imagined. We have some examples in Section 2.3.3.

5. Needed better project management, with dependencies and versions tightly controlled, manageable, and reproducible? Julia has an amazing project management solution and a great package manager. Unlike traditional package managers, which install and manage a single global set of packages, Julia’s package manager is designed around “environments”: independent sets of packages that can be local to an individual project or shared between projects. Each project maintains its own independent set of package versions. We’ll talk more about how to manage your projects and packages in Appendix 8.2.
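
As a minimal sketch of what a project-local environment looks like in practice, the snippet below uses Julia’s built-in Pkg standard library. The directory name MyAnalysis and the choice of DataFrames as a dependency are assumptions made purely for illustration, and adding a package requires network access.

```julia
using Pkg

# Activate an environment local to the (hypothetical) MyAnalysis directory.
# Later Pkg operations apply to this environment, not to the global one.
Pkg.activate("MyAnalysis")

# Add a dependency. This records it in MyAnalysis/Project.toml and pins the
# exact resolved versions in MyAnalysis/Manifest.toml.
Pkg.add("DataFrames")

# List the packages tracked by the active environment.
Pkg.status()
```

Under the same assumptions, someone who receives both files could run Pkg.activate("MyAnalysis") followed by Pkg.instantiate() to install the same package versions on their own machine.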

Let’s proceed then!

[3] And sometimes milliseconds.

[4] numba, or even Rcpp or cython?

[5] No C++ or Fortran API calls.

[6] Have a look at some deep learning libraries on GitHub and you’ll be surprised that Python is only 25%-33% of the codebase.

[7] This is mostly a Python ecosystem problem; R doesn’t suffer as heavily from it, but it is not all blue skies there either.

[8] Or with little effort necessary.

CC BY-NC-SA 4.0 Jose Storopoli, Rik Huijzer and Lazaro Alonso