3.5 Julia Standard Library

Julia has a rich standard library that is available with every Julia installation. Contrary to everything that we have seen so far (e.g. types, data structures and filesystem), you must load standard library modules into your environment before you can use their functions and types.

This is done via using or import. In this book, we will load code via using:

using ModuleName

After doing this, you can access all functions and types inside ModuleName.
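As a brief sketch of how the two keywords differ (Dates and Random are used here purely as illustrative modules): using ModuleName: name brings only the listed names into scope, while import ModuleName makes only the module name available, so its functions must be qualified.

```julia
using Dates: Date, Day   # only Date and Day are brought into scope
import Random            # only the name Random; its functions need the Random. prefix

next_day = Date(2021, 1, 1) + Day(1)   # Date and Day can be used unqualified
Random.seed!(42)                       # qualified access through the module name
sample = rand(3)                       # rand itself lives in Base, so no prefix is needed
```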

3.5.1 Dates

Knowing how to handle dates and timestamps is important in data science. As we said in the Why Julia? section (Section 2), Python’s pandas uses its own datetime type to handle dates. The same is true of the lubridate package in R’s tidyverse, which also defines its own datetime type. In Julia, packages don’t need to write their own date logic, because Julia has a module for dates in its standard library called Dates.

To begin, let’s load the Dates module:

using Dates

3.5.1.1 Date and DateTime Types

The Dates standard library module has two types for working with dates:

  1. Date: representing time in days and
  2. DateTime: representing time in millisecond precision.

We can construct Date and DateTime with the default constructor by specifying integers to represent year, month, day, hours and so on:

Date(1987) # year
1987-01-01
Date(1987, 9) # year, month
1987-09-01
Date(1987, 9, 13) # year, month, day
1987-09-13
DateTime(1987, 9, 13, 21) # year, month, day, hour
1987-09-13T21:00:00
DateTime(1987, 9, 13, 21, 21) # year, month, day, hour, minute
1987-09-13T21:21:00

For the curious, September 13th 1987, 21:21 is the official time of birth of the first author, Jose.

We can also pass Period types to the default constructor. Period types represent spans of time in human terms, such as a year, a month or an hour. Julia’s Dates has the following abstract subtypes of Period:

subtypes(Period)
DatePeriod
TimePeriod

which divide into the following concrete types, and they are pretty much self-explanatory:

subtypes(DatePeriod)
Day
Month
Quarter
Week
Year
subtypes(TimePeriod)
Hour
Microsecond
Millisecond
Minute
Nanosecond
Second

So, we could alternatively construct Jose’s official time of birth as:

DateTime(Year(1987), Month(9), Day(13), Hour(21), Minute(21))
1987-09-13T21:21:00

3.5.1.2 Parsing Dates

Most of the time, we won’t be constructing Date or DateTime instances from scratch. Actually, we will probably be parsing strings as Date or DateTime types.

The Date and DateTime constructors can be fed a string and a format string. For example, the string "19870913" representing September 13th 1987 can be parsed with:

Date("19870913", "yyyymmdd")
1987-09-13

Notice that the second argument is a string representation of the format. We have the first four digits representing year y, followed by two digits for month m and finally two digits for day d.

It also works for timestamps with DateTime:

DateTime("1987-09-13T21:21:00", "yyyy-mm-ddTHH:MM:SS")
1987-09-13T21:21:00

You can find more on how to specify different date formats in the Julia Dates’ documentation. Don’t worry if you have to revisit it all the time; we do that too when working with dates and timestamps.

According to Julia Dates’ documentation, using the Date(date_string, format_string) method is fine if it’s only called a few times. If there are many similarly formatted date strings to parse, however, it is much more efficient to first create a DateFormat type, and then pass it instead of a raw format string. Then, our previous example becomes:

format = DateFormat("yyyymmdd")
Date("19870913", format)
1987-09-13

Alternatively, without loss of performance, you can use the string literal prefix dateformat"...":

Date("19870913", dateformat"yyyymmdd")
1987-09-13
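To illustrate the reuse of a single format object, here is a minimal sketch that parses several strings with one dateformat"..." literal. The input strings and the day/month/year layout are made up for this example.

```julia
using Dates

raw_dates = ["13/09/1987", "01/01/2021", "31/12/2023"]  # made-up input strings
fmt = dateformat"dd/mm/yyyy"                            # built once, reused for every string

# Parse every string with the same DateFormat object
parsed = [Date(s, fmt) for s in raw_dates]
```

The same pattern applies to DateTime with a format that includes the time fields.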

3.5.1.3 Extracting Date Information

It is easy to extract desired information from Date and DateTime objects. First, let’s create an instance of a very special date:

my_birthday = Date("1987-09-13")
1987-09-13

We can extract anything we want from my_birthday:

year(my_birthday)
1987
month(my_birthday)
9
day(my_birthday)
13

Julia’s Dates module also has compound functions that return a tuple of values:

yearmonth(my_birthday)
(1987, 9)
monthday(my_birthday)
(9, 13)
yearmonthday(my_birthday)
(1987, 9, 13)

We can also see the day of the week and other handy stuff:

dayofweek(my_birthday)
7
dayname(my_birthday)
Sunday
dayofweekofmonth(my_birthday)
2

Yep, Jose was born on the second Sunday of September.

NOTE: Here’s a handy tip to keep only weekdays from a collection of Dates instances: use a filter on dayofweek(your_date) <= 5. For business days, you can check out the BusinessDays.jl package.
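The weekday tip can be sketched as follows. The four dates are arbitrary and were picked so that they span a weekend (Friday to Monday).

```julia
using Dates

# Arbitrary sample: Friday, Saturday, Sunday and Monday
dates = [Date(2021, 1, 8), Date(2021, 1, 9), Date(2021, 1, 10), Date(2021, 1, 11)]

# dayofweek returns 1 (Monday) through 7 (Sunday), so <= 5 keeps Monday to Friday
weekdays_only = filter(d -> dayofweek(d) <= 5, dates)
```

Note that this only removes Saturdays and Sundays; holidays are not taken into account.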

3.5.1.4 Date Operations

We can perform operations on Dates instances. For example, we can add days to a Date or DateTime instance. Notice that Julia’s Dates will automatically perform the adjustments necessary for leap years, and for months with 30 or 31 days (this is known as calendrical arithmetic).

my_birthday + Day(90)
1987-12-12

We can add as many as we like:

my_birthday + Day(90) + Month(2) + Year(1)
1989-02-11
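One detail worth a short sketch: as far as we understand the Dates documentation, when several periods are chained in a single expression the larger periods are applied first, so the result can differ from adding them one at a time. The parenthesized variant below forces strictly left-to-right evaluation. The last line shows how a month is added to a day that does not exist in the target month.

```julia
using Dates

my_birthday = Date(1987, 9, 13)

# Chained: Year(1), then Month(2), then Day(90) are applied (largest first)
chained = my_birthday + Day(90) + Month(2) + Year(1)

# Parenthesized: strictly left to right, Day(90) is applied first
stepwise = ((my_birthday + Day(90)) + Month(2)) + Year(1)

# January 31st plus one month is clamped to the last day of February
clamped = Date(2021, 1, 31) + Month(1)
```

Here chained is 1989-02-11, stepwise is 1989-02-12 and clamped is 2021-02-28.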

In case you’re ever wondering: “What can I do with dates again? What is available?” then you can use methodswith to check it out. We show only the first 20 results here:

first(methodswith(Date), 20)
[1] +(t::Time, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:20
[2] +(dt::Date, t::Time) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:19
[3] +(dt::Date, y::Year) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:27
[4] +(dt::Date, z::Month) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:54
[5] +(x::Date, y::Quarter) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:73
[6] +(x::Date, y::Week) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:77
[7] +(x::Date, y::Day) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:79
[8] -(dt::Date, y::Year) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:35
[9] -(dt::Date, z::Month) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:66
[10] -(x::Date, y::Quarter) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:74
[11] -(x::Date, y::Week) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:78
[12] -(x::Date, y::Day) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/arithmetic.jl:80
[13] convert(::Type{Day}, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/conversions.jl:37
[14] convert(::Type{DateTime}, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/conversions.jl:30
[15] floor(dt::Date, p::Year) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/rounding.jl:45
[16] floor(dt::Date, p::Month) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/rounding.jl:51
[17] floor(dt::Date, p::Quarter) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/rounding.jl:61
[18] floor(dt::Date, p::Week) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/rounding.jl:66
[19] floor(dt::Date, p::Day) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/rounding.jl:73
[20] print(io::IO, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.8.5/x64/share/julia/stdlib/v1.8/Dates/src/io.jl:722

From this, we can conclude that we can also use the plus + and minus - operators. Let’s see how old Jose is, in days:

today() - my_birthday
12921 days

Subtracting two Date instances returns a Day instance. For DateTime, the result is a Millisecond instance:

DateTime(today()) - DateTime(my_birthday)
1116374400000 milliseconds
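If a plain number is needed instead of a Period, one option is Dates.value, which returns the underlying integer. This is a small sketch; reference_day is a fixed stand-in for today() chosen only so that the result is reproducible.

```julia
using Dates

my_birthday = Date(1987, 9, 13)
reference_day = Date(2023, 1, 28)   # fixed stand-in for today()

age = reference_day - my_birthday                 # a Day period
age_in_days = Dates.value(age)                    # the same quantity as a plain integer
age_in_ms = Dates.value(DateTime(reference_day) - DateTime(my_birthday))
```

With these dates, age_in_days is 12921 and age_in_ms is 1116374400000.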

3.5.1.5 Date Intervals

One nice thing about the Dates module is that we can also easily construct date and time intervals. Julia does not need to define separate interval types and operations for dates; it simply extends the functions and operations defined for the ranges that we covered in Section 3.3.6 to the Date types. This is made possible by multiple dispatch, which we already covered in Why Julia? (Section 2).

For example, suppose that you want to create a Day interval. This is easily done with the colon : operator:

Date("2021-01-01"):Day(1):Date("2021-01-07")
2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07

There is nothing special about using Day(1) as the interval; we can use any Period type as the interval. For example, using 3 days as the interval:

Date("2021-01-01"):Day(3):Date("2021-01-07")
2021-01-01
2021-01-04
2021-01-07

Or even months:

Date("2021-01-01"):Month(1):Date("2021-03-01")
2021-01-01
2021-02-01
2021-03-01

Note that the type of this interval is a StepRange with the Date and concrete Period type we used as interval inside the colon : operator:

date_interval = Date("2021-01-01"):Month(1):Date("2021-03-01")
typeof(date_interval)
StepRange{Date, Month}

We can convert this to a vector with the collect function:

collected_date_interval = collect(date_interval)
2021-01-01
2021-02-01
2021-03-01

And have all the array functionalities available, like, for example, indexing:

collected_date_interval[end]
2021-03-01

We can also broadcast date operations to our vector of Dates:

collected_date_interval .+ Day(10)
2021-01-11
2021-02-11
2021-03-11

Similarly, these examples work for DateTime types too.
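As a short sketch of the DateTime case, the range below steps through one day in 6-hour intervals and then shifts every element by 30 minutes. The specific dates and steps are arbitrary.

```julia
using Dates

# From midnight to midnight of the next day, every 6 hours (both ends included)
six_hourly = DateTime(2021, 1, 1, 0):Hour(6):DateTime(2021, 1, 2, 0)

timestamps = collect(six_hourly)       # Vector{DateTime} with 5 elements
shifted = timestamps .+ Minute(30)     # broadcast works just as for Date
```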

3.5.2 Random Numbers

Another important module in Julia’s standard library is the Random module. This module deals with random number generation. Random is a rich library and, if you’re interested, you should consult Julia’s Random documentation. We will cover only three functions: rand, randn and seed!.

To begin, we first load the Random module. Since we know exactly what we want to load, we can just as well do that explicitly:

using Random: seed!

We have two main functions that generate random numbers:

3.5.2.1 rand

By default, if you call rand without arguments it will return a Float64 in the interval \([0, 1)\), which means from 0 inclusive to 1 exclusive:

rand()

0.8061872724135412

You can modify rand arguments in several ways. For example, suppose you want more than 1 random number:

rand(3)
[0.6360871725935285, 0.8147896702070012, 0.9535709340929223]

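Or, you want to draw from a different interval. Note that passing a range makes rand pick one of the elements of that range, here 1.0, 2.0, ..., 10.0: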

rand(1.0:10.0)

3.0
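Since a range such as 1.0:10.0 only contains the values 1.0, 2.0 and so on, a common way to draw a continuous value between two bounds is to rescale the output of rand(). This is a minimal sketch with arbitrary bounds a and b.

```julia
a, b = 1.0, 10.0                # arbitrary lower and upper bounds

# rand() is in [0, 1), so this is in [a, b) up to floating-point rounding
x = a + (b - a) * rand()

# The same idea for several numbers at once, using broadcasting
xs = a .+ (b - a) .* rand(5)
```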

You can also specify a different step size inside the interval and a different type. Here we are using numbers without the dot . so Julia will interpret them as Int64 and not as Float64:

rand(2:2:20)

20

You can also mix and match arguments:

rand(2:2:20, 3)
[8, 6, 20]

It also supports a collection of elements as a tuple:

rand((42, "Julia", 3.14))

42

And also arrays:

rand([1, 2, 3])

2

Dicts:

rand(Dict(:one => 1, :two => 2))
:two => 2

For all of these rand argument options, you can specify the desired dimensions in a tuple. If you do this, the returned type will be an array. For example, here’s a 2x2 matrix of Float64 numbers drawn from the range 1.0:3.0, that is, from the values 1.0, 2.0 and 3.0:

rand(1.0:3.0, (2, 2))
2×2 Matrix{Float64}:
 2.0  1.0
 3.0  3.0

3.5.2.2 randn

randn follows the same general principle as rand, but it only returns numbers generated from the standard normal distribution. The standard normal distribution is the normal distribution with mean 0 and standard deviation 1. The default type is Float64 and it only allows for subtypes of AbstractFloat or Complex:

randn()

1.2278885924174743

We can only specify the size:

randn((2, 2))
2×2 Matrix{Float64}:
  0.0547187  -0.499309
 -1.41286     1.57851

3.5.2.3 seed!

To finish off the Random overview, let’s talk about reproducibility. Often, we want to make something replicable, meaning that we want the random number generator to generate the same sequence of random numbers every time. We can do so with the seed! function:

seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]

In some cases, calling seed! at the beginning of your script is not good enough. To avoid having rand or randn depend on a global variable, we can instead store the value returned by seed! and pass it as the first argument of either rand or randn.

my_seed = seed!(123)
Random.TaskLocalRNG()
rand(my_seed, 3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
rand(my_seed, 3)
[0.19090669902576285, 0.5256623915420473, 0.3905882754313441]
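The object returned by seed! above is displayed as Random.TaskLocalRNG(), which is the default generator of the current task. If a fully independent stream is wanted, one option (an alternative, not the approach shown above) is to create an explicit generator object such as MersenneTwister from the Random module and pass that instead. The seed 123 is arbitrary.

```julia
using Random

# Two independent generator objects created from the same (arbitrary) seed
rng_a = MersenneTwister(123)
rng_b = MersenneTwister(123)

x = rand(rng_a, 3)     # draws from rng_a only
y = rand(rng_b, 3)     # same seed, so the same three numbers as x
z = randn(rng_a, 2)    # randn accepts a generator as first argument too
```

Because each generator carries its own state, calls to rand() elsewhere in the program do not change what rng_a or rng_b produce.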

NOTE: These numbers might differ for different Julia versions. To have stable streams across Julia versions use the StableRNGs.jl package.

3.5.3 Downloads

We’ll also cover the standard library’s Downloads module. It will be really brief because we will only be covering a single function named download.

Suppose you want to download a file from the internet to your local storage. You can accomplish this with the download function. The first and only required argument is the file’s URL. You can also specify as a second argument the desired output path for the downloaded file (don’t forget the filesystem best practices!). If you don’t specify a second argument, Julia will, by default, create a temporary file at a path generated with the tempname function.

Let’s load the Downloads module:

using Downloads

For example, let’s download the Project.toml file from our JuliaDataScience GitHub repository. Note that the download function is not exported by the Downloads module, so we have to use the Module.function syntax. By default, it returns a string that holds the file path for the downloaded file:

url = "https://raw.githubusercontent.com/JuliaDataScience/JuliaDataScience/main/Project.toml"

my_file = Downloads.download(url) # a temporary file is created

/tmp/jl_14HUnAJEnA

With readlines, we can look at the first 4 lines of our downloaded file:

readlines(my_file)[1:4]
4-element Vector{String}:
 "name = \"JDS\""
 "uuid = \"6c596d62-2771-44f8-8373-3ec4b616ee9d\""
 "authors = [\"Jose Storopoli\", \"Rik Huijzer\", \"Lazaro Alonso\"]"
 ""

NOTE: For more complex HTTP interactions such as interacting with web APIs, see the HTTP.jl package.
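To illustrate the optional second argument of download mentioned earlier, here is a minimal sketch of a helper that saves a file into a chosen folder. The function name fetch_to_dir and the folder name in the usage comment are hypothetical and not part of Downloads.

```julia
using Downloads

# Hypothetical helper (not part of Downloads): save `url` inside `dir`,
# naming the file after the last segment of the URL.
function fetch_to_dir(url::AbstractString, dir::AbstractString)
    mkpath(dir)                               # create the folder if it is missing
    target = joinpath(dir, basename(url))     # e.g. dir/Project.toml
    return Downloads.download(url, target)    # returns the output path
end

# Usage (requires an internet connection):
# my_file = fetch_to_dir(url, "data")
```

Building the path with joinpath keeps the code portable across operating systems.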

3.5.4 Project Management

One last thing from Julia’s standard library for us to cover is the Pkg module. As described in Section 2.2, Julia offers a built-in package manager, with dependencies and version control tightly controlled, manageable, and replicable.

Unlike traditional package managers, which install and manage a single global set of packages, Julia’s package manager is designed around “environments”: independent sets of packages that can be local to an individual project or shared between projects. Each project maintains its own independent set of package versions.

3.5.4.1 Project.toml and Manifest.toml

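Inside every project environment there is a simple setup involving .toml files in a folder. The folder, in this context, can be perceived as a “project” folder. The project environment is derived from two .toml files:

  1. Project.toml: the higher-level description of the project environment, listing the packages that were added directly.
  2. Manifest.toml: the lower-level, machine-generated description of the project environment, listing the full set of dependencies and their versions.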

3.5.4.2 Creating Project Environments

In order to create a new project environment, you can enter the Pkg REPL mode by typing ] (right-bracket) in the Julia REPL:

julia>]

Then it becomes the Pkg REPL mode:

(@v1.8) pkg>

Here we can see that the REPL prompt changes from julia> to pkg>. There’s also additional information inside the parentheses regarding which project environment is currently active, (@v1.8). The v1.8 project environment is the default environment for your current Julia installation (which in our case is Julia version 1.8.X).

NOTE: You can see a list of available commands in the Pkg REPL mode with the help command.

Julia has separate default environments for each minor release, the Xs in the 1.X Julia version. Anything that we perform in this default environment will impact any fresh Julia session on that version. Hence, we need to create a new environment by using the activate command:

(@v1.8) pkg> activate .
  Activating project at `~/user/folder`

(folder) pkg>

This activates a project environment in the directory in which your Julia REPL is running. In my case this is located at ~/user/folder. Now we can start adding packages to our project environment with the add command in the Pkg REPL mode:

(folder) pkg> add DataFrames
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `~/user/folder/Project.toml`
  [a93c6f00] + DataFrames v1.4.3
    Updating `~/user/folder/Manifest.toml`
  [34da2185] + Compat v4.4.0
  [a8cc5b0e] + Crayons v4.1.1
  [9a962f9c] + DataAPI v1.13.0
  [a93c6f00] + DataFrames v1.4.3
  [864edb3b] + DataStructures v0.18.13
  [e2d170a0] + DataValueInterfaces v1.0.0
  [59287772] + Formatting v0.4.2
  [41ab1584] + InvertedIndices v1.1.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [b964fa9f] + LaTeXStrings v1.3.0
  [e1d29d7a] + Missings v1.0.2
  [bac558e1] + OrderedCollections v1.4.1
  [2dfb63ee] + PooledArrays v1.4.2
  [08abe8d2] + PrettyTables v2.2.1
  [189a3867] + Reexport v1.2.2
  [66db9d55] + SnoopPrecompile v1.0.1
  [a2af1166] + SortingAlgorithms v1.1.0
  [892a3eda] + StringManipulation v0.3.0
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.10.0
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [9fa8497b] + Future
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v0.5.2+0
  [4536629a] + OpenBLAS_jll v0.3.20+0
  [8e850b90] + libblastrampoline_jll v5.1.1+0

From the add output, we can see that Julia automatically creates both the Project.toml and Manifest.toml files. In the Project.toml, it adds a new package to the project environment package list. Here are the contents of the Project.toml:

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"

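This is a .toml file where:

  1. [deps]: is a TOML table (similar to a dictionary) that holds the packages added directly to the project,
  2. DataFrames: is a key in the deps table, which is the name of the package, and
  3. "a93c6f00-e57d-5684-b7b6-d8193f3e46c0": is the value for the DataFrames key, which is the package’s universally unique identifier (UUID).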
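To make this structure concrete, here is a small sketch that parses the same contents with the TOML standard library module (available in recent Julia versions). The text is embedded as a string so that the example does not depend on any file on disk.

```julia
using TOML

# The Project.toml contents shown above, embedded as a string
project_text = """
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
"""

project = TOML.parse(project_text)        # nested dictionaries
package_names = collect(keys(project["deps"]))
dataframes_uuid = project["deps"]["DataFrames"]
```

The [deps] table becomes a dictionary whose keys are package names and whose values are UUID strings.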

Let’s also take a peek into the Manifest.toml. Here we will truncate the output since it is a big machine-generated file:

# This file is machine-generated - editing it directly is not advised

julia_version = "1.8.3"
manifest_format = "2.0"
project_hash = "376d427149ea94494cc22001edd58d53c9b2bee1"

[[deps.Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"

...

[[deps.DataFrames]]
deps = ["Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Random", "Reexport", "SnoopPrecompile", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
git-tree-sha1 = "0f44494fe4271cc966ac4fea524111bef63ba86c"
uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
version = "1.4.3"

...

[[deps.libblastrampoline_jll]]
deps = ["Artifacts", "Libdl", "OpenBLAS_jll"]
uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
version = "5.1.1+0"

The three dots above (...) represent truncated output. First, the Manifest.toml presents us with a comment saying that it is machine-generated and discouraging us from editing it directly. Then, there are entries for the Julia version (julia_version), the Manifest.toml format version (manifest_format), and the project environment hash (project_hash). Finally, it proceeds with a TOML array of tables, which are the double-bracket entries ([[...]]). These entries stand for all the packages necessary to create the environment described in the Project.toml. Therefore, all of DataFrames.jl’s dependencies and its dependencies’ dependencies (and so on…) are listed here with their name, UUID, and version.

NOTE: Julia’s standard library modules do not have a version key in the Manifest.toml because their versions are already determined by the Julia version (julia_version). This is the case for the Artifacts entry in the truncated Manifest.toml output above, since it is a module in Julia’s standard library.

We can keep adding as many packages as we like with the add command. To remove a package you can use the rm command in the Pkg REPL mode:

(folder) pkg> rm DataFrames
    Updating `~/user/folder/Project.toml`
  [a93c6f00] - DataFrames v1.4.3
    Updating `~/user/folder/Manifest.toml`
  [34da2185] - Compat v4.4.0
  [a8cc5b0e] - Crayons v4.1.1
  [9a962f9c] - DataAPI v1.13.0
  [a93c6f00] - DataFrames v1.4.3
  [864edb3b] - DataStructures v0.18.13
  [e2d170a0] - DataValueInterfaces v1.0.0
  [59287772] - Formatting v0.4.2
  [41ab1584] - InvertedIndices v1.1.0
  [82899510] - IteratorInterfaceExtensions v1.0.0
  [b964fa9f] - LaTeXStrings v1.3.0
  [e1d29d7a] - Missings v1.0.2
  [bac558e1] - OrderedCollections v1.4.1
  [2dfb63ee] - PooledArrays v1.4.2
  [08abe8d2] - PrettyTables v2.2.1
  [189a3867] - Reexport v1.2.2
  [66db9d55] - SnoopPrecompile v1.0.1
  [a2af1166] - SortingAlgorithms v1.1.0
  [892a3eda] - StringManipulation v0.3.0
  [3783bdb8] - TableTraits v1.0.1
  [bd369af6] - Tables v1.10.0
  [56f22d72] - Artifacts
  [2a0f44e3] - Base64
  [ade2ca70] - Dates
  [9fa8497b] - Future
  [b77e0a4c] - InteractiveUtils
  [8f399da3] - Libdl
  [37e2e46d] - LinearAlgebra
  [56ddb016] - Logging
  [d6f4376e] - Markdown
  [de0858da] - Printf
  [3fa0cd96] - REPL
  [9a3f8284] - Random
  [ea8e919c] - SHA v0.7.0
  [9e88b42a] - Serialization
  [6462fe0b] - Sockets
  [2f01184e] - SparseArrays
  [10745b16] - Statistics
  [8dfed614] - Test
  [cf7118a7] - UUIDs
  [4ec0a83e] - Unicode
  [e66e0078] - CompilerSupportLibraries_jll v0.5.2+0
  [4536629a] - OpenBLAS_jll v0.3.20+0
  [8e850b90] - libblastrampoline_jll v5.1.1+0

NOTE: Julia’s Pkg REPL mode supports autocompletion with <TAB>. You can, for example, in the above command start typing rm DataF<TAB> and it will autocomplete to rm DataFrames.

We can see that rm DataFrames undoes add DataFrames by removing entries in both Project.toml and Manifest.toml.

3.5.4.3 Sharing Project Environments

Once you have a project environment with both the Project.toml and Manifest.toml files, you can share it with any user to have a perfectly reproducible project environment.

Now let’s cover the other end of the process. Suppose you received a Project.toml and a Manifest.toml from someone.

How would you proceed to instantiate a Julia project environment?

It is a simple process:

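  1. Put the Project.toml and Manifest.toml files into a folder.
  2. Open a Julia REPL in that folder and activate it as a project environment with ]activate . (the dot stands for the current folder; without it, activate switches back to the default environment).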
  3. Instantiate the environment with the ]instantiate command.

That’s it! Once the project environment has finished downloading and instantiating the dependencies listed in the Project.toml and Manifest.toml files, you’ll have an exact copy of the project environment sent to you.
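The steps above can also be scripted with the functional interface of the Pkg module instead of the Pkg REPL mode. This is a minimal sketch; the helper name restore_environment and the folder name in the usage comment are hypothetical.

```julia
using Pkg

# Hypothetical helper: `project_dir` is assumed to contain the received
# Project.toml and Manifest.toml files.
function restore_environment(project_dir::AbstractString)
    Pkg.activate(project_dir)   # counterpart of `]activate project_dir`
    Pkg.instantiate()           # counterpart of `]instantiate`
    return nothing
end

# Usage (downloads packages, so it needs an internet connection):
# restore_environment("path/to/received/project")
```

A script like this can be convenient when the environment has to be set up non-interactively, for example on a remote machine.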

NOTE: You can also add [compat] bounds in the Project.toml to specify which package versions your project environment is compatible with. This is an advanced-user functionality which we will not cover. Take a look at the Pkg.jl standard library module documentation on compatibility. For people new to Julia, we recommend sharing both Project.toml and Manifest.toml for a fully reproducible environment.



CC BY-NC-SA 4.0 Jose Storopoli, Rik Huijzer, Lazaro Alonso