## 3.5 Julia Standard Library

Julia has a rich standard library that is available with every Julia installation. Unlike everything that we have seen so far (e.g. types, data structures and the filesystem), standard library modules must be loaded into your environment before you can use their functions and types.

This is done via using or import. In this book, we will load code via using:

using ModuleName

After doing this, you can directly access all the functions and types that ModuleName exports. Names that are not exported remain available with the ModuleName.name syntax.
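
As a rough illustration of the difference between using and import, the sketch below loads the Dates module (covered in the next section) in three ways. The variable names are arbitrary and only serve the example.

```julia
# Option 1: bring the module and all of its exported names into scope.
using Dates
current_date = today()              # `today` is exported by Dates

# Option 2: bring only selected names into scope.
using Dates: Date, Day
next_day = Date(2021, 1, 1) + Day(1)

# Option 3: bring only the module name into scope; names must be qualified.
import Dates
week_later = Dates.Date(2021, 1, 1) + Dates.Week(1)
```

All three forms can coexist in the same session. Which one to pick is mostly a matter of how explicit you want the origin of each name to be.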

### 3.5.1 Dates

Knowing how to handle dates and timestamps is important in data science. As we said in Why Julia? (Section 2), Python’s pandas uses its own datetime type to handle dates. The same is true of the lubridate package in R’s tidyverse, which also defines its own datetime type. In Julia, packages don’t need to write their own date logic, because Julia has a module for dates in its standard library called Dates.

To begin, let’s load the Dates module:

using Dates

#### 3.5.1.1 Date and DateTime Types

The Dates standard library module has two types for working with dates:

1. Date: representing time in days and
2. DateTime: representing time in millisecond precision.

We can construct Date and DateTime with the default constructor by specifying integers that represent the year, month, day, hour and so on:

Date(1987) # year
1987-01-01
Date(1987, 9) # year, month
1987-09-01
Date(1987, 9, 13) # year, month, day
1987-09-13
DateTime(1987, 9, 13, 21) # year, month, day, hour
1987-09-13T21:00:00
DateTime(1987, 9, 13, 21, 21) # year, month, day, hour, minute
1987-09-13T21:21:00

For the curious, September 13th 1987, 21:21 is the official time of birth of the first author, Jose.

We can also pass Period types to the default constructor. Period types represent spans of time in human terms, such as years, months or hours. Julia’s Dates has the following abstract subtypes of Period:

subtypes(Period)
DatePeriod
TimePeriod

which divide into the following concrete types, and they are pretty much self-explanatory:

subtypes(DatePeriod)
Day
Month
Quarter
Week
Year
subtypes(TimePeriod)
Hour
Microsecond
Millisecond
Minute
Nanosecond
Second

So, we could alternatively construct Jose’s official time of birth as:

DateTime(Year(1987), Month(9), Day(13), Hour(21), Minute(21))
1987-09-13T21:21:00

#### 3.5.1.2 Parsing Dates

Most of the time, we won’t be constructing Date or DateTime instances from scratch. Actually, we will probably be parsing strings as Date or DateTime types.

The Date and DateTime constructors can be fed a string and a format string. For example, the string "19870913" representing September 13th 1987 can be parsed with:

Date("19870913", "yyyymmdd")
1987-09-13

Notice that the second argument is a string representation of the format. We have the first four digits representing year y, followed by two digits for month m and finally two digits for day d.

It also works for timestamps with DateTime:

DateTime("1987-09-13T21:21:00", "yyyy-mm-ddTHH:MM:SS")
1987-09-13T21:21:00

You can find more on how to specify different date formats in the Julia Dates’ documentation. Don’t worry if you have to revisit it all the time; we do that too when working with dates and timestamps.
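
To give a feel for other format strings, here is a small sketch with two made-up inputs. It assumes the default English locale for the abbreviated month name matched by u, and it uses Dates.format to go in the opposite direction, from a Date back to a string.

```julia
using Dates

# Day/month/year separated by slashes.
d1 = Date("13/09/1987", "dd/mm/yyyy")

# "u" matches an abbreviated month name (English locale by default).
d2 = Date("13 Sep 1987", "dd u yyyy")

# Dates.format turns a Date back into a string with the given format.
s = Dates.format(d1, "yyyy-mm-dd")
```

Both strings parse to the same Date, and s holds the ISO-style text "1987-09-13".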

According to Julia Dates’ documentation, using the Date(date_string, format_string) method is fine if it’s only called a few times. If there are many similarly formatted date strings to parse, however, it is much more efficient to first create a DateFormat type, and then pass it instead of a raw format string. Then, our previous example becomes:

format = DateFormat("yyyymmdd")
Date("19870913", format)
1987-09-13

Alternatively, without loss of performance, you can use the string literal prefix dateformat"...":

Date("19870913", dateformat"yyyymmdd")
1987-09-13
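
The benefit of a reusable format shows up when parsing many strings. The following sketch uses a small made-up vector of date strings and creates the format once, outside the loop.

```julia
using Dates

# Made-up sample data: several date strings sharing one layout.
date_strings = ["19870913", "20210101", "20211231"]

# Build the format once ...
fmt = dateformat"yyyymmdd"

# ... and reuse it for every string.
parsed = [Date(s, fmt) for s in date_strings]
```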

#### 3.5.1.3 Extracting Date Information

It is easy to extract desired information from Date and DateTime objects. First, let’s create an instance of a very special date:

my_birthday = Date("1987-09-13")
1987-09-13

We can extract anything we want from my_birthday:

year(my_birthday)
1987
month(my_birthday)
9
day(my_birthday)
13

Julia’s Dates module also has compound functions that return a tuple of values:

yearmonth(my_birthday)
(1987, 9)
monthday(my_birthday)
(9, 13)
yearmonthday(my_birthday)
(1987, 9, 13)

We can also see the day of the week and other handy stuff:

dayofweek(my_birthday)
7
dayname(my_birthday)
Sunday
dayofweekofmonth(my_birthday)
2

Yep, Jose was born on the second Sunday of September.

NOTE: Here’s a handy tip to keep only weekdays from a collection of Dates instances: use a filter on dayofweek(your_date) <= 5. For business days, you can check out the BusinessDays.jl package.
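
The weekday filter from the tip above can be sketched as follows. The week used here (Monday, 4 January 2021 to Sunday, 10 January 2021) is an arbitrary example.

```julia
using Dates

# One full week, Monday through Sunday (arbitrary example dates).
week = [Date(2021, 1, d) for d in 4:10]

# dayofweek returns 1 (Monday) to 7 (Sunday), so <= 5 keeps Monday to Friday.
only_weekdays = filter(d -> dayofweek(d) <= 5, week)
```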

#### 3.5.1.4 Date Operations

We can perform operations on Dates instances. For example, we can add days to a Date or DateTime instance. Notice that Julia’s Dates will automatically perform the adjustments necessary for leap years, and for months with 30 or 31 days (this is known as calendrical arithmetic).

my_birthday + Day(90)
1987-12-12

We can add as many as we like:

my_birthday + Day(90) + Month(2) + Year(1)
1989-02-11
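
A few small cases make the calendrical adjustments more concrete. The dates below are chosen only because they sit next to a month or leap-year boundary.

```julia
using Dates

# 2020 is a leap year, so 28 February is followed by 29 February.
leap_step = Date(2020, 2, 28) + Day(1)

# 2021 is not a leap year, so the next day is already in March.
plain_step = Date(2021, 2, 28) + Day(1)

# Adding a month to 31 January lands on the last valid day of February.
clamped = Date(2021, 1, 31) + Month(1)
```

Here leap_step is 2020-02-29, plain_step is 2021-03-01 and clamped is 2021-02-28.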

In case you’re ever wondering: “What can I do with dates again? What is available?” then you can use methodswith to check it out. We show only the first 20 results here:

first(methodswith(Date), 20)
[1] show(io::IO, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/io.jl:736
[2] show(io::IO, ::MIME{Symbol("text/plain")}, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/io.jl:734
[3] DateTime(dt::Date, t::Time) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/types.jl:403
[4] Day(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[5] Month(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[6] Quarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[7] Week(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[8] Year(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[9] firstdayofmonth(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:84
[10] firstdayofquarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:157
[11] firstdayofweek(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:52
[12] firstdayofyear(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:119
[13] lastdayofmonth(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:100
[14] lastdayofquarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:180
[15] lastdayofweek(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:68
[16] lastdayofyear(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:135
[17] +(dt::Date, t::Time) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:19
[18] +(dt::Date, y::Year) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:27
[19] +(dt::Date, z::Month) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:54
[20] +(x::Date, y::Quarter) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:73

From this, we can conclude that we can also use the plus + and minus - operators. Let’s see how old Jose is, in days:

today() - my_birthday
12687 days

Subtracting two Date instances returns a duration as a Day instance. For DateTime instances, the duration is a Millisecond instance:

DateTime(today()) - DateTime(my_birthday)
1096156800000 milliseconds
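
If a plain number is needed instead of a Period, Dates.value extracts the underlying integer. The sketch below uses fixed dates rather than today() so that the results do not change from day to day.

```julia
using Dates

# Difference between two Dates is a Day period.
span_days = Date(2021, 1, 1) - Date(2020, 1, 1)
n_days = Dates.value(span_days)          # plain Int64

# Difference between two DateTimes is a Millisecond period.
span_ms = DateTime(2021, 1, 2) - DateTime(2021, 1, 1)
n_ms = Dates.value(span_ms)

# Convert milliseconds to hours by hand.
as_hours = n_ms / (1000 * 60 * 60)
```

Since 2020 is a leap year, n_days is 366, and one full day corresponds to 24.0 hours.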

#### 3.5.1.5 Date Intervals

One nice thing about the Dates module is that we can also easily construct date and time intervals. Julia does not need to define separate interval types and operations for dates. It simply extends the functions and operations defined for ranges, which we covered in Section 3.3.6, to the Dates types. This is possible thanks to multiple dispatch, which we already covered in Why Julia? (Section 2).

For example, suppose that you want to create a Day interval. This is easily done with the colon : operator:

Date("2021-01-01"):Day(1):Date("2021-01-07")
2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07

There is nothing special about using Day(1) as the step; we can use any Period type. For example, using 3 days as the step:

Date("2021-01-01"):Day(3):Date("2021-01-07")
2021-01-01
2021-01-04
2021-01-07

Or even months:

Date("2021-01-01"):Month(1):Date("2021-03-01")
2021-01-01
2021-02-01
2021-03-01

Note that the type of this interval is a StepRange parameterized by Date and by the concrete Period type that we used as the step inside the colon : operator:

date_interval = Date("2021-01-01"):Month(1):Date("2021-03-01")
typeof(date_interval)
StepRange{Date, Month}

We can convert this to a vector with the collect function:

collected_date_interval = collect(date_interval)
2021-01-01
2021-02-01
2021-03-01

And have all the array functionalities available, like, for example, indexing:

collected_date_interval[end]
2021-03-01

We can also broadcast date operations to our vector of Dates:

collected_date_interval .+ Day(10)
2021-01-11
2021-02-11
2021-03-11

Similarly, these examples work for DateTime types too.
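
As a brief sketch of the DateTime case, the range below steps through one day in 6-hour increments. The start, stop and step values are arbitrary.

```julia
using Dates

# A DateTime range with an Hour step.
every_six_hours = DateTime(2021, 1, 1, 0):Hour(6):DateTime(2021, 1, 2, 0)

# Materialize the range and shift every timestamp by 30 minutes.
timestamps = collect(every_six_hours)
shifted = timestamps .+ Minute(30)
```

The range contains five timestamps (00:00, 06:00, 12:00, 18:00 and 00:00 of the next day), and its type is StepRange{DateTime, Hour}.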

### 3.5.2 Random Numbers

Another important module in Julia’s standard library is the Random module. This module deals with random number generation. Random is a rich library and, if you’re interested, you should consult Julia’s Random documentation. We will cover only three functions: rand, randn and seed!.

To begin, we first load the Random module. Since we know exactly what we want to load, we can just as well do that explicitly:

using Random: seed!

We have two main functions that generate random numbers:

• rand: samples a random element of a data structure or type.
• randn: samples a random number from a standard normal distribution (mean 0 and standard deviation 1).

#### 3.5.2.1 rand

By default, if you call rand without arguments, it will return a Float64 in the interval $$[0, 1)$$, that is, from 0 inclusive to 1 exclusive:

rand()
0.18600655050915005

You can modify rand arguments in several ways. For example, suppose you want more than 1 random number:

rand(3)
[0.6348331348677316, 0.7644789964575275, 0.6930821877797426]

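Or, you want to sample from a range of values. Note that rand picks one of the elements of the range, here 1.0, 2.0, and so on up to 10.0, not an arbitrary real number between 1.0 and 10.0: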

rand(1.0:10.0)
4.0
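
If a continuous value between two bounds is wanted instead, one common approach is to rescale the output of rand(). This is a minimal sketch; the bounds lo and hi are illustrative names and values, not part of any API.

```julia
# rand() is uniform on [0, 1); scale and shift it to cover [lo, hi).
lo, hi = 1.0, 10.0                 # example bounds, chosen for illustration

x = lo + (hi - lo) * rand()        # a single value
xs = lo .+ (hi - lo) .* rand(5)    # a vector of five values
```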

You can also specify a different step size inside the interval and a different type. Here we are using numbers without the dot . so Julia will interpret them as Int64 and not as Float64:

rand(2:2:20)
6

You can also mix and match arguments:

rand(2:2:20, 3)
[14, 12, 8]

It also supports a collection of elements as a tuple:

rand((42, "Julia", 3.14))
42

And also arrays:

rand([1, 2, 3])
2

Dicts:

rand(Dict(:one => 1, :two => 2))
:two => 2

For all of these rand argument options, you can specify the desired dimensions in a tuple. If you do this, the returned value will be an array. For example, here’s a 2x2 matrix of Float64 numbers sampled from the values 1.0, 2.0 and 3.0:

rand(1.0:3.0, (2, 2))
2×2 Matrix{Float64}:
3.0  1.0
3.0  1.0
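
Since rand can also sample from a type, a short sketch of that form may help. The types and sizes below are arbitrary choices for illustration.

```julia
# Sample a single value of a given type.
flag = rand(Bool)

# Sample a vector of three Int8 values (any Int8 is possible).
small_ints = rand(Int8, 3)

# Sample a 2x2 matrix of Float32 values in [0, 1).
single_precision = rand(Float32, 2, 2)
```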

#### 3.5.2.2 randn

randn follows the same general principle as rand, but it only returns numbers drawn from the standard normal distribution, that is, the normal distribution with mean 0 and standard deviation 1. The default type is Float64, and only subtypes of AbstractFloat or Complex are allowed:

randn()
-0.616456676630725

Unlike rand, randn does not accept a collection to sample from. Besides the optional type, we can only specify the size:

randn((2, 2))
2×2 Matrix{Float64}:
-2.72294   2.18743
0.462197  0.251399
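
To draw from a normal distribution with a different mean and standard deviation, the standard normal samples can be shifted and scaled. The sketch below uses illustrative values for mu and sigma and also shows the optional type argument.

```julia
# Standard normal samples in single precision.
single = randn(Float32, 3)

# Shift and scale to get samples with mean mu and standard deviation sigma.
mu, sigma = 10.0, 2.0              # illustrative values
draws = mu .+ sigma .* randn(1_000)
```

With 1,000 draws, the sample mean of draws should land close to mu, although the exact value changes on every run.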

#### 3.5.2.3 seed!

To finish off the Random overview, let’s talk about reproducibility. Often, we want to make something replicable, meaning that we want the random number generator to generate the same sequence of random numbers. We can do so with the seed! function:

seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]

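In some cases, calling seed! at the beginning of your script is not good enough. To make explicit which random number generator (RNG) rand or randn draws from, we can keep the RNG object that seed! returns and pass it as the first argument of either function. Note that the object returned here, Random.TaskLocalRNG(), is still the default RNG of the current task, so any other call to rand or randn also advances it. A generator that is fully independent of global state has to be constructed separately, for example with Random.Xoshiro(123) in Julia 1.7 or later.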

my_seed = seed!(123)
Random.TaskLocalRNG()
rand(my_seed, 3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
rand(my_seed, 3)
[0.19090669902576285, 0.5256623915420473, 0.3905882754313441]

NOTE: These numbers might differ between Julia versions. To have stable streams across Julia versions, use the StableRNGs.jl package.
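
A generator that does not share state with the default RNG can be created directly. The sketch below assumes Julia 1.7 or later, where Random.Xoshiro is available; the seed 123 and the variable names are arbitrary.

```julia
using Random: Xoshiro

# Two independent generators started from the same seed.
rng_a = Xoshiro(123)
rng_b = Xoshiro(123)

a = rand(rng_a, 3)
b = rand(rng_b, 3)     # same seed, same stream: identical to `a`

rand()                 # uses the default RNG; rng_a and rng_b are untouched

c = rand(rng_a, 3)     # continues the stream of rng_a
d = rand(rng_b, 3)     # continues the stream of rng_b: identical to `c`
```

Because each generator carries its own state, unrelated calls to rand() elsewhere in the program do not change the numbers it produces.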

### 3.5.3 Downloads

One last thing from Julia’s standard library for us to cover is the Downloads module. It will be really brief because we will only be covering a single function named download.

Suppose you want to download a file from the internet to your local storage. You can accomplish this with the download function. The first and only required argument is the file’s URL. As a second argument, you can also specify the desired output path for the downloaded file (don’t forget the filesystem best practices!). If you don’t specify a second argument, Julia will, by default, download to a temporary file and return its path.

Let’s load the Downloads module:

using Downloads

For example, let’s download the Project.toml file from our JuliaDataScience GitHub repository. Note that the download function is not exported by the Downloads module, so we have to use the Module.function syntax. By default, it returns a string that holds the file path for the downloaded file:

url = "https://raw.githubusercontent.com/JuliaDataScience/JuliaDataScience/main/Project.toml"

my_file = Downloads.download(url) # no output path given, so a temporary file is created
/tmp/jl_DfJiF3

With readlines, we can look at the first 4 lines of our downloaded file:

readlines(my_file)[1:4]
4-element Vector{String}:
"name = \"JDS\""
"uuid = \"6c596d62-2771-44f8-8373-3ec4b616ee9d\""
"authors = [\"Jose Storopoli\", \"Rik Huijzer\", \"Lazaro Alonso\"]"
"version = \"0.1.0\""

NOTE: For more complex HTTP interactions such as interacting with web APIs, see the HTTP.jl package.
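
When the file should end up at a known location instead of a temporary path, the output path can be passed as the second argument. The sketch below wraps this in two hypothetical helpers, target_path and download_to_dir, which are not part of Downloads; the actual download call is left commented out because it needs network access.

```julia
using Downloads

# Hypothetical helper: build the destination path from the last URL segment.
target_path(url, dir) = joinpath(dir, basename(url))

# Hypothetical helper: download `url` into `dir` and return the saved file's path.
function download_to_dir(url, dir)
    mkpath(dir)    # create the directory if it is missing
    return Downloads.download(url, target_path(url, dir))
end

# Usage (needs network access, so it is left commented out):
# url = "https://raw.githubusercontent.com/JuliaDataScience/JuliaDataScience/main/Project.toml"
# saved = download_to_dir(url, mktempdir())
```

Building the path with joinpath rather than string concatenation keeps the example portable across operating systems.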

CC BY-NC-SA 4.0 Jose Storopoli, Rik Huijzer, Lazaro Alonso