Julia has a rich standard library that is available with every Julia installation. Unlike everything that we have seen so far (types, data structures and the filesystem), standard library modules must be loaded into your environment before you can use their functions and types.
This is done via using or import. In this book, we will load code via using:
using ModuleName
After doing this, you can directly access all the functions and types that ModuleName exports. Names that are not exported remain reachable with the ModuleName.name syntax.
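To make the difference between the two loading styles concrete, here is a minimal sketch. It uses the Dates standard library module purely as an illustration; the variable names d and y are ours.

```julia
# `import` brings only the module name into scope, so names must be qualified.
import Dates
d = Dates.Date(1987, 9, 13)

# `using Module: name` brings just the listed names into scope.
using Dates: year
y = year(d)   # same as Dates.year(d)
```

A plain using Dates would additionally make every exported name, such as Date and year, available without the Dates. prefix.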
Knowing how to handle dates and timestamps is important in data science. As we said in Why Julia? (Section 2), Python’s pandas uses its own datetime type to handle dates. The same is true of the R tidyverse’s lubridate package, which also defines its own datetime type. In Julia, packages don’t need to write their own date logic, because Julia has a dates module in its standard library called Dates.
To begin, let’s load the Dates module:
using Dates
Date and DateTime Types
The Dates standard library module has two types for working with dates:
Date: representing time in days, and
DateTime: representing time in millisecond precision.
We can construct Date and DateTime with the default constructor by specifying integers for the year, month, day, hour and so on:
Date(1987) # year
1987-01-01
Date(1987, 9) # year, month
1987-09-01
Date(1987, 9, 13) # year, month, day
1987-09-13
DateTime(1987, 9, 13, 21) # year, month, day, hour
1987-09-13T21:00:00
DateTime(1987, 9, 13, 21, 21) # year, month, day, hour, minute
1987-09-13T21:21:00
For the curious, September 13th 1987, 21:21 is the official time of birth of the first author, Jose.
We can also pass Period types to the default constructor. Period types are the human-equivalent representation of time for the computer. Julia’s Dates module has the following abstract Period subtypes:
subtypes(Period)
DatePeriod
TimePeriod
These divide into the following concrete types, which are pretty much self-explanatory:
subtypes(DatePeriod)
Day
Month
Quarter
Week
Year
subtypes(TimePeriod)
Hour
Microsecond
Millisecond
Minute
Nanosecond
Second
So, we could alternatively construct Jose’s official time of birth as:
DateTime(Year(1987), Month(9), Day(13), Hour(21), Minute(21))
1987-09-13T21:21:00
Most of the time, we won’t be constructing Date or DateTime instances from scratch. More likely, we will be parsing strings as Date or DateTime types.
The Date and DateTime constructors can be fed a string and a format string. For example, the string "19870913" representing September 13th 1987 can be parsed with:
Date("19870913", "yyyymmdd")
1987-09-13
Notice that the second argument is a string representation of the format. The first four digits represent the year (y), followed by two digits for the month (m) and finally two digits for the day (d).
It also works for timestamps with DateTime:
DateTime("1987-09-13T21:21:00", "yyyy-mm-ddTHH:MM:SS")
1987-09-13T21:21:00
You can find more on how to specify different date formats in the Julia Dates documentation. Don’t worry if you have to revisit it all the time; we do that ourselves when working with dates and timestamps.
According to the Julia Dates documentation, using the Date(date_string, format_string) method is fine if it’s only called a few times. If there are many similarly formatted date strings to parse, however, it is much more efficient to first create a DateFormat object and then pass it instead of a raw format string. Then, our previous example becomes:
format = DateFormat("yyyymmdd")
Date("19870913", format)
1987-09-13
Alternatively, without loss of performance, you can use the string literal prefix dateformat"...":
Date("19870913", dateformat"yyyymmdd")
1987-09-13
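As a minimal sketch of the “many similarly formatted strings” case, the format can be built once and reused for every element. The input vector date_strings below is a made-up example, not data from the book.

```julia
using Dates

# Hypothetical input: several date strings sharing one format.
date_strings = ["19870913", "20210101", "20211231"]

# Build the format once and reuse it for every string.
fmt = dateformat"yyyymmdd"
parsed = [Date(s, fmt) for s in date_strings]
```

The comprehension returns a Vector{Date}, so everything shown for single Date values applies to each element.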
It is easy to extract the desired information from Date and DateTime objects. First, let’s create an instance of a very special date:
my_birthday = Date("1987-09-13")
1987-09-13
We can extract anything we want from my_birthday
:
year(my_birthday)
1987
month(my_birthday)
9
day(my_birthday)
13
Julia’s Dates
module also has compound functions that return a tuple of values:
yearmonth(my_birthday)
(1987, 9)
monthday(my_birthday)
(9, 13)
yearmonthday(my_birthday)
(1987, 9, 13)
We can also see the day of the week and other handy stuff:
dayofweek(my_birthday)
7
dayname(my_birthday)
Sunday
dayofweekofmonth(my_birthday)
2
Yep, Jose was born on the second Sunday of September.
NOTE: Here’s a handy tip to keep only the weekdays from a collection of Dates instances. Just use a filter on dayofweek(your_date) <= 5. For business days, you can check out the BusinessDays.jl package.
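The weekday tip can be sketched as follows, assuming we start from a vector holding the first seven days of January 2021 (the names week and weekdays are ours):

```julia
using Dates

# First week of January 2021, one entry per day.
week = collect(Date("2021-01-01"):Day(1):Date("2021-01-07"))

# Keep Monday (1) through Friday (5); drop Saturday (6) and Sunday (7).
weekdays = filter(d -> dayofweek(d) <= 5, week)
```

Since January 1st 2021 was a Friday, the result drops January 2nd and 3rd and keeps the remaining five dates.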
We can perform operations on Dates instances. For example, we can add days to a Date or DateTime instance. Notice that Julia’s Dates will automatically perform the adjustments necessary for leap years, and for months with 30 or 31 days (this is known as calendrical arithmetic).
my_birthday + Day(90)
1987-12-12
We can add as many as we like:
my_birthday + Day(90) + Month(2) + Year(1)
1989-02-11
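One consequence of calendrical arithmetic is that the order in which periods are applied can matter, because a date that lands on a non-existent day is adjusted down to the last valid day of the month. A small sketch, with explicit parentheses to fix the order (the starting date is an arbitrary illustration):

```julia
using Dates

start = Date(2014, 1, 29)

# Day first: Jan 30, then one month later would be "Feb 30", adjusted to Feb 28.
day_then_month = (start + Day(1)) + Month(1)

# Month first: "Feb 29" is adjusted to Feb 28 (2014 is not a leap year), then one more day.
month_then_day = (start + Month(1)) + Day(1)
```

Here day_then_month is 2014-02-28 while month_then_day is 2014-03-01, so it is worth adding parentheses whenever the order of the steps is important.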
In case you’re ever wondering, “What can I do with dates again? What is available?”, you can use methodswith to check it out. We show only the first 20 results here:
first(methodswith(Date), 20)
[1] show(io::IO, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/io.jl:736
[2] show(io::IO, ::MIME{Symbol("text/plain")}, dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/io.jl:734
[3] DateTime(dt::Date, t::Time) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/types.jl:403
[4] Day(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[5] Month(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[6] Quarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[7] Week(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[8] Year(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/periods.jl:36
[9] firstdayofmonth(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:84
[10] firstdayofquarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:157
[11] firstdayofweek(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:52
[12] firstdayofyear(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:119
[13] lastdayofmonth(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:100
[14] lastdayofquarter(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:180
[15] lastdayofweek(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:68
[16] lastdayofyear(dt::Date) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/adjusters.jl:135
[17] +(dt::Date, t::Time) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:19
[18] +(dt::Date, y::Year) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:27
[19] +(dt::Date, z::Month) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:54
[20] +(x::Date, y::Quarter) in Dates at /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Dates/src/arithmetic.jl:73
From this, we can conclude that we can also use the plus + and minus - operators. Let’s see how old Jose is, in days:
today() - my_birthday
12687 days
Subtracting two Date values returns the duration as a Day instance. For DateTime values, the duration is a Millisecond instance:
DateTime(today()) - DateTime(my_birthday)
1096156800000 milliseconds
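If you need the plain number behind such a duration, Dates.value returns the underlying integer. The sketch below uses fixed dates of our own choosing so that the result does not depend on today():

```julia
using Dates

# Fixed dates so the result does not depend on today().
elapsed_days = Date(2021, 1, 10) - Date(2021, 1, 1)          # a Day period
elapsed_ms = DateTime(2021, 1, 10) - DateTime(2021, 1, 1)    # a Millisecond period

# Dates.value strips the unit and returns the underlying integer.
n_days = Dates.value(elapsed_days)
n_ms = Dates.value(elapsed_ms)
```

Here n_days is 9 and n_ms is the same span expressed in milliseconds, that is, 9 * 24 * 60 * 60 * 1000.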
One nice thing about the Dates module is that we can also easily construct date and time intervals. Julia does not need to define a whole new set of interval types and operations for dates on top of the ranges that we covered in Section 3.3.6. It just extends the functions and operations defined for ranges to Date types. This is made possible by multiple dispatch, which we already covered in Why Julia? (Section 2).
For example, suppose that you want to create a Day interval. This is easily done with the colon : operator:
Date("2021-01-01"):Day(1):Date("2021-01-07")
2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07
There is nothing special about using Day(1) as the step; we can use any Period type as the step. For example, using 3 days as the step:
Date("2021-01-01"):Day(3):Date("2021-01-07")
2021-01-01
2021-01-04
2021-01-07
Or even months:
Date("2021-01-01"):Month(1):Date("2021-03-01")
2021-01-01
2021-02-01
2021-03-01
Note that the type of this interval is a StepRange parameterized by Date and by the concrete Period type that we used as the step inside the colon : operator:
date_interval = Date("2021-01-01"):Month(1):Date("2021-03-01")
typeof(date_interval)
StepRange{Date, Month}
We can convert this to a vector with the collect
function:
collected_date_interval = collect(date_interval)
2021-01-01
2021-02-01
2021-03-01
And then we have all the array functionality available, such as indexing:
collected_date_interval[end]
2021-03-01
We can also broadcast date operations to our vector of Dates:
collected_date_interval .+ Day(10)
2021-01-11
2021-02-11
2021-03-11
Similarly, these examples work for DateTime
types too.
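As a quick sketch of the DateTime case, here is a range covering one day in 6-hour steps, followed by a broadcast operation. The dates and variable names are arbitrary illustrations.

```julia
using Dates

# One full day in 6-hour steps, both endpoints included.
hours = DateTime(2021, 1, 1):Hour(6):DateTime(2021, 1, 2)
timestamps = collect(hours)

# Broadcasting works as it does for Date vectors.
shifted = timestamps .+ Minute(30)
```

The range has five elements (midnight, 06:00, 12:00, 18:00 and midnight of the next day), and shifted moves each of them forward by 30 minutes.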
Another important module in Julia’s standard library is the Random module. This module deals with random number generation. Random is a rich library and, if you’re interested, you should consult Julia’s Random documentation. We will cover only three functions: rand, randn and seed!.
To begin, we first load the Random module. Since rand and randn are already available without loading anything, the only name we need from Random is seed!, so we can just as well load it explicitly:
using Random: seed!
We have two main functions that generate random numbers:
rand: samples a random element of a data structure or type.
randn: samples a random number from a standard normal distribution (mean 0 and standard deviation 1).
rand
By default, if you call rand without arguments, it will return a Float64 in the interval \([0, 1)\), that is, from 0 inclusive to 1 exclusive:
rand()
0.18600655050915005
You can modify rand’s arguments in several ways. For example, suppose you want more than one random number:
rand(3)
[0.6348331348677316, 0.7644789964575275, 0.6930821877797426]
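Or you want to sample from a different range. Note that 1.0:10.0 has a default step of 1.0, so this draws one of the ten values 1.0, 2.0, and so on up to 10.0, not an arbitrary real number between 1 and 10: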
rand(1.0:10.0)
4.0
You can also specify a different step size inside the range and a different type. Here we are using numbers without the dot . so Julia will interpret them as Int64 and not as Float64:
rand(2:2:20)
6
You can also mix and match arguments:
rand(2:2:20, 3)
[14, 12, 8]
It also supports a collection of elements as a tuple:
rand((42, "Julia", 3.14))
42
And also arrays:
rand([1, 2, 3])
2
And Dicts:
rand(Dict(:one => 1, :two => 2))
:two => 2
For all of rand’s argument options, you can specify the desired dimensions in a tuple. If you do this, the returned type will be an array. For example, here’s a 2x2 matrix of Float64 numbers drawn from the values 1.0, 2.0 and 3.0:
rand(1.0:3.0, (2, 2))
2×2 Matrix{Float64}:
3.0 1.0
3.0 1.0
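The examples above sample from collections and ranges. As a complementary sketch, rand can also sample from a type, and a continuous draw from another interval can be obtained by rescaling the default \([0, 1)\) sample. The variable names and the target interval are our own illustration.

```julia
# rand is available without loading any module.

# Sample from a type: every value of the type is a candidate.
flags = rand(Bool, 3)      # Vector{Bool} of length 3
small = rand(Int8)         # any Int8 from -128 to 127

# A continuous draw between 1 and 10: rescale the default [0, 1) sample.
x = 1.0 + 9.0 * rand()
```

This contrasts with rand(1.0:10.0), which can only return one of the ten values contained in the range.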
randn
randn follows the same general principle as rand, but it only returns numbers drawn from the standard normal distribution, that is, the normal distribution with mean 0 and standard deviation 1. The default type is Float64 and it only allows for subtypes of AbstractFloat or Complex:
randn()
-0.616456676630725
Besides the type, we can only specify the size:
randn((2, 2))
2×2 Matrix{Float64}:
-2.72294 2.18743
0.462197 0.251399
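Since randn always samples from the standard normal distribution, a common way to obtain draws with another mean and standard deviation is to shift and scale the result. A minimal sketch, where mu and sigma are hypothetical target values chosen for illustration:

```julia
# Hypothetical target parameters for illustration.
mu = 10.0      # desired mean
sigma = 2.0    # desired standard deviation

# If z follows Normal(0, 1), then mu + sigma * z follows Normal(mu, sigma).
samples = mu .+ sigma .* randn(1_000)

# randn also accepts a floating-point type as its first argument.
z32 = randn(Float32, 2, 2)
```

For anything beyond this, such as other distribution families, a dedicated package is usually a better fit than hand-written transformations.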
seed!
To finish off the Random overview, let’s talk about reproducibility. Often, we want to make something replicable, meaning that we want the random number generator to generate the same sequence of random numbers. We can do so with the seed! function:
seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
seed!(123)
rand(3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
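In some cases, calling seed! at the beginning of your script is not good enough. seed! returns the random number generator (RNG) object that it has just seeded, and both rand and randn accept such an object as their first argument, which makes explicit which generator a call draws from. Keep in mind that the object returned here is still Julia’s default task-local generator. For a stream that is fully independent of that global state, construct a dedicated generator, such as MersenneTwister(123) from the Random module, and pass it instead.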
my_seed = seed!(123)
Random.TaskLocalRNG()
rand(my_seed, 3)
[0.521213795535383, 0.5868067574533484, 0.8908786980927811]
rand(my_seed, 3)
[0.19090669902576285, 0.5256623915420473, 0.3905882754313441]
NOTE: These numbers might differ between Julia versions. To have stable streams across Julia versions, use the StableRNGs.jl package.
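To sketch what a generator that is independent of the default one looks like, the example below builds two MersenneTwister instances from the same seed. The choice of MersenneTwister and the variable names are ours; other generator types from Random can be used in the same way.

```julia
using Random

# Two independent generators built from the same seed.
rng_a = MersenneTwister(123)
rng_b = MersenneTwister(123)

draws_a = rand(rng_a, 3)
draws_b = rand(rng_b, 3)

# Drawing from the default generator in between does not disturb rng_a.
rand()
next_a = rand(rng_a, 3)
next_b = rand(rng_b, 3)
```

Both generators produce identical streams, so draws_a == draws_b and next_a == next_b hold regardless of how often the default generator is used elsewhere in the program.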
One last thing from Julia’s standard library for us to cover is the Downloads module. It will be really brief because we will only be covering a single function named download.
Suppose you want to download a file from the internet to your local storage. You can accomplish this with the download function. The first and only required argument is the file’s URL. You can also specify as a second argument the desired output path for the downloaded file (don’t forget the filesystem best practices!). If you don’t specify a second argument, Julia will, by default, save the download to a temporary file whose path is generated with the tempname function.
Let’s load the Downloads
module:
using Downloads
For example, let’s download the Project.toml file from our JuliaDataScience GitHub repository. Note that the download function is not exported by the Downloads module, so we have to use the Module.function syntax. By default, it returns a string that holds the file path for the downloaded file:
url = "https://raw.githubusercontent.com/JuliaDataScience/JuliaDataScience/main/Project.toml"
my_file = Downloads.download(url) # saved to a temporary file
/tmp/jl_DfJiF3
With readlines, we can look at the first 4 lines of our downloaded file:
readlines(my_file)[1:4]
4-element Vector{String}:
"name = \"JDS\""
"uuid = \"6c596d62-2771-44f8-8373-3ec4b616ee9d\""
"authors = [\"Jose Storopoli\", \"Rik Huijzer\", \"Lazaro Alonso\"]"
"version = \"0.1.0\""
NOTE: For more complex HTTP interactions such as interacting with web APIs, see the HTTP.jl package.
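The example above relies on a temporary file. As a sketch of the two-argument form, the hypothetical helper download_to below (not part of Downloads) builds the output path with joinpath and passes it as the second argument:

```julia
using Downloads

# Hypothetical helper: download `url` into directory `dir` under `filename`
# and return the path of the saved file.
function download_to(url::AbstractString, dir::AbstractString, filename::AbstractString)
    mkpath(dir)                          # create the directory if needed
    output = joinpath(dir, filename)     # build the path portably
    return Downloads.download(url, output)
end

# Usage (needs network access), for example:
# download_to(url, mktempdir(), "Project.toml")
```

Building the path with joinpath, instead of concatenating strings by hand, keeps the code working across operating systems.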