Rust - Generics

Generic programming enables programmers to create general algorithms that can work with any type. It eliminates code duplication and ensures type safety, allowing us to write more concise and clean code. In other languages, this approach is known as parametric polymorphism or templates.

Rust recognizes two types of generic code:

  1. Generics at compile time, similar to C++ templates.
  2. Run-time generics are similar to virtual functions in C++ and Java generics.

This article will discuss compile-time generics (also known as generics) and the following questions:

  • Why do we employ generics?
  • In Rust, how do you use generic types?
  • How do I use generics to define custom types, functions, and methods?
  • How can I limit generics with traits?

What exactly are generics in Rust?

Let's start with an example of creating and printing an integer vector:

let mut int_vector = Vec::new();

int_vector.push(1);
int_vector.push(2);
int_vector.push(3);

for i in &int_vector {
    println!("{}", i);
}
// Prints:
// 1
// 2
// 3

To create a new empty vector, we use the Vec::new function, then add multiple elements with push and print all of them.

Rust is a statically typed language, which means that at compile time, the compiler should know the types of all variables. The compiler recognizes int vector as a vector of integers after we insert the first integer, and it cannot store any other type. When we try to push 4.0, the code fails to compile:

error[E0308]: mismatched types
   |
   | int_vector.push(4.0);
   | ---- ^^^ expected integer, found floating-point number
   | |
   | arguments to this function are incorrect

Using strings, we can create a similar programme:

let mut str_vector = Vec::new();

str_vector.push("one");
str_vector.push("two");
str_vector.push("three");

for i in &str_vector {
    println!("{}", i);
}
// Prints:
// one
// two
// three

When we try to push 4 this time, the code fails to compile:

error[E0308]: mismatched types
   |
   | str_vector.push(4);
   | ---- ^^^ expected `&str`, found integer
   | |
   | arguments to this function are incorrect

Apart from the name and elements, both code snippets appear to be very similar. We created and used two vectors with Vec::new: one for integers and another for strings. Because vectors are implemented using generics, this is possible. When we use new, we create a new empty Vec<T> that can contain any type.

The <T> syntax

When declaring generic types, we use the <T> syntax. The type parameter is denoted by T, which is written between the angle brackets. T can represent any data type, which means there is no way to know what T is at runtime - we can't pattern match on it, we can't perform type-specific operations on it, and we can't have different code paths for different Ts.

Any letter can be used to represent a type parameter, but T is the most commonly used.

Based on the value and how we use it, the compiler can frequently infer the type we want to use. When we insert a value of a specific type into a vector, the compiler determines the type of elements we want to store and infers the vector type. When we filled int vector with i32 values, Rust inferred that its type is Vec<i32>.

The code does not compile if we do not insert any elements into the vector and do not provide any hints about the vector's elements:

let mut int_vector = Vec::new();

for i in &int_vector {
    println!("{}", i);
}

error[E0282]: type annotations needed for `Vec<T>`
   |
   | let mut int_vector = Vec::new();
   | ^^^^^^^^^^^^^^
   |
help: consider giving `int_vector` an explicit type
    , where the type for type parameter `T` is specified
   |
   | let mut int_vector: Vec<T> = Vec::new();
   | ++++++++

Rust has no idea what kind of elements we intend to store because we are no longer inserting values into the vector. The error indicates that we should add an explicit type. We can tell the compiler with a type annotation that the Vec<T> in int vector will hold elements of the i32 type:

// We specify the type within angle brackets
let mut int_vector: Vec<i32> = Vec::new();

We cannot leave a vector without a type. We must tell the compiler whether we will use a concrete type explicitly (via type annotations) or implicitly (by using specific types).

Consider another example, a HashMap collection, which has two generic types: key and value:

use std::collections::HashMap;

let mut score_map: HashMap<&str, i32> = HashMap::new();
score_map.insert("Jamie", 16);
score_map.insert("Ashley", 71);

let mut codes_map: HashMap<u32, &str> = HashMap::new();
codes_map.insert(282, "type annotation error");
codes_map.insert(308, "mismatched types error");

let mut flags_map: HashMap<&str, bool> = HashMap::new();
flags_map.insert("required", true);

A hash map can be created with any combination of keys and values. We don't need to create a collection for each possible type (e.g., StringIntMap, StringBoolMap, and so on); instead, we can use a single collection and specify the types we want.

Multiple generic parameters can be defined by separating them with commas: 

fn take_three<T, U, V>(t: T, u: U, v: V) {}

It is customary to name generic types beginning with the letter T and progressing alphabetically. The names of keys and values, for example, can be more meaningful at times:

HashMap<K, V>

How do generics work?

In some ways, when we define a generic collection (for example, Vec<T>), we are defining an infinite set of collections (for example, Vec<i32>, Vec<i64>, and Vec<&str>).

When Rust compiles generic code, it uses a technique known as mesomorphication. We write generic code that works for any arbitrary type T, and Rust generates specific uses of the code at compile time by filling in the concrete types. It will generate code for Vec<i32> and Vec<&str>, for example, but the resulting code will differ due to how each type works underneath.

We can use generics to write code for multiple data types rather than rewriting the same code for each type, making the code more straightforward and error-prone.

Rust can use generics in a variety of places:

  • Function and method definitions.
  • Struct and enum definitions.
  • Impl blocks.
  • Trait definitions.
  • Type aliases.

Generic definitions

Starting with generic functions, we will go over generics in various Rust definitions.

Generic functions

Let's experiment with some vectors. Assume we want to write a function that reverses elements of any vector, including the int vector and str vector functions we've already seen. We can create specialized functions for various types of vectors, such as int_reverse and str reverse:

fn int_reverse(vector: &mut Vec<int>) {...}


fn str_reverse(vector: &mut Vec<&str>) {...}

Because the actual type of the elements is irrelevant when reversing a vector, these functions would have the same implementation and only differ in the types of their parameters. This is where generic functions come into play: we can write a single generic reverse function that is applicable to any type of vector element.

// [1][2] [3]
fn reverse<T>(vector: &mut Vec<T>) {
    let mut new_vector = Vec::new();

    while let Some(last) = vector.pop() {
        new_vector.push(last);
    }

    *vector = new_vector;
}

// [1]: The generic definition follows the function name.
// [2]: The function is generic over the type `T`.
// [3]: The function takes a reference to a generic vector as an argument.

The while loop is used by the reverse function to iterate over the elements of a vector. Because pop removes and returns the last element of a vector, we get the elements in reverse order.

This generic function can be applied to both of our vectors, and any other vector:

let mut int_vector: Vec<i32> = vec![1, 2, 3];
reverse(&mut int_vector);
println!("After: {:?}", int_vector);
// Prints:
// After: [3, 2, 1]

let mut str_vector: Vec<&str> = vec!["one", "two", "three"];
reverse(&mut str_vector);
println!("After: {:?}", str_vector);
// Prints:
// After: ["three", "two", "one"]

let mut bool_vector: Vec<bool> = vec![true, false, false, false];
reverse(&mut bool_vector);
println!("After: {:?}", bool_vector);
// Prints:
// After: [false, false, false, true]

This time, we'll use the vec! macro to generate a vector with initial values and pretty-print it with {:?} The reverse works, but don't try it in production; instead, use the built-in vector.reverse function.

To summarize, with generics, we write one function and delegate to the compiler the task of writing repetitive code for all the different types.

Generic enums

Generics can also be used in other definitions. Option<T>, for example, is a generic enum that is frequently encountered in Rust:

// [1][2]
pub enum Option<T> {
  None,
// [3]
  Some(T),
}

// [1]: The generic definition follows the enum name.
// [2]: The enum is generic over the type `T`.
// [3]: The `Some` takes an argument of type `T`.

Because it’s generic, we can use Option to wrap any type we like:

let o1: Option<i32> = Some(1);
let o2: Option<&str> = Some("two");
let o3: Option<bool> = Some(true);
let o4: Option<f64> = Some(4.0);
...

For example, we can use it to wrap integers (Option<i32>) and implement a safe division function:

fn safe_division(a: i32, b: i32) -> Option<i32> {
    match b {
        0 => None,
        _ => Some(a / b),
    }
}

println!("{:?}", safe_division(4, 2));
// Prints:
// Some(2)

Or we can use Option to safely access the elements of the str_vector:

let str_vector = vec!["one", "two", "three"];

let head = str_vector.get(0);
println!("{:?}", head);
// Prints:
// Some("one")

Note that the get method is a safe way to access an element of the vector at a given position: it returns either a reference to the element wrapped in Some or None if the position is out of bounds.

Generic structs

We can create generic structs in the same way. Assume we need to create a struct that stores user scores. If our programme supports multiple types of scores, we don't want to repeat ourselves by writing code for each score type; instead, we can keep the score generic:

// [1][2]
struct Score<T> {
    user: String,
// [3]
    team_score: T,
// [4]
    scores: Vec<T>,
}

// [1]: The generic definition follows the struct name.
// [2]: The struct is generic over the type `T`.
// [3]: The `team_score` field is of type `T`.
// [4]: The `scores` field is of type `Vec<T>`.

When we assign values to the struct's parameters during use, we commit to the specific type for T. For the score, we can use any type, such as integer:

let int_score: Score<i32> = Score {
    user: "Lesley".to_string(),
    team_score: 1_641_213,
    scores: vec![5213, 1232, 1512],
};

println!(
    "Int score: user = {}, scores = {:?}",
    int_score.user, int_score.scores
);

// Prints:
// Int score: user = Lesley, scores = [5213, 1232, 1512]

Or float:

let flt_score = Score {
    user: "Simone".to_string(),
    team_score: 54.2526,
    scores: vec![4.2526],
};

println!(
    "Float score: user = {}, scores = {:?}",
    flt_score.user, flt_score.scores
);

// Prints:
// Float scores: user = Simone, scores = [4.2526]

We didn't specify the type of flt score; the compiler can infer the struct type from the values used.

Because scores elements are defined as T, the compiler ensures that the type of team score is the same as the type of scores elements. Using an integer for one and a float for another, for example, does not work:

let broken_score = Score {
    user: "Simone".to_string(),
    team_score: 100,
    scores: vec![4.2526],
};

error[E0308]: mismatched types
    |
    | scores: vec![4.2526],
    | ^^^^^^ expected integer, found floating-point number

Generic methods

Furthermore, we can use generic structs to implement a generic method. For example, we can implement the following method to obtain the most recent score (assuming that scores preserve the order):

// [1] [2]
impl<T> Score<T> {
// [3]
    fn last_score(&mut self) -> Option<&T> {
        self.scores.last()
    }
}

// [1]: The generic definition follows the `impl` keyword.
// [2]: The struct is generic over the type `T`.
// [3]: The function's return value is generic over `T`
// (It returns an optional reference to a value of type `T`).

Why are there multiple <T>?

After the impl keyword (impl<T>), we declare the type parameter (T), and then we use other Ts to refer to that first T: to specify the struct type we are implementing (Score<T>) or to specify the return type (Option<&T>).

When we have more complex types, the distinction between these becomes clearer; for example:

impl<T> Pair<T, T> { ... }


impl<T> MyStruct<Vec<T>> { ... }

We can use the last_score method with any type of score:

let int_score: Score<i32> = Score {
    user: "Lesley".to_string(),
    team_score: 1_641_213,
    scores: vec![5213, 1232, 1512],
};

let last_int_score = int_score.last_score();
println!("Last score: {:?}", last_int_score);
// Prints:
// Last score: Some(1512)

let flt_score = Score {
    user: "Simone".to_string(),
    team_score: 54.2526,
    scores: vec![4.2526],
};

let last_flt_score = flt_score.last_score();
println!("Last score: {:?}", last_flt_score);
// Prints:
// Last score: Some(4.2526)

How to constrain a generic type

We've already covered a few generics that accept any type: Vec<T>, Option<T>, and so on. This adaptability does not come cheap. We can't print, compare, or do anything else with the values of this type because we don't know anything about it.

In the previous examples, at least, we knew something about the type of container, allowing us to implement some functionality. What if all we have is type T? Consider a simple generic function that accepts one type of value and returns another type of value:

fn quick_adventure<T>(t: T) -> T

We don’t know anything about T. The only thing we can do is return the parameter’s value. Quite boring and limiting:

fn quick_adventure<T>(t: T) -> T {
    t
}

println!("{}", quick_adventure(17));
// Prints:
// 17

println!("{}", quick_adventure("out"));
// Prints:
// out

When we write a polymorphic function that works for multiple types, we limit the implementation possibilities. For example, if we write a function that works for a pair of integers, we can easily add or multiply. However, we cannot write a function that works for both a pair of integers and a pair of strings because strings do not support these operations. We can, however, use shared operations, such as comparing two elements, which work for both integer and string values.

Generics make code easier to read. A "very" generic function can be described in large part by the types themselves, which also serve as documentation.

Consider a more concrete example. We created a generic reverse function that reverses a vector while it is still in place. What if we change it and instead of updating the vector, we try to print the values?

fn reverse_print<T>(vector: &mut Vec<T>) {
    while let Some(last) = vector.pop() {
        println!("{}", last);
    }
}

It doesn’t compile!

error[E0277]: `T` doesn't implement `std::fmt::Display`
    |
    | println!("{}", last);
    | ^^^^ `T` cannot be formatted with the default formatter
    |
help: consider restricting type parameter `T`
    |
    | fn reverse_print<T: std::fmt::Display>(vector: &mut Vec<T>) {
    | +++++++++++++++++++

The compiler has no knowledge of the type. How should it be printed? The error message suggests that we restrict the type parameter. Spoiler alert, we can do this with trait bounds.

Recap on traits

We can define interfaces or shared behaviours on types using traits. To implement a trait for a type, we must first implement the trait's methods.

All types that implement a given trait must support the functionality that this trait defines. If we want to compare values of a certain type, it must implement PartialEq and PartialOrd:

#[derive(PartialEq, PartialOrd)]
struct LevelScore {
    main_score: i32,
    bonus_score: i32,
}

let my_score = LevelScore {
    main_score: 10,
    bonus_score: 5,
};

let your_score = LevelScore {
    main_score: 5,
    bonus_score: 2,
};

if my_score > your_score {
    println!("I have a bigger score!")
} else {
    println!("You have a bigger score!")
}
// Prints:
// I have a bigger score!

Otherwise, we would get an error:

// #[derive(PartialEq, PartialOrd)]
struct LevelScore {
    main_score: i32,
    bonus_score: i32,
}

error[E0369]: binary operation `>` cannot be applied to type `LevelScore`
    |
    | if my_score > your_score {
    | -------- ^ ---------- LevelScore
    | |
    | LevelScore
    |
note: an implementation of `PartialOrd<_>` might be missing for `LevelScore`
    |
    | struct LevelScore {
    | ^^^^^^^^^^^^^^^^^ must implement `PartialOrd<_>`
help: consider annotating `LevelScore` with `#[derive(PartialEq, PartialOrd)]`
    |
    | #[derive(PartialEq, PartialOrd)]
    |

A trait is a contract-based interface that defines behaviours on which other code can rely. We can create functions that accept types while also implementing our trait. These functions are not required to understand anything else about these types.

Trait bounds on generic functions

We can make certain promises about the type of behaviour based on traits. We can use traits and generics to constrain a generic type so that it accepts only types with a specific behaviour rather than any type. As an example, we can use a trait bound to restrict the generic type in reverse print:

use std::fmt::Display;

// [1] [2]
fn reverse_print<T: Display>(vector: &mut Vec<T>) {
    while let Some(last) = vector.pop() {
// [3]
        println!("{}", last);
    }
}

// The function is
// [1]: generic over `T` that is
// [2]: bound by the `Display` trait, which guarantees
// [3]: that values of type `T` can be printed.

The syntax for declaring a generic type parameter with a trait bound is <T: Display>. In this case, T is any type that implements the Display trait, which means it can be printed.

We can use reverse_print with both str vector and int vector because they implement the Display trait:

let mut str_vector: Vec<&str> = vec!["one", "two", "three"];
reverse_print(&mut str_vector);
// Prints:
// three
// two
// one

let mut int_vector: Vec<i32> = vec![1, 2, 3];
reverse_print(&mut int_vector);
// Prints:
// 3
// 2
// 1

But it doesn’t work with types that don’t implement Display:

struct LevelScore(i32, i32);

let mut score_vector = vec![LevelScore(10, 5), LevelScore(5, 20)];

reverse_print(&mut score_vector);

error[E0277]: `LevelScore` doesn't implement `std::fmt::Display`
    |
    | reverse_print(&mut score_vector);
    | ------------- ^^^^^^^^^^^^^^^^^ `LevelScore` cannot be formatted with the default formatter
    | |
    | required by a bound introduced by this call
    |
    = help: the trait `std::fmt::Display` is not implemented for `LevelScore`
note: required by a bound in `reverse_print`

Trait bounds on type parameters allow us to tell the compiler that we want the generic type to behave in a specific way.

Multiple trait bounds

We can specify multiple trait bounds. Let's create another vector-based generic function: print max should find and print the most significant element, which we can find by using the corresponding iterator and its method max:

use std::fmt::Display;

// [1]
fn print_max<T: Ord + Display>(elements: &[T]) {
    match elements.iter().max() {
        Some(max_value) => println!("Max value: {}", max_value),
        None => println!("Vector is empty"),
    }
}

// [1]: We use `+` to specify multiple traits.

Because we need to compare and print elements this time, we instruct the compiler that T can only be types that implement both Ord and Display.

Because the number of items remains constant, it is preferable to use a slice (&[T]) rather than a vector &Vec<T>. If you want to brush up on your skills, read The Slice Type in Rust.

There is an alternative syntax that helps to clean up the function signature. We can list traits after the function parameters instead of writing the trait bound with the type parameter declaration:

use std::fmt::Display;

fn print_max<T>(elements: &[T]) // [1]
where // [2]
    T: Ord + Display, // [3]
{
    match elements.iter().max() {
        Some(max_value) => println!("Max value: {}", max_value),
        None => println!("Vector is empty"),
    }
}

// [1]: The function signature is followed by
// [2]: `where` clause, where we 
// [3]: specify the trait bounds.

With either syntax, the print_max function works with any vector of elements, as long as their type implements both Ord and Display:

let mut int_vector = vec![10, 200, 3];
print_max(&int_vector);
// Prints:
// Max value: 200

let mut str_vector = vec!["Go", "Scrabble", "Battleship", "Monopoly"];
print_max(&str_vector);
// Prints:
// Max value: Scrabble

Trait bounds on other generics

Trait bounds aren’t limited to generic functions. We can restrict any type parameter using traits.

For example, we can use trait bounds on generic structs:

// [1]
struct GameScore<T: Ord> {
    name: String,
    scores: Vec<T>,
}

impl<T: Ord> GameScore<T> {
    fn high_score(&self) -> Option<&T> {
// [2]
        self.scores.iter().max()
    }
}

// [1]: We bound `T` with the `Ord` trait.
// [2]: So we can use `max` to return the maximum score.

let taylor_score = GameScore {
    name: "taylor".to_string(),
    scores: vec![10, 200, 3],
};

println!("{:?}", taylor_score.high_score());
// Prints:
// Some(200)

We don't have to limit the entire struct. We can limit implementations by conditionally implementing methods for types that implement specific traits. For example, we can define two GameScore methods: high score, which can be used when vector elements implement Ord, and reverse_print_scores, which can be used when vector elements implement Display.

use std::fmt::Display;

struct GameScore<T> {
    name: String,
    scores: Vec<T>,
}

impl<T: Ord> GameScore<T> {
    fn high_score(&self) -> Option<&T> {
        self.scores.iter().max()
    }
}

impl<T: Display> GameScore<T> {
    fn reverse_print_scores(&mut self) {
        while let Some(last) = self.scores.pop() {
            println!("{}", last);
        }
    }
}

In the case of integer vector, we can use both:

let mut taylor_score = GameScore {
    name: "taylor".to_string(),
    scores: vec![10, 200, 3],
};

println!("{:?}", taylor_score.high_score());
// Prints:
// Some(200)

taylor_score.reverse_print_scores();
// Prints:
// 3
// 200
// 10

But if we have a custom type that only implements Ord, we can only use one of the methods:

#[derive(PartialEq, PartialOrd, Eq, Ord, Debug)]
struct LevelScore(i32, i32);

let mut dale_score = GameScore {
    name: "Dale".to_string(),
    scores: vec![LevelScore(10, 5), LevelScore(5, 20)],
};

println!("{:?}", dale_score.high_score());
// Prints:
// Some(LevelScore(10, 5))

Calling the second method doesn’t compile because LevelScore doesn’t implement Display:

dale_score.reverse_print_scores();


error[E0599]: the method `reverse_print_scores` exists for struct `GameScore<LevelScore>`, but its trait bounds were not satisfied
    |
    | struct GameScore<T> {
    | ------------------- method `reverse_print_scores` not found for this
...
    | struct LevelScore(i32, i32);
    | ---------------------------- doesn't satisfy `LevelScore: std::fmt::Display`
...
    | dale_score.reverse_print_scores();
    | ^^^^^^^^^^^^^^^^^^^^ method cannot be called on `GameScore<LevelScore>` due to unsatisfied trait bounds
    |

Conclusion

Generics allow us to write code that is both more flexible and more restrictive. When we use generics, we can write generic code for multiple types without worrying about concrete type implementation.

We learned how to use generics to write less code, how to create generic functions, structs, enums, methods, or traits, and how to constrain generics with trait bounds.

This blog post explained how to use generics to achieve compile-time polymorphism (generic code). We can obtain a run-time polymorphism by using trait objects (similar to virtual functions in other languages). You can read about using trait objects in the Rust Book to allow functions to perform dynamic dispatch and return different types at run-time.

Lifetimes are another type of generic that informs the compiler about how references relate to one another. We can keep references valid for as long as we want. The chapter on Validating References with Lifetimes in the Rust Book is a good place to start.

I sincerely hope that the majority of you find the approach covered here to be helpful. Thank you for reading, and please feel free to leave any comments or questions in the comments section below.

Post a Comment

0 Comments