Rust - Option and Result

Understanding optional values and error handling in Rust requires a thorough understanding of the Option and Result enums. I'm going to go over both of them in this article.

Introduction

To understand the Option and the Result, it is important to understand the following:

  • the enum in Rust
  • matching enum variants
  • the Rust prelude

The enum in Rust

Enums are useful for a variety of reasons, such as safe input handling and adding context to types by naming a collection of variants. In Rust, both the Option and the Result are enumerations, also known as enums. Rust's enum is quite adaptable: its variants can hold a variety of data, such as tuples and structs. You can also implement methods on enums.

The Option and the Result are pretty straightforward, though. Let’s first look at an example enum:

enum Example {
    This,
    That,
}
let this = Example::This;
let that = Example::That;

In the preceding code, we define an enum called Example with two variants: This and That. Then we create two instances of the enum, this and that, each from a different variant. It is important to note that an enum instance is always exactly one of its variants. This is what sets an enum apart from a struct: an instance of a struct has a value for every one of its fields, while an instance of an enum holds only one of the variants.
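To illustrate the earlier remark that variants can hold data and that enums can have methods, here is a minimal sketch. The Shape enum and its area method are hypothetical names made up for this illustration.

```rust
// Hypothetical enum for illustration; these names are assumptions.
#[derive(Debug)]
enum Shape {
    Point,                                 // variant without data
    Circle(f64),                           // tuple-like variant holding a radius
    Rectangle { width: f64, height: f64 }, // struct-like variant
}

impl Shape {
    // A method on the enum; it matches on `self` to handle every variant.
    fn area(&self) -> f64 {
        match self {
            Shape::Point => 0.0,
            Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
            Shape::Rectangle { width, height } => width * height,
        }
    }
}

fn main() {
    let shapes = [
        Shape::Point,
        Shape::Circle(1.0),
        Shape::Rectangle { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{:?} has area {}", shape, shape.area());
    }
}
```

Each value in the array is still exactly one variant, but the data attached to it differs per variant.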

Displaying the enum variants

By default, enum variants cannot be printed to the screen. Let's bring the Display derive macro from the strum_macros crate into scope. This makes it simple to derive Display for the enum we defined, which we do above the enum definition using #[derive(Display)]:

use strum_macros::Display;

#[derive(Display)]
enum Example {
    This,
    That,
}

let this = Example::This;
let that = Example::That;

println!("Example::This contains: {}", this);
println!("Example::That contains: {}", that);

Now, we can use println! to display the enum variants on screen:

Example::This contains: This
Example::That contains: That
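If adding a crate is not desirable, Display can also be implemented by hand using only the standard library. The following is a sketch of that alternative, not what the derive macro generates internally:

```rust
use std::fmt;

enum Example {
    This,
    That,
}

// Hand-written Display implementation that prints the variant name.
impl fmt::Display for Example {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Example::This => "This",
            Example::That => "That",
        };
        write!(f, "{}", name)
    }
}

fn main() {
    println!("Example::This contains: {}", Example::This);
    println!("Example::That contains: {}", Example::That);
}
```

Another option from the standard library is #[derive(Debug)], which allows printing with the {:?} placeholder.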

Matching enum variants

Using the match keyword, we can do pattern matching on enums. The following function takes the Example enum as an argument:

fn matcher(x: Example) {
    match x {
        Example::This => println!("We got This."),
        Example::That => println!("We got That."),
    }
}

We can pass the matcher function a value of the Example enum. The match inside the function will determine what is printed to the screen:

matcher(Example::This);
matcher(Example::That);

Running the above will print the following:

We got This.
We got That.
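It is worth knowing that match is an expression: every arm yields a value, and the compiler insists that all variants are covered. A small sketch, in which the label function is a hypothetical helper:

```rust
enum Example {
    This,
    That,
}

// Each arm yields a value, so the whole match can be the function's result.
// Removing one of the arms would be a compile error, because the match
// would no longer cover every variant.
fn label(x: &Example) -> &'static str {
    match x {
        Example::This => "We got This.",
        Example::That => "We got That.",
    }
}

fn main() {
    println!("{}", label(&Example::This));
    println!("{}", label(&Example::That));
}
```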

The Rust prelude

The Rust prelude is a collection of items that are automatically imported into every Rust programme. The majority of the items in the prelude are frequently used traits. However, we also find the following two items:

std::option::Option::{self, Some, None}
std::result::Result::{self, Ok, Err}

The first is the Option enum, which is defined as 'a type that expresses the presence or absence of a value'. The second is the Result enum, which is defined as 'a type for functions that may succeed or fail'.

Because these types are so widely used, their variants are also exported. Let's go over both types in greater depth.
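A quick sketch of what the prelude buys us: the code below has no use statements at all, yet Option, Some, None, Result, Ok and Err are all available. The half and non_negative functions are hypothetical examples.

```rust
// No `use` statements are needed for Option, Result or their variants.
fn half(n: i32) -> Option<i32> {
    if n % 2 == 0 {
        Some(n / 2)
    } else {
        None
    }
}

fn non_negative(n: i32) -> Result<i32, String> {
    if n >= 0 {
        Ok(n)
    } else {
        Err(format!("{} is negative", n))
    }
}

fn main() {
    // The fully qualified paths name exactly the same type and variant.
    let short: Option<i32> = Some(1);
    let long: std::option::Option<i32> = std::option::Option::Some(1);
    assert_eq!(short, long);

    println!("{:?} {:?}", half(4), non_negative(-1));
}
```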

The Option

The Option is available to us without having to lift a finger, thanks to the prelude. The Option is defined as follows:

pub enum Option<T> {
    None,
    Some(T),
}

The preceding definition shows that Option<T> is an enum with two variants: None and Some(T). In terms of how it is used, None can be thought of as 'nothing' and Some(T) as 'something'. The <T> is an important detail that newcomers to Rust may not notice right away: it indicates that the Option enum is a generic enum.

The Option is generic over type T.

The 'T' could have been any letter; 'T' is simply the convention when only one generic type is involved.

So, what exactly do we mean when we say "the enum is generic over type T"? It means we can use it for any type. When we begin working with the enum, we can (and must) replace 'T' with a concrete type. This can be any type, as demonstrated by the following:

let a_str: Option<&str> = Some("a str");
let a_string: Option<String> = Some(String::from("a String"));
let a_float: Option<f64> = Some(1.1);
let a_vec: Option<Vec<i32>> = Some(vec![0, 1, 2, 3]);

#[derive(Debug)]
struct Person {
    name: String,
    age: i32,
}

let marie = Person {
    name: String::from("Marie"),
    age: 2,
};

let a_person: Option<Person> = Some(marie);
let maybe_someone: Option<Person> = None;

println!(
    "{:?}\n{:?}\n{:?}\n{:?}\n{:?}\n{:?}",
    a_str, a_string, a_float, a_vec, a_person, maybe_someone
);

The above code outputs the following:

Some("a str")
Some("a String")
Some(1.1)
Some([0, 1, 2, 3])
Some(Person { name: "Marie", age: 2 })
None

The code demonstrates that the enum can be generic over both standard and custom types. Furthermore, whatever concrete type we choose for T, a value of that Option type can still be the None variant. So the Option is a fancy way of saying:

This can be of a type T value, which can be anything really, or it can be nothing. 
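The same idea carries over to our own functions. Here is a sketch of a generic function that returns an Option<T>; the largest function is a hypothetical helper, and the T: PartialOrd + Copy bound is an assumption chosen to keep the example short.

```rust
// Hypothetical helper: returns the largest element, or None for an empty slice.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    if items.is_empty() {
        return None;
    }
    let mut best = items[0];
    for &item in &items[1..] {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

fn main() {
    // T is replaced by a different concrete type in each call.
    let ints: Option<i32> = largest(&[3, 7, 2]);
    let floats: Option<f64> = largest(&[1.5, 0.5]);
    let chars: Option<char> = largest(&['a', 'z', 'c']);
    let empty: Option<i32> = largest(&[]);
    println!("{:?} {:?} {:?} {:?}", ints, floats, chars, empty);
}
```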

Matching on the Option

Because Rust does not support exceptions or null values, you will see the Option (and, as we will see later, the Result) everywhere.

Because the Option is an enum, we can use pattern matching to handle each variant separately:

let something: Option<&str> = Some("a String"); // Some("a String")
let nothing: Option<&str> = None;   // None

match something {
    Some(text) => println!("We got something: {}", text),
    None => println!("We got nothing."),
}

match nothing {
    Some(something_else) => println!("We got something: {}", something_else),
    None => println!("We got nothing"),
}

The above will output the following:

We got something: a String
We got nothing

Unwrapping the Option

You will frequently see unwrap in use, and it can seem a bit mysterious at first. Looking it up in the IDE helps to demystify it.

If you're using VS Code, holding Ctrl and clicking the left mouse button on a method takes you to its source code. In this case, it takes us to the definition of unwrap in option.rs:

pub const fn unwrap(self) -> T {
    match self {
        Some(val) => val,
        None => panic!("called `Option::unwrap()` on a `None` value"),
    }
}

Unwrap is defined in the impl<T> Option<T> block of option.rs. When we invoke it on a value, it attempts to 'unwrap' the value tucked into the Some variant. It matches on self: if the Some variant is present, val is 'unwrapped' and returned. If the None variant is present, the panic macro is called:

let something: Option<&str> = Some("Something");
let unwrapped = something.unwrap(); 
println!("{:?}\n{:?}", something, unwrapped);
let nothing: Option<&str> = None;
nothing.unwrap();

The code above will result in the following:

Some("Something")
"Something"
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src\main.rs:86:17

Calling unwrap on an Option is quick and simple, but allowing your programme to panic and crash is neither elegant nor safe.
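The standard library offers several relatives of unwrap that do not panic on None. The sketch below shows a few of them; display_name is a hypothetical helper, and the fallback strings are arbitrary.

```rust
// Hypothetical helper: picks a fallback name instead of panicking on None.
fn display_name(name: Option<&str>) -> &str {
    name.unwrap_or("anonymous")
}

fn main() {
    let something: Option<&str> = Some("Something");
    let nothing: Option<&str> = None;

    // unwrap_or: use the given default when the Option is None.
    println!("{}", display_name(something));
    println!("{}", display_name(nothing));

    // unwrap_or_else: compute the fallback lazily with a closure.
    let lazy = nothing.unwrap_or_else(|| "computed only when needed");
    println!("{}", lazy);

    // unwrap_or_default: fall back to the type's Default value (0 for i32).
    let count: Option<i32> = None;
    println!("{}", count.unwrap_or_default());

    // expect: still panics on None, but with a message we choose ourselves.
    println!("{}", something.expect("`something` should hold a value"));
}
```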

Option examples

Let’s look at some examples where you could use an Option.

Passing an optional value to a function

fn might_print(option: Option<&str>) {
    match option {
        Some(text) => println!("The argument contains the following value: '{}'", text),
        None => println!("The argument contains None."),
    }
}
let something: Option<&str> = Some("some str");
let nothing: Option<&str> = None;
might_print(something);
might_print(nothing);

This outputs the following:

The argument contains the following value: 'some str'
The argument contains None.

Having a function return an optional value

// Returns the text if it contains target character, None otherwise:
fn contains_char(text: &str, target_c: char) -> Option<&str> {
    if text.chars().any(|ch| ch == target_c) {
        Some(text)
    } else {
        None
    }
}
let a = contains_char("Rust in action", 'a');
let q = contains_char("Rust in action", 'q');
println!("{:?}", a);
println!("{:?}", q);

We can safely assign the function's return value to a variable and use match later to determine how to handle a None. The preceding code generates the following output:

Some("Rust in action")
None

Let's look at three different approaches to working with the optional return value.

The first, and most dangerous, would be simply calling unwrap:

let a = contains_char("Rust in action", 'a');
let a_unwrapped = a.unwrap();   
println!("{:?}", a_unwrapped);

The second, safer option is to use a match expression:

let a = contains_char("Rust in action", 'a');
match a {
    Some(a) => println!("contains_char returned something: {:?}!", a),
    None => println!("contains_char did not return something, so branch off here"),
}

The third option is to capture the return value of the function in a variable and use if let:

let a = contains_char("Rust in action", 'a');
if let Some(a) = a {
    println!("contains_char returned something: {:?}!", a);
} else {
    println!("contains_char did not return something, so branch off here")
}
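Beyond these three approaches, an Option can also be transformed or propagated without unpacking it by hand. The following sketch reuses contains_char; matched_length and contains_both are hypothetical helpers added for illustration.

```rust
// Same behaviour as contains_char in the article.
fn contains_char(text: &str, target_c: char) -> Option<&str> {
    if text.chars().any(|ch| ch == target_c) {
        Some(text)
    } else {
        None
    }
}

// map transforms the value inside Some and leaves None untouched.
fn matched_length(text: &str, target_c: char) -> Option<usize> {
    contains_char(text, target_c).map(|t| t.len())
}

// `?` on an Option returns None from the enclosing function straight away.
fn contains_both(text: &str, first: char, second: char) -> Option<&str> {
    let text = contains_char(text, first)?;
    contains_char(text, second)
}

fn main() {
    println!("{:?}", matched_length("Rust in action", 'a')); // Some(14)
    println!("{:?}", matched_length("Rust in action", 'q')); // None
    println!("{:?}", contains_both("Rust in action", 'R', 'q')); // None
}
```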

Optional values inside a struct

We can also use the Option inside a struct. This might be useful in case a field may or may not have any value:

#[derive(Debug)]
struct Person {
    name: String,
    age: Option<i32>,
}

let marie = Person {
    name: String::from("Marie"),
    age: Some(2),
};

let jan = Person {
    name: String::from("Jan"),
    age: None,
};

println!("{:?}\n{:?}", marie, jan);

The above code outputs the following:

Person { name: "Marie", age: Some(2) }
Person { name: "Jan", age: None }
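When we later use such a field, the compiler makes us deal with the None case. A minimal sketch, in which the describe method is a hypothetical addition to the Person struct:

```rust
#[derive(Debug)]
struct Person {
    name: String,
    age: Option<i32>,
}

impl Person {
    // Hypothetical helper: formats the age, with a fallback when it is unknown.
    fn describe(&self) -> String {
        match self.age {
            Some(age) => format!("{} is {} years old", self.name, age),
            None => format!("{}'s age is unknown", self.name),
        }
    }
}

fn main() {
    let marie = Person {
        name: String::from("Marie"),
        age: Some(2),
    };
    let jan = Person {
        name: String::from("Jan"),
        age: None,
    };
    println!("{}\n{}", marie.describe(), jan.describe());
}
```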

Real world example

The pop method for vectors is an example of how the Option is used in Rust. The pop method removes the last element of a vector and returns it. However, a vector may be empty, in which case there is nothing to return and pop should yield None. A vector can also hold elements of any type T, so a successful pop yields Some(T). As a result, pop() returns Option<T>.

The pop method for the vec from Rust 1.53:

impl<T, A: Allocator> Vec<T, A> {
    // .. lots of other code
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            None
        } else {
            unsafe {
                self.len -= 1;
                Some(ptr::read(self.as_ptr().add(self.len())))
            }
        }
    }
    // lots of other code
}    

A trivial example in which we keep popping a vector after it has run out of items and print each result:

let mut vec = vec![0, 1];
let a = vec.pop();
let b = vec.pop();
let c = vec.pop();
println!("{:?}\n{:?}\n{:?}\n", a, b, c);

The above outputs the following:

Some(1)
Some(0)
None
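Because pop returns an Option, it combines nicely with while let, which keeps looping for as long as the pattern matches. A sketch, where drain_reversed is a hypothetical helper:

```rust
// Hypothetical helper: pops every item and collects them in popping order.
fn drain_reversed(mut items: Vec<i32>) -> Vec<i32> {
    let mut popped = Vec::new();
    // The loop ends as soon as pop() returns None, i.e. when the vector is empty.
    while let Some(item) = items.pop() {
        popped.push(item);
    }
    popped
}

fn main() {
    println!("{:?}", drain_reversed(vec![0, 1, 2])); // [2, 1, 0]
}
```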

The Result

The Result enum is another important Rust construct. Like the Option, it is an enum. Its definition can be found in result.rs:

pub enum Result<T, E> {
    /// Contains the success value
    Ok(T),
    /// Contains the error value    
    Err(E),
}

The Result enum is generic over two types, denoted by the letters T and E. T is the type of the value held by the Ok variant, which denotes a successful outcome. E is the type of the value held by the Err variant, which expresses an error. Because Result is generic over E, it is possible to communicate many different errors. If Result had not been generic over E, there would be only one kind of error, just as there is only one kind of None in Option. That would not leave much room for using error values in our flow control or reporting.
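To make the role of E more concrete, here is a sketch in which E is an enum of our own with two kinds of failure. PercentError and parse_percent are hypothetical names invented for this illustration.

```rust
// Hypothetical error type: each variant describes a different failure.
#[derive(Debug, PartialEq)]
enum PercentError {
    NotANumber,
    OutOfRange(u32),
}

fn parse_percent(input: &str) -> Result<u32, PercentError> {
    let value: u32 = match input.trim().parse() {
        Ok(v) => v,
        Err(_) => return Err(PercentError::NotANumber),
    };
    if value > 100 {
        Err(PercentError::OutOfRange(value))
    } else {
        Ok(value)
    }
}

fn main() {
    for input in ["42", "250", "abc"].iter() {
        // The caller can react differently to each kind of error.
        match parse_percent(input) {
            Ok(value) => println!("{} -> {}%", input, value),
            Err(PercentError::NotANumber) => println!("{} -> not a number", input),
            Err(PercentError::OutOfRange(v)) => println!("{} -> {} is above 100", input, v),
        }
    }
}
```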

As indicated before, the prelude brings the Result enum as well as the Ok and Err variants into scope like so:

std::result::Result::{self, Ok, Err}

This means we can access Result, Ok and Err directly at any place in our code.

Matching on the Result

Let's begin by writing an example function that returns a Result. The function checks whether a string has at least a certain number of characters:

fn check_length(s: &str, min: usize) -> Result<&str, String> {
    if s.chars().count() >= min {
        Ok(s)
    } else {
        Err(format!("'{}' is not long enough!", s))
    }
}

It's not a very useful function, but it's simple enough to demonstrate the concept of returning a Result. The function accepts a string slice and counts the number of characters in it. If the number of characters is equal to or greater than min, the string is returned. If not, an error is returned. The return type in the function signature is a Result, and in it we specify the types that the Result will hold: a string slice if the string is sufficiently long, and a String message if there is an error. This explains the Result<&str, String>.

The check itself is done by if s.chars().count() >= min. If it is true, the function returns the string wrapped in the Ok variant of the Result enum. Because the variants of Result are also brought into scope by the prelude, we can simply write Ok(s). The else branch returns an Err variant, which in this case holds a String with a message.

Let’s run the function and output the Result using dbg!:

let a = check_length("some str", 5);
let b = check_length("another str", 300);
dbg!(a); // Ok("some str",)
dbg!(b); // Err("'another str' is not long enough!",)

We can use a match expression to deal with the Result that our function returns:

let func_return = check_length("some string literal", 100);
let a_str = match func_return {
    Ok(a_str) => a_str,
    Err(error) => panic!("Problem running 'check_length':\n {:?}", error),
};
// thread 'main' panicked at 'Problem running 'check_length':
// "'some string literal' is not long enough!"'
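A match on a Result does not have to end in a panic. As a sketch, the Err arm below turns the error into a message instead; the report function is a hypothetical wrapper around check_length.

```rust
// Same behaviour as check_length in the article.
fn check_length(s: &str, min: usize) -> Result<&str, String> {
    if s.chars().count() >= min {
        Ok(s)
    } else {
        Err(format!("'{}' is not long enough!", s))
    }
}

// Hypothetical wrapper: turns either outcome into a message instead of panicking.
fn report(s: &str, min: usize) -> String {
    match check_length(s, min) {
        Ok(valid) => format!("accepted: {}", valid),
        Err(message) => format!("rejected: {}", message),
    }
}

fn main() {
    println!("{}", report("some str", 5));
    println!("{}", report("abc", 5));
}
```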

Unwrapping the Result

Instead of using a match expression, there is a shortcut that you will encounter frequently: the unwrap method defined for the Result enum. It is defined as follows:

impl<T, E: fmt::Debug> Result<T, E> {
    ...
    pub fn unwrap(self) -> T {
        match self {
            Ok(t) => t,
            Err(e) => unwrap_failed("called `Result::unwrap()` on an `Err` value", &e),
        }
    }
    ...
}

Calling unwrap returns the value wrapped in the Ok variant. Unwrap will panic if the Result is not an Ok. In the following example, the from_str function returns an Ok value:

use serde_json::json;
let json_string = r#"
{
    "key": "value",
    "another key": "another value",
    "key to a list" : [1 ,2]
}"#;
let json_serialized: serde_json::Value = serde_json::from_str(&json_string).unwrap();
println!("{:?}", &json_serialized);
// Object({"another key": String("another value"), "key": String("value"), "key to a list": Array([Number(1), Number(2)])})

As we can see, json_serialized contains the value that was wrapped in the Ok variant.

The following example shows what happens when we call unwrap on a Result that is not an Ok variant. Here, we call serde_json::from_str on invalid JSON:

use serde_json::json;
let invalid_json = r#"
{
    "key": "v
}"#;

let json_serialized: serde_json::Value = serde_json::from_str(&invalid_json).unwrap();
/*
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("control character (\\u0000-\\u001F) found while parsing a string", line: 4, column: 0)',
*/

There is a panic and the programme comes to a halt. Instead of unwrap, we could also choose to use expect:

use serde_json::json;

let invalid_json = r#"
{
    "key": "v
}"#;
let b: serde_json::Value =
    serde_json::from_str(&invalid_json).expect("unable to deserialize JSON");

This time, when we run the code, we can also see the message that we added to it:

thread 'main' panicked at 'unable to deserialize JSON: Error("control character (\\u0000-\\u001F) found while parsing a string", line: 4, column: 0)'
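When a panic is not acceptable, Result has methods that keep the programme running. The sketch below uses str::parse from the standard library rather than serde_json so that it needs no extra crates; the function names and the default port 8080 are assumptions made for the example.

```rust
// unwrap_or: fall back to a default when the text is not a valid number.
fn port_or_default(text: &str) -> u16 {
    text.parse::<u16>().unwrap_or(8080)
}

// ok(): turn Result<T, E> into Option<T>, discarding the error.
fn parse_quietly(text: &str) -> Option<u16> {
    text.parse::<u16>().ok()
}

// map_err: keep the Result but replace the error with our own message.
fn parse_with_message(text: &str) -> Result<u16, String> {
    text.parse::<u16>()
        .map_err(|e| format!("could not parse '{}': {}", text, e))
}

fn main() {
    println!("{}", port_or_default("443"));
    println!("{}", port_or_default("not a port"));
    println!("{:?}", parse_quietly("not a port"));
    println!("{:?}", parse_with_message("not a port"));
}
```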

Because unwrap and expect panic on an error, the programme is terminated. Unwrap is most commonly found in example sections, code comments, and documentation examples, where the emphasis is on the example itself and the lack of context prevents proper error handling for a specific scenario. Consider the following example, which has serde serialise fields as camelCase:

use serde::Serialize;

#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct Person {
    first_name: String,
    last_name: String,
}

fn main() {
    let person = Person {
        first_name: "Graydon".to_string(),
        last_name: "Hoare".to_string(),
    };

    let json = serde_json::to_string_pretty(&person).unwrap();  // <- unwrap

    // Prints:
    //
    //    {
    //      "firstName": "Graydon",
    //      "lastName": "Hoare"
    //    }
    println!("{}", json);
}

Using ? and handling different errors

Various projects frequently define their own errors. Searching a repository for terms such as pub struct Error or pub enum Error can occasionally reveal the errors defined for a project. However, different crates and projects may return different error types. When you have a function that uses methods from multiple projects and you want to propagate an error, things can get a little more complicated. There are several approaches to this. Consider an example in which we deal with this by 'Boxing' the error.

In the following example, we define a function that reads the entire contents of the target file as a string, parses it as JSON, and maps it to a struct. The function returns a Result. The Ok variant holds the Person struct, and the error that is propagated can come from either serde_json or std::fs. To be able to return errors from both of these sources, we return Result<Person, Box<dyn Error>>. Box<dyn Error>, which represents 'any type of error', is the type of the Err variant.

Another feature of the following example worth mentioning is the use of the ? operator. We read a file as a string using fs::read_to_string(s), and then deserialise the text to a struct using serde_json::from_str(&text). We place the ? after the calls to those functions to avoid having to write match arms for the Results they return. If the preceding Result is an Ok, this syntactic sugar yields the value inside it, much like unwrap. If the preceding Result is an Err, the error is returned from our function straight away, just as if the return keyword had been used to propagate it. On the way out, the error is converted into our Box<dyn Error>.

The example code:

use serde::{Deserialize, Serialize};
use std::error::Error;
use std::fs;

// Debug allows us to print the struct.
// Deserialize and Serialize add decoder and encoder capabilities to the struct.
#[derive(Debug, Deserialize, Serialize)]
struct Person {
    name: String,
    age: usize,
}

fn file_to_json(s: &str) -> Result<Person, Box<dyn Error>> {
    let text = fs::read_to_string(s)?;
    let marie: Person = serde_json::from_str(&text)?;
    Ok(marie)
}

let x = file_to_json("json.txt");
let y = file_to_json("invalid_json.txt");
let z = file_to_json("non_existing_file.txt");

dbg!(x);
dbg!(y);
dbg!(z);

The first time we call the function, it succeeds and dbg!(x); outputs the following:

[src\main.rs:20] x = Ok(
    Person {
        name: "Marie",
        age: 2,
    },
)

The second call encounters an error. The file contains the following:

{
"name": "Marie",
"a
}

This file can be opened and read to a String, but Serde cannot parse it as JSON. This function call outputs the following:

[src\main.rs:21] y = Err(
    Error("control character (\\u0000-\\u001F) found while parsing a string", line: 3, column: 4),     
)

We can see the Serde error was properly propagated.

The last function call tried to open a file that does not exist:

[src\main.rs:22] z = Err(
    Os {
        code: 2,
        kind: NotFound,
        message: "The system cannot find the file specified.",
    },
)

That error, coming from std::fs, was also properly propagated.
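To see roughly what the ? operator does for us, the sketch below writes the same function twice: once with ? and once with explicit match arms. It uses two error types from the standard library (Utf8Error and ParseIntError) instead of std::fs and serde_json so that it runs without files or extra crates. The function names are hypothetical, and the hand-written version is an approximation of the behaviour, not the exact expansion the compiler produces.

```rust
use std::error::Error;

// Two different error types can occur here: std::str::Utf8Error and
// std::num::ParseIntError. Both are converted into Box<dyn Error> by `?`.
fn bytes_to_number(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    let text = std::str::from_utf8(bytes)?;
    let number = text.trim().parse::<i32>()?;
    Ok(number)
}

// Roughly the same logic, written out by hand.
fn bytes_to_number_explicit(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    let text = match std::str::from_utf8(bytes) {
        Ok(text) => text,
        // into() performs the conversion to Box<dyn Error>.
        Err(e) => return Err(e.into()),
    };
    let number = match text.trim().parse::<i32>() {
        Ok(number) => number,
        Err(e) => return Err(e.into()),
    };
    Ok(number)
}

fn main() {
    println!("{:?}", bytes_to_number(b"42"));
    println!("{:?}", bytes_to_number(b"forty-two"));
    println!("{:?}", bytes_to_number(&[0xff, 0xfe]));
    println!("{:?}", bytes_to_number_explicit(b" 7 "));
}
```

Boxing the error is convenient, but the caller can no longer match on a concrete error type without downcasting. Defining a dedicated error enum is a common alternative when callers need to tell the errors apart.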

Conclusion

Understanding how the Option and the Result are used in Rust is critical. The above explanation of Rust's Option, Result, and error handling is my written account of how I learned about them. I hope this article is useful to others.

Thank you for reading, and please feel free to leave any comments or questions in the comments section below.
