std::move

One of the most groundbreaking features of C++11 is the ability to move objects instead of copying them. It’s an optimization technique especially useful for temporary objects, for which copying is nothing but a costly burden.

Let’s imagine a class Bar which uses Foo objects and thus must hold them. In C++03 it would look like this:

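#include <string>

// Foo must be a complete type, because Bar stores it by value
struct Foo
{
    explicit Foo(const std::string& arg) : data(arg) {}
    std::string data;
};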

struct Bar
{
    explicit Bar(const Foo& f) : myFoo(f) {}
    Foo myFoo;
};

int main()
{
    Bar b(Foo("non-default-arg"));
}

We needlessly copy the temporary Foo, which is first created in main() and then copied in the initializer list of Bar. The original object is discarded moments later, meaning that we performed a potentially costly operation for nothing. It would be much better to move the data from such a temporary object to its new destination.

Imagine it as moving to a new flat [1]. We just bought it, all the boxes sit in the middle of the floor, and we have nothing else to do but unpack, start a new life and leave all the unpleasant memories behind, right? Wrong. The old apartment is still there. Our friends don’t know our new address and will send letters to the old one. The old owner would like to rent it out again, but he has to change the locks because, as a rule of thumb, he doesn’t trust his tenants.

Moving in C++ is similar. Once an object is moved, its source is still a valid object, but in an unspecified state. It might be possible to reuse it, and its destructor will certainly be invoked as usual.
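To make the "valid but unspecified" part more tangible, here is a minimal sketch using std::vector. The function name move_and_reuse is made up for this illustration. The point is only that a moved-from object can be used again once we put it back into a known state ourselves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: moves `src` into a new vector, then reuses `src`
// after resetting it. Returns the size of the moved-to vector.
std::size_t move_and_reuse(std::vector<std::string>& src)
{
    std::vector<std::string> dst(std::move(src));  // take over the contents

    // `src` is still a valid object, but we should not rely on what it
    // contains, so we reset it explicitly before using it again.
    src.clear();
    src.push_back("fresh");
    assert(src.size() == 1);

    return dst.size();
}
```

The explicit clear() is the important bit: it does not depend on the state of src, so it is always safe to call on a moved-from object.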

Rvalue Reference

From the perspective of moving, the only objects that are always safe to move from are temporaries, or rvalues, which exist only for the duration of a single expression. Examples of such temporaries are a constructor call whose result is not assigned to anything, or the result of some operation, like addition (foo1 + foo2 returns a third object, which is an rvalue).

To detect when a function receives a temporary, C++11 introduces a new reference type: the rvalue reference, which is written with a double ampersand &&. For example:

void foo(Foo&& temporary) {}

Notice the lack of const before the reference type, which is quite common with ordinary references. Rvalue references indicate objects which we want to move from. Moving modifies its source, so const would defeat the purpose.

Keep in mind that an object passed by rvalue reference doesn’t have to be moved at all! Up to this point it’s an ordinary reference, which only signals the possibility of moving. We could just as well copy such an object or perform any other operation on it. I don’t advise using rvalue references everywhere, though, for two reasons:

  • it’s restricting: we can’t pass ordinary lvalues (non-temporaries) to arguments which accept rvalue references;
  • it’s against convention: if you don’t intend to move objects, use ordinary reference, because rvalue references were invented to explicitly pinpoint moving.
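A small sketch of how the two reference types take part in overload resolution may help here. The empty Foo and the functions which, consume and examples are invented for this illustration only.

```cpp
#include <cassert>
#include <string>

struct Foo {};

// Chosen for named objects (lvalues).
std::string which(const Foo&) { return "lvalue"; }

// Chosen for temporaries (rvalues).
std::string which(Foo&&) { return "rvalue"; }

// A function that accepts rvalues only.
void consume(Foo&&) {}

void examples()
{
    Foo named;
    assert(which(named) == "lvalue");
    assert(which(Foo()) == "rvalue");

    consume(Foo());     // OK: a temporary binds to Foo&&
    // consume(named);  // error: an lvalue cannot bind to Foo&&
}
```

The commented-out call is the restriction from the first bullet point: a parameter of type Foo&& rejects ordinary named objects.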

auto_ptr

A historical note on why the explicit moves required by C++ are important. Before C++17 [2] there was a special smart pointer, auto_ptr, which implemented a kind of move semantics by abusing the copy constructor. Whenever it was assigned to another auto_ptr, both the source and the target objects were modified: the source released the pointer it managed and transferred its ownership to the target:

std::auto_ptr<std::string> source(new std::string("foo"));

// we expect a copy here, but source in fact no longer manages a pointer
std::auto_ptr<std::string> target = source;

This very surprising behaviour was replaced by proper move semantics in C++11. Instead of auto_ptr, smart pointers with well-defined and unsurprising semantics, such as std::unique_ptr, should be used.
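For comparison, this is roughly how the same transfer looks with std::unique_ptr. The function take is a made-up name for this sketch. The copy is rejected at compile time and the transfer has to be spelled out with std::move.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: transfers ownership out of `source`.
std::unique_ptr<std::string> take(std::unique_ptr<std::string>& source)
{
    // std::unique_ptr<std::string> target = source;  // error: copying is deleted
    std::unique_ptr<std::string> target = std::move(source);

    assert(source == nullptr);  // a moved-from unique_ptr is empty
    return target;
}
```

Nothing happens behind our back here: whoever reads the code sees std::move and knows that source gives up its pointer.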

Moving objects

Move Constructor

The move constructor is the place where we decide how an object is moved. It’s a new type of constructor introduced by C++11, with the following signature: Foo::Foo(Foo&& other).

To make moves fast, it is usually implemented by swapping pointers. If the pointers are neither swapped nor copied and then set to nullptr in the original object, we are actually copying data instead of moving it. Only the type of the copied values decides whether it is a deep or a shallow copy [4].

Let’s see an example of this technique:

#include <utility>

template <typename T>
class Foo
{
    // start out as nullptr, so that the swap below never reads
    // an uninitialised pointer
    T* ptr = nullptr;

public:
    Foo(Foo&& other) noexcept
    {
        using std::swap;
        swap(ptr, other.ptr);
    }
};

Here we borrow the swap step of the copy-and-swap idiom (see: 1, 2): the null pointer of the object under construction is exchanged with the pointer held by other. After this operation other holds a null pointer, so dereferencing it would be undefined behaviour. In this state it is unsafe to use the moved-from object; however, we could implement a way to re-initialise the pointer:

  void reset(T& lhs) {
    ptr = &lhs;
  }

Implementing move by swapping pointers is extremely efficient and cheap: only the pointer changes hands, and the data it points to is never copied. It is the only way to move data without making a copy of it (we don’t have this possibility with members held by value).
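The other variant mentioned earlier, copying the pointer and setting the original to nullptr, can be sketched like this. Owner is a hypothetical class, not the Foo from above, and std::exchange (C++14) is just one convenient way to do both steps at once.

```cpp
#include <cassert>
#include <utility>

// Hypothetical owning wrapper used only for this illustration.
template <typename T>
class Owner
{
    T* ptr;

public:
    explicit Owner(T* p) : ptr(p) {}

    // Take other's pointer and leave nullptr behind in a single step.
    Owner(Owner&& other) noexcept : ptr(std::exchange(other.ptr, nullptr)) {}

    // Copying is disabled to keep the sketch short.
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    ~Owner() { delete ptr; }  // deleting nullptr is a no-op

    T* get() const { return ptr; }
};

void example()
{
    Owner<int> a(new int(42));
    int* raw = a.get();

    Owner<int> b(std::move(a));
    assert(a.get() == nullptr);  // the source no longer owns anything
    assert(b.get() == raw);      // the very same allocation, not a copy
}
```

Because the moved-from object ends up holding nullptr, its destructor runs safely and the allocation is released exactly once, by b.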

Implicitly Declared Move Constructor

With legacy code it would be cumbersome, or sometimes downright impossible, to implement move constructors for all types. Fortunately, we can still move such types. As long as they meet certain requirements, the compiler will automatically generate a move constructor.

The class must have no:

  • user-declared copy constructors;
  • user-declared copy assignment operators;
  • user-declared move assignment operators;
  • user-declared destructor.

Generation of the move constructor may be forced, even if the above requirements are not met, by using the default keyword:

struct Foo {
  // normally, having a copy constructor prevents generation of move
  // constructor...
  Foo(const Foo&);

  // ...but we force it anyway
  Foo(Foo&&) = default;
};

On the other hand, we can prevent its generation by using the delete keyword:

struct Foo {
  Foo(Foo&&) = delete;
};
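To see these rules in action, here is a sketch with a helper type that counts how it gets constructed. Tracker, Plain, WithDestructor and Defaulted are all invented names. The interesting case is the one in the middle: a user-declared destructor doesn't break the code, it quietly turns moves into copies.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper that records how it was constructed.
struct Tracker
{
    static int copies;
    static int moves;

    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// No user-declared special members: a move constructor is generated.
struct Plain
{
    Tracker t;
};

// The user-declared destructor suppresses the implicit move constructor,
// so "moving" falls back to the copy constructor.
struct WithDestructor
{
    Tracker t;
    ~WithDestructor() {}
};

// Same destructor, but the move constructor is requested explicitly.
struct Defaulted
{
    Tracker t;
    Defaulted() = default;
    Defaulted(Defaulted&&) = default;
    ~Defaulted() {}
};

void example()
{
    Tracker::copies = Tracker::moves = 0;

    Plain p;
    Plain p2(std::move(p));
    assert(Tracker::moves == 1 && Tracker::copies == 0);

    WithDestructor w;
    WithDestructor w2(std::move(w));  // compiles, but copies
    assert(Tracker::moves == 1 && Tracker::copies == 1);

    Defaulted d;
    Defaulted d2(std::move(d));
    assert(Tracker::moves == 2 && Tracker::copies == 1);
}
```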

Move Assignment Operator

The move assignment operator is very similar to the move constructor, but it is used when we assign an rvalue to an already existing object, for example:

Foo foo;
foo = std::move(bar);

To define it we typically implement operator=(T&&):

template <typename T>
class Foo
{
    T* ptr;

public:
    Foo& operator=(Foo&& other)
    {
        using std::swap;
        swap(ptr, other.ptr);
        return *this;
    }
};

Implicitly Declared Move Assignment Operator

As with the move constructor, the compiler will automatically generate a move assignment operator if there is no:

  • user-declared copy constructors;
  • user-declared move constructors;
  • user-declared copy assignment operators;
  • user-declared destructor.

As with the move constructor, we can force the compiler to generate it by using the default keyword and prevent its generation by using the delete keyword.

Using std::move to Move Lvalues

Being able to move only temporaries would be very limiting. There are times when we want to create an object, perform some operations on it and then move it. Or we hold non-temporaries. Or we want to move an accepted rvalue further (when rvalue references are used as function parameters, they become lvalues inside these functions; they are referred to by name, after all).

In these cases we must use an explicit call to std::move, which casts lvalues to rvalues, so that the correct overload is selected when our objects are passed further.

More than that, std::move also serves a second purpose, and I cannot emphasize this enough: it explicitly indicates that an object may be moved from. When it comes to ownership and resource handling, explicit is better than implicit, so this is a huge improvement over C++03 (see auto_ptr).
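Both points can be checked with a pair of overloads. In the sketch below the functions which, pass_by_name, pass_with_move and pass_with_cast are made up. The last one shows what std::move amounts to for this particular type: a cast and nothing more.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string which(const std::string&) { return "lvalue"; }
std::string which(std::string&&) { return "rvalue"; }

// Inside the function `s` has a name, so it is an lvalue again.
std::string pass_by_name(std::string&& s) { return which(s); }

// std::move turns it back into an rvalue for the next call.
std::string pass_with_move(std::string&& s) { return which(std::move(s)); }

// The same effect written as a plain cast; no data is touched.
std::string pass_with_cast(std::string&& s)
{
    return which(static_cast<std::string&&>(s));
}

void example()
{
    assert(pass_by_name(std::string("x")) == "lvalue");
    assert(pass_with_move(std::string("x")) == "rvalue");
    assert(pass_with_cast(std::string("x")) == "rvalue");
}
```

Note that std::move itself moves nothing. The actual move happens only if the selected overload decides to steal the contents.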

Let’s consider our Foo class from the Move Constructor part, but let’s change it to hold values instead of pointers. Again, we want to move objects of type Foo<T>. If the type T itself supports moving, we can implement the move very efficiently, completely sidestepping the default initialisation of the held object:

template <typename T>
class Foo
{
    T obj;

public:
    Foo(Foo&& other): obj(std::move(other.obj)) {}
};

Alternatively, we could swap obj and other.obj, but that means an unnecessary initialisation of obj first. It might or might not concern us, because default initialisation isn’t always cheap. The swap itself would be efficient though, because std::swap requires the swapped objects to be MoveAssignable and MoveConstructible. Custom swap implementations, on the other hand, are usually provided precisely when they are more efficient than the generic version, so we’re good here as well.

If the underlying type doesn’t support moving, there’s nothing we can do and the object must be copied inside the move constructor. Alternatively, we can implement moving anyway and require the template type to support it, giving a compilation error if it doesn’t. This approach works if we write e.g. library code.
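The fallback to copying happens on its own, which the following sketch tries to show. CopyOnly stands for a hypothetical legacy type without a move constructor and Holder is a renamed stand-in for the value-holding Foo<T>.

```cpp
#include <cassert>
#include <utility>

// Hypothetical legacy type: it has a copy constructor and no move constructor.
struct CopyOnly
{
    static int copies;

    CopyOnly() = default;
    CopyOnly(const CopyOnly&) { ++copies; }
};
int CopyOnly::copies = 0;

template <typename T>
class Holder
{
    T obj;

public:
    Holder() = default;

    // std::move only casts; when T has no move constructor, overload
    // resolution picks T's copy constructor instead.
    Holder(Holder&& other) : obj(std::move(other.obj)) {}
};

void example()
{
    CopyOnly::copies = 0;

    Holder<CopyOnly> a;
    Holder<CopyOnly> b(std::move(a));

    assert(CopyOnly::copies == 1);  // the "move" was in fact a copy
}
```

The move constructor of Holder stays the same for movable and copy-only types; only its cost differs.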

Perfect Forwarding

A special case of rvalue references are rvalue references to deduced template parameters. Such a reference is called a forwarding reference, and it allows perfect forwarding of arguments.

template <typename T>
void foo(T&& arg)
{
    std::forward<T>(arg);
}

Forwarding references preserve their value category (either rvalue or lvalue) when forwarded with std::forward. It means that, depending on the value category the argument had when originally passed to the function, it will be forwarded to the correct overload, accepting either rvalue or lvalue.
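A sketch with two overloads shows the effect. The names which, relay and relay_by_name are invented here. The second template deliberately leaves out std::forward to show what is lost without it.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string which(const std::string&) { return "lvalue"; }
std::string which(std::string&&) { return "rvalue"; }

// Forwards `arg` with the value category it had at the call site.
template <typename T>
std::string relay(T&& arg)
{
    return which(std::forward<T>(arg));
}

// Without std::forward the named parameter is always an lvalue.
template <typename T>
std::string relay_by_name(T&& arg)
{
    return which(arg);
}

void example()
{
    std::string s = "x";
    assert(relay(s) == "lvalue");
    assert(relay(std::string("x")) == "rvalue");
    assert(relay(std::move(s)) == "rvalue");
    assert(relay_by_name(std::string("x")) == "lvalue");
}
```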

A lot of functions in the standard library use this technique to construct objects in place, meaning that no extra copies or moves are involved, because all arguments are perfectly forwarded to the appropriate constructors. A good example is std::make_unique, a utility function used for constructing unique_ptr. It accepts a template parameter pack, which is perfectly forwarded to the constructor of the underlying type by expanding std::forward over the pack.

template<class T, class... U>
std::unique_ptr<T> make_unique(U&&... u)
{
    return std::unique_ptr<T>(new T(std::forward<U>(u)...));
}

auto&& serves the same purpose as a forwarding reference.
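With auto&& there is no template parameter to hand to std::forward, so decltype of the variable is typically used instead. A minimal sketch, with made-up function names:

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string which(const std::string&) { return "lvalue"; }
std::string which(std::string&&) { return "rvalue"; }

std::string make() { return "temporary"; }

std::string demo_lvalue()
{
    std::string named = "named";
    auto&& ref = named;  // deduced as std::string&
    return which(std::forward<decltype(ref)>(ref));
}

std::string demo_rvalue()
{
    auto&& ref = make();  // deduced as std::string&&
    return which(std::forward<decltype(ref)>(ref));
}

void example()
{
    assert(demo_lvalue() == "lvalue");
    assert(demo_rvalue() == "rvalue");
}
```

This form is handy in generic lambdas and range-based for loops, where the type of the variable isn't spelled out anywhere.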

Avoiding Repetition

Don’t Repeat Yourself (DRY) is an important programming principle, which is easily broken with all the new constructs. Copying and moving don’t differ that much after all. It resembles the situation when we create const and non-const versions of our interface: fundamentally, all algorithms remain the same, but we need to copy-paste big chunks of code.

struct StringContainer {
  std::string s;

  StringContainer(const std::string& val) : s(val) {}
  StringContainer(std::string&& val) : s(std::move(val)) {}

  StringContainer(const StringContainer& other) : s(other.s) {}
  StringContainer(StringContainer&& other) : s(std::move(other.s)) {}

  StringContainer& operator=(const StringContainer& other) {
      s = other.s;
      return *this;
  }

  StringContainer& operator=(StringContainer&& other) {
      s = std::move(other.s);
      return *this;
  }
};

Can we do better than that?

Accepting Parameters by Value

A very simple but powerful idea is to write functions which accept their arguments by value. We move (or copy) into these parameters and then move from them, for a total of two moves for passed rvalues and a copy plus a move for lvalues. On top of that, compilers elide [5] the first move when the argument is a temporary, so in practice such a call costs only a single move. The copy made for an lvalue cannot be elided.

struct StringContainer {
  std::string s;

  StringContainer(std::string val) : s(std::move(val)) {}
};
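The cost of this approach can be measured with a type that counts its own copies and moves. In the sketch below Counted is a hypothetical stand-in for std::string and Container plays the role of StringContainer. The numbers for the temporary assume that the compiler elides the move into the parameter, which is guaranteed since C++17.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copies and moves.
struct Counted
{
    static int copies;
    static int moves;

    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

struct Container
{
    Counted c;

    // Sink parameter: taken by value, then moved into the member.
    explicit Container(Counted val) : c(std::move(val)) {}
};

void example()
{
    Counted named;

    Counted::copies = Counted::moves = 0;
    Container from_lvalue(named);            // copy into val, move into c
    assert(Counted::copies == 1 && Counted::moves == 1);

    Counted::copies = Counted::moves = 0;
    Container from_moved(std::move(named));  // move into val, move into c
    assert(Counted::copies == 0 && Counted::moves == 2);

    Counted::copies = Counted::moves = 0;
    Container from_temporary{Counted()};     // val built in place, move into c
    assert(Counted::copies == 0 && Counted::moves == 1);
}
```

Compared with a hand-written pair of const& and && overloads, the by-value version costs one extra move in the first two cases, which is usually a fair price for having a single function.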

Accepting Forwarding References

Perfect forwarding is another means of avoiding repetition, although it is not as straightforward as it looks.

The most straightforward implementation is this:

struct StringContainer {
  std::string s;

  template <typename T>
  StringContainer(T&& val) : s(std::forward<T>(val)) {}
};

Such a constructor has one problem though: T can be any type, including StringContainer itself, which makes it an unwanted (and incorrect) copy constructor. Such an invocation wouldn’t compile of course, but to ensure that the template matches only the types we intend, we must use the SFINAE technique (Substitution Failure Is Not An Error) [6]. We can remove some overloads from the overload set of StringContainer constructors with std::enable_if:

template <
  typename T,
  std::enable_if_t<
    std::is_convertible_v<std::remove_cvref_t<T>, std::string>, int> = 0>
StringContainer(T&& val) : s{std::forward<T>(val)} {}

The above constructor is optimal in terms of the number of copies and moves, but what this template actually says is far from obvious, especially to the uninitiated. Writing a correct enable_if is an art in itself, neither easy nor memorable, which is why in such cases the much simpler approach of accepting parameters by value is usually recommended.
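To check what such a constrained constructor accepts, here is a sketch that places it in a complete class. It is a C++14-flavoured variant of the constructor above (std::decay_t in place of std::remove_cvref_t, and ::value in place of the _v helpers), and demo is a made-up function.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct StringContainer
{
    std::string s;

    // Takes part in overload resolution only when the argument, stripped
    // of references and cv-qualifiers, converts to std::string.
    template <
        typename T,
        std::enable_if_t<
            std::is_convertible<std::decay_t<T>, std::string>::value, int> = 0>
    StringContainer(T&& val) : s(std::forward<T>(val)) {}
};

static_assert(std::is_constructible<StringContainer, const char*>::value, "");
static_assert(std::is_constructible<StringContainer, std::string&>::value, "");
static_assert(!std::is_constructible<StringContainer, int>::value, "");
// Copying goes through the implicit copy constructor, not the template.
static_assert(std::is_copy_constructible<StringContainer>::value, "");

void demo()
{
    std::string name = "bar";
    StringContainer from_lvalue(name);             // copies name
    StringContainer from_rvalue(std::move(name));  // moves from name
    StringContainer copy(from_lvalue);             // implicit copy constructor
    assert(copy.s == "bar");
    assert(from_rvalue.s == "bar");
}
```

The third construction is the case the unconstrained template gets wrong: there the template would be chosen for a non-const StringContainer lvalue and fail to compile.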

Another way of restricting the accepted types is to use concepts, a C++20 feature:

template<class T>
concept IsString = std::is_convertible_v<T, std::string>;

template <IsString T>
StringContainer(T&& val) : s(std::forward<T>(val)) {}

The Rule of Four

What about duplicating all the copy and move constructors and assignment operators? That’s a lot of very similar code. This question brings us to The Rule of Four (and a half), which is an extension of The Rule of Three for objects which support moving.

In short, it says that if we define a custom destructor, copy constructor, copy assignment operator, move constructor or move assignment operator, we must define all of them. And we do that by:

  • implementing a custom swap function (known also as copy-and-swap idiom - the half part);
  • implementing assignment operators in a single function which accepts parameters by value.

So let’s go back to our initial example of Foo<T>, which manages a pointer to T. For simplicity, the default constructor only initialises it to nullptr, which prevents undefined behaviour when deleting it [7].

template <typename T>
struct Foo
{
    T* ptr;

    Foo() : ptr(nullptr) {}

    // copy constructor deep copies T, meaning that Foo<T> "owns" the data
    // and deleting it is safe. It "owns" in quotes because ptr is public.
    // Let's say that I'm not very good at creating examples :)
    // An empty `other` is copied as empty, so nullptr is never dereferenced.
    Foo(const Foo& other) : ptr(other.ptr ? new T(*other.ptr) : nullptr) {}

    // I decided to default-initialise the current object in the move
    // constructor and swap it with the moved-from object. It's not strictly
    // necessary, but I did it for the sake of hygiene. In this case it helps
    // when the destructor of `other` is called: it'll call delete nullptr.
    Foo(Foo&& other) : Foo()
    {
        swap(*this, other);
    }

    ~Foo() { delete ptr; }

    Foo& operator=(Foo other)
    {
        swap(*this, other);
        return *this;
    }

    friend void swap(Foo& lhs, Foo& rhs) {
        using std::swap;
        swap(lhs.ptr, rhs.ptr);
    }
};

We use swap function 3 times: twice in assignment operators (technically implemented by a single function, but we’re so proud that we count it twice) and in a move constructor. This saves us a lot of code duplication especially if we must handle more than one data member.

Technically, swap could be implemented as ordinary, non-friend, free function. I chose to befriend it with Foo to avoid typing template parameters in it.
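To see the single assignment operator doing double duty, here is a compact sketch of the same pattern. Box is a hypothetical, non-template cousin of Foo<T> that owns one int; I made its pointer private and added a get() accessor just for the checks.

```cpp
#include <cassert>
#include <utility>

// Hypothetical class following the same pattern as Foo<T>.
class Box
{
    int* ptr;

public:
    Box() : ptr(nullptr) {}
    explicit Box(int v) : ptr(new int(v)) {}
    Box(const Box& other) : ptr(other.ptr ? new int(*other.ptr) : nullptr) {}
    Box(Box&& other) noexcept : Box() { swap(*this, other); }
    ~Box() { delete ptr; }

    // One operator serves as both copy and move assignment: `other` is
    // copy- or move-constructed depending on the argument.
    Box& operator=(Box other) noexcept
    {
        swap(*this, other);
        return *this;
    }

    friend void swap(Box& lhs, Box& rhs) noexcept
    {
        using std::swap;
        swap(lhs.ptr, rhs.ptr);
    }

    const int* get() const { return ptr; }
};

void example()
{
    Box a(1);

    Box b;
    b = a;  // copy assignment: b gets its own int
    assert(*b.get() == 1 && b.get() != a.get());

    const int* raw = a.get();
    Box c;
    c = std::move(a);  // move assignment: c takes over a's int
    assert(c.get() == raw);
    assert(a.get() == nullptr);
}
```

In both assignments the old contents of the left-hand side end up in the by-value parameter and are released when it goes out of scope.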

Reusing moved-from objects

In the beginning I said that using moved-from object might give surprising (unspecified) results and now I say that someone might want to reuse it?

That’s right, but you must prepare your objects for such mechanics (i.e. provide appropriate interface). I usually don’t bother with such things in my custom types, but if you’re writing a library, for example, you might be tempted to do so.

unique_ptr

A good example of an interface which is prepared for reuse is std::unique_ptr. A moved-from unique_ptr is empty (it holds nullptr), so dereferencing it would be undefined behaviour, but the class provides an interface to re-initialise it with a different object:

void fn(std::unique_ptr<std::string>);

auto ptr = std::make_unique<std::string>("foobar");
fn(std::move(ptr));  // at this point ptr is empty (it holds nullptr)
ptr = std::make_unique<std::string>("bazbaz");  // ptr is safe to use again

unique_ptr also provides a reset() method which serves the same purpose.
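A short sketch of the reset() variant, with a made-up sink function consume and a made-up wrapper reuse_after_move:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sink: takes ownership and destroys the string on return.
void consume(std::unique_ptr<std::string>) {}

std::string reuse_after_move()
{
    auto ptr = std::make_unique<std::string>("foobar");
    consume(std::move(ptr));
    assert(ptr == nullptr);  // nothing to dereference at this point

    ptr.reset(new std::string("bazbaz"));  // hand it a new object to own
    return *ptr;
}
```

Assigning the result of make_unique, as shown earlier, is usually the nicer option; reset() is there for cases where we already hold a raw pointer.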

Overloading Methods for Rvalue Objects

There are times when it’s desirable to act differently on temporary objects than on their non-temporary counterparts. In C++ we can do this by overloading methods for rvalues and lvalues, which is denoted by adding either && or & at the end of the method’s signature. && methods will be called only for rvalues and & methods for lvalues.

It’s a rather obscure feature, which is rarely necessary, yet it’s still good to know about it. Here’s an example:

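#include <iostream>
#include <string>
#include <utility>

class Foo {
  std::string s{"foo"};

public:
  const std::string& foo() const & {
    std::cout << "foo&" << std::endl;
    return s;
  }

  // not const: a const rvalue cannot be moved from, it would be copied
  std::string foo() && {
    std::cout << "foo&&" << std::endl;
    return std::move(s);
  }
};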

int main() {
    Foo foo;
    std::string slv = foo.foo();    // foo() &
    std::string srv = Foo().foo();  // foo() &&
}
OUTPUT:
foo&
foo&&

  1. It would be strange to copy yourself to the new flat though. 

  2. auto_ptr has been deprecated by C++11 and removed from the standard in C++17. 

  3. Set to nullptr

  4. We speak about shallow copy when pointers are copied between the objects. Deep copy, on the other hand, copies the underlying data. 

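  5. Copy elision is mandatory in some cases since C++17, for example when an object is initialised from a temporary (prvalue) of the same type.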

  6. The name is technical and confusing, but it is only a method of forcing a wanted overload resolution in template metaprogramming. If 

  7. In C++ it is safe to delete nullptr.