CS100 Lecture 23
More on STL: Sequence containers and associative containers
Contents
More on STL: Sequence containers and associative containers
- Overview of STL
- Sequence containers
- Associative containers
Overview of STL
Standard Template Library
Added into C++ in 1994.
- Containers
- Iterators (In Lecture 20)
- Algorithms (In Lecture 20)
- Function objects
- Some other adaptors, like container adaptors and iterator adaptors
- Allocators
The next generation: C++20 Ranges
Containers
- Sequence containers: vector, list, deque, array (since C++11), forward_list (since C++11)
- Associative containers: set, map, multiset, multimap (often implemented with binary search trees)
- Unordered associative containers (since C++11): unordered_set, unordered_map, unordered_multiset, unordered_multimap (implemented with hash tables)
- Container adaptors: stack, queue, priority_queue. They provide a different interface for sequence containers, but they are not containers themselves.
- (since C++23) flat_set, flat_map, flat_multiset, flat_multimap
Iterators
Without iterators:
- Traverse an array
for (int i = 0; i != sizeof(a) / sizeof(a[0]); ++i) do_something(a[i]);
- Traverse a vector
for (std::size_t i = 0; i != v.size(); ++i) do_something(v[i]);
- Traverse a linked-list?
for (ListNode *p = l.head(); p; p = p->next) do_something(p->data);
Iterators
A generalization of pointers, used to access elements in different containers in a uniform manner.
With iterators:
The following works no matter whether c is an array, a std::string, or any container.
for (auto it = std::begin(c); it != std::end(c); ++it)
  do_something(*it);
Equivalent way: range-based for loops
for (auto &x : c) do_something(x);
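As a minimal sketch of this uniformity, the helper below walks any range through std::begin and std::end. The names count_elements and traverse_demo are hypothetical and only serve as illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: counts the elements of `c` by stepping through
// the iterator range [std::begin(c), std::end(c)).
template <typename Range>
std::size_t count_elements(const Range &c) {
  std::size_t n = 0;
  for (auto it = std::begin(c); it != std::end(c); ++it)
    ++n;
  return n;
}

// The same function template works for a built-in array, a string,
// a vector and a linked list.
void traverse_demo() {
  int arr[] = {1, 2, 3};
  std::string str = "hello";
  std::vector<double> vec{1.5, 2.5};
  std::list<char> lst{'a', 'b', 'c', 'd'};
  assert(count_elements(arr) == 3);
  assert(count_elements(str) == 5);
  assert(count_elements(vec) == 2);
  assert(count_elements(lst) == 4);
}
```

Calling traverse_demo() from main runs the checks.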
Algorithms
The algorithms library defines functions for a variety of purposes: searching, sorting, counting, manipulating, ...
Examples:
// assign every element in `a` with the value `x`.
std::fill(a.begin(), a.end(), x);
// sort the elements in `b` in ascending order.
std::sort(b.begin(), b.end());
// find the first element in `b` that is equal to `x`.
auto pos = std::find(b.begin(), b.end(), x);
// reverse the elements in `c`.
std::reverse(c.begin(), c.end());
Algorithms
Example: Obtain the rank of each number in a sequence.
auto getRank(const std::vector<int> &data) {
  auto tmp = data;
  std::sort(tmp.begin(), tmp.end()); // sort
  auto pos = std::unique(tmp.begin(), tmp.end()); // drop duplicates
  auto ret = data;
  for (auto &x : ret)
    x = std::lower_bound(tmp.begin(), pos, x) - tmp.begin(); // binary search
  return ret;
}
Function objects
Things that look like "functions": Callable
- functions, and also function pointers
- objects of a class type that has an overloaded operator() (the function-call operator)
- lambda expressions
The standard library has defined some common function objects: std::less, std::greater, ...
std::sort(a.begin(), a.end(), std::greater<>{}); // Sort in descending order.
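The sketch below passes each kind of callable to std::sort as the comparator. The names abs_less, AbsGreater and callable_demo are hypothetical, chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <functional>
#include <vector>

// 1. An ordinary function: compares by absolute value.
bool abs_less(int a, int b) { return std::abs(a) < std::abs(b); }

// 2. A class type with an overloaded function-call operator.
struct AbsGreater {
  bool operator()(int a, int b) const { return std::abs(a) > std::abs(b); }
};

void callable_demo() {
  std::vector<int> v{3, -1, -4, 2};

  std::sort(v.begin(), v.end(), abs_less);  // function (passed as a pointer)
  assert((v == std::vector<int>{-1, 2, 3, -4}));

  std::sort(v.begin(), v.end(), AbsGreater{});  // function object
  assert((v == std::vector<int>{-4, 3, 2, -1}));

  // 3. A lambda expression: plain ascending order.
  std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
  assert((v == std::vector<int>{-4, -1, 2, 3}));

  std::sort(v.begin(), v.end(), std::greater<>{});  // standard function object
  assert((v == std::vector<int>{3, 2, -1, -4}));
}
```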
Adaptors
Container adaptors: std::stack, std::queue, std::priority_queue
- Represent the stack, queue and the priority-queue data structures respectively.
- They are not containers themselves. They are based on some underlying container, and provide the interfaces of the corresponding data structures.
std::stack<int> stk; // By default, uses `std::deque<int>` as
// the underlying container.
std::stack<int, std::vector<int>> stk2; // Uses `std::vector<int>`.
Iterator adaptors: To be discussed in recitations.
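A short sketch of the three container adaptors in use, assuming only their standard push/pop/top/front interfaces. The function name adaptor_demo is hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <vector>

void adaptor_demo() {
  std::stack<int, std::vector<int>> stk;  // last in, first out
  stk.push(1);
  stk.push(2);
  stk.push(3);
  assert(stk.top() == 3);
  stk.pop();
  assert(stk.top() == 2);

  std::queue<int> q;  // first in, first out
  q.push(1);
  q.push(2);
  q.push(3);
  assert(q.front() == 1 && q.back() == 3);
  q.pop();
  assert(q.front() == 2);

  std::priority_queue<int> pq;  // by default, the largest element is on top
  pq.push(5);
  pq.push(42);
  pq.push(7);
  assert(pq.top() == 42);
  pq.pop();
  assert(pq.top() == 7);
}
```

Note that the adaptors expose no iterators: only the elements at the ends (or the top) are accessible.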
Sequence containers
Note: std::string is not treated as a container but behaves much like one.
Sequence containers
std::vector<T>: dynamic contiguous array (which we are quite familiar with)
std::deque<T>: double-ended queue (often pronounced as "deck")
- std::deque<T> supports fast insertion and deletion at both its beginning and its end (push_front, pop_front, push_back, pop_back).
std::array<T, N>: same as T[N], but it is a container
- It will never decay to T *.
- Container interfaces are provided: .at(i), .front(), .back(), .size(), ..., as well as iterators.
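A minimal sketch of the two containers described above. The helper names size_of and deque_array_demo are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

// Unlike int[3], a std::array does not decay to a pointer when passed
// to a function, so its size is still available inside.
std::size_t size_of(const std::array<int, 3> &a) { return a.size(); }

void deque_array_demo() {
  std::deque<int> d{2, 3};
  d.push_front(1);  // {1, 2, 3}
  d.push_back(4);   // {1, 2, 3, 4}
  d.pop_front();    // {2, 3, 4}
  assert(d.front() == 2 && d.back() == 4 && d.size() == 3);

  std::array<int, 3> a{10, 20, 30};
  assert(size_of(a) == 3);
  assert(a.front() == 10 && a.back() == 30 && a.at(1) == 20);

  std::array<int, 3> b = a;  // can be copied as a whole, unlike int[3]
  assert(b == a);
}
```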
Sequence containers
std::list<T>: doubly-linked list
- std::list<T> supports fast insertion and deletion anywhere in the container,
- but fast random access is not supported (i.e. no operator[]).
- Bidirectional traversal is supported.
std::forward_list<T>: singly-linked list
- Intended to save time and space (compared to std::list).
- Only forward traversal is supported.
Interfaces
STL containers have consistent interfaces. See here for a full list.
Element access:
- c.at(i), c[i]: access the element indexed i. at performs bounds checking, and throws std::out_of_range if i exceeds the valid range.
- c.front(), c.back(): access the front/back element.
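The difference between at and operator[] can be sketched as follows. The helper at_throws and the function element_access_demo are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if v.at(i) throws std::out_of_range.
bool at_throws(const std::vector<int> &v, std::size_t i) {
  try {
    (void)v.at(i);
    return false;
  } catch (const std::out_of_range &) {
    return true;
  }
}

void element_access_demo() {
  std::vector<int> v{1, 2, 3};
  assert(v[0] == 1 && v.at(2) == 3);
  assert(v.front() == 1 && v.back() == 3);
  assert(!at_throws(v, 2));  // index 2 is valid
  assert(at_throws(v, 3));   // index 3 is out of range: at() throws,
                             // whereas v[3] would be undefined behavior
}
```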
Interfaces
Size and capacity: c.size() and c.empty() are what we already know.
- c.resize(n), c.resize(n, x): adjust the container to have exactly n elements. If n > c.size(), n - c.size() elements will be appended. If n < c.size(), the extra elements at the end are removed.
  - c.resize(n): Appended elements are value-initialized.
  - c.resize(n, x): Appended elements are copies of x.
- c.capacity(), c.reserve(n), c.shrink_to_fit(): only for string and vector (shrink_to_fit is also available for deque).
  - c.capacity(): returns the capacity (number of elements that can be stored in the current storage).
  - c.reserve(n): reserves space for at least n elements.
  - c.shrink_to_fit(): requests to remove the unused capacity, so that c.capacity() == c.size(). The request is non-binding.
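A small sketch of how these functions affect size and capacity, using std::vector. The function name size_capacity_demo is hypothetical.

```cpp
#include <cassert>
#include <vector>

void size_capacity_demo() {
  std::vector<int> v{1, 2, 3};

  v.resize(5);  // {1, 2, 3, 0, 0}: appended elements are value-initialized
  assert(v.size() == 5 && v[3] == 0 && v[4] == 0);

  v.resize(7, 9);  // {1, 2, 3, 0, 0, 9, 9}: appended elements are copies of 9
  assert(v.size() == 7 && v[5] == 9 && v.back() == 9);

  v.resize(2);  // {1, 2}: shrinking removes elements from the end
  assert(v.size() == 2 && v.back() == 2);

  v.reserve(100);  // changes the capacity only, never the size
  assert(v.capacity() >= 100 && v.size() == 2);

  v.shrink_to_fit();  // a non-binding request to drop unused capacity
  assert(v.capacity() >= v.size() && v.size() == 2);
}
```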
Interfaces
Modifiers:
- c.push_back(x), c.emplace_back(args...), c.pop_back(): insert/delete elements at the end of the container.
- c.push_front(x), c.emplace_front(args...), c.pop_front(): insert/delete elements at the beginning of the container.
- c.clear() removes all the elements in c.
Interfaces
Modifiers:
- c.insert(...), c.emplace(...), c.erase(...): insert/delete elements at a specified location.
  - Warning: For containers that store their elements in arrays (string, vector, deque), insertion and deletion somewhere in the middle can be very slow (\(O(n)\)), since other elements have to be shifted.
  - These functions have a lot of overloads. Remember a few common ones, and STFW (Search The Friendly Web) when you need to use them.
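A sketch of a few common overloads of insert and erase, taking an iterator (or an iterator range) as the position. The function name insert_erase_demo is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

void insert_erase_demo() {
  std::vector<int> v{1, 2, 5};
  auto pos = v.insert(v.begin() + 2, 4);  // insert before index 2: {1, 2, 4, 5}
  assert(*pos == 4);                      // returns an iterator to the new element
  v.insert(pos, 3);                       // {1, 2, 3, 4, 5}
  assert((v == std::vector<int>{1, 2, 3, 4, 5}));

  v.erase(v.begin());                     // erase one element: {2, 3, 4, 5}
  v.erase(v.begin() + 1, v.begin() + 3);  // erase the range holding 3 and 4: {2, 5}
  assert((v == std::vector<int>{2, 5}));

  // The same interface on a linked list, where no elements are shifted.
  std::list<int> l{1, 3};
  auto it = l.begin();
  ++it;             // points to 3
  l.insert(it, 2);  // {1, 2, 3}
  assert((l == std::list<int>{1, 2, 3}));
}
```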
Interfaces
Some of these member functions are not supported on some containers, depending on the underlying data structure. For example:
- Any operation that modifies the length of the container is not allowed for array.
- push_front, emplace_front and pop_front are not supported on string, vector and array.
- size is not supported on forward_list in order to save time and space.
- operator[] and at are not supported on linked-lists.
This table tells you everything.
Iterators
Every container has its iterator: Container::iterator, e.g. std::vector<int>::iterator, std::forward_list<std::string>::iterator.
- auto comes to our rescue!
c.begin() returns the iterator to the first element of c.
c.end() returns the iterator to the position following the last element of c.
Iterator categories
ForwardIterator: supports *it, it->mem, ++it, it++, it1 == it2, it1 != it2
BidirectionalIterator: a ForwardIterator that can be moved in both directions
- supports --it and it--.
RandomAccessIterator: a BidirectionalIterator that can be moved to point to any element in constant time.
- supports it + n, n + it, it - n, it += n, it -= n for an integer n.
- supports it[n], equivalent to *(it + n).
- supports it1 - it2, which returns the distance between two iterators.
- supports <, <=, >, >=.
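The operations listed above can be tried out on the iterators of three containers. This is only a sketch, and the function name iterator_ops_demo is hypothetical.

```cpp
#include <cassert>
#include <forward_list>
#include <list>
#include <vector>

void iterator_ops_demo() {
  // RandomAccessIterator: arithmetic, subscript, difference, ordering.
  std::vector<int> v{10, 20, 30, 40};
  auto vi = v.begin();
  assert(*(vi + 2) == 30);
  assert(vi[3] == 40);
  assert(v.end() - vi == 4);
  assert(vi < v.end());

  // BidirectionalIterator: can step backwards, but has no it + n.
  std::list<int> l{10, 20, 30};
  auto li = l.end();
  --li;
  assert(*li == 30);

  // ForwardIterator: can only step forward.
  std::forward_list<int> f{10, 20};
  auto fi = f.begin();
  ++fi;
  assert(*fi == 20);

  // The following would not compile:
  //   li + 2;  li[1];  --fi;
}
```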
Iterator categories
ForwardIterator: an iterator that can be moved forward.
- forward_list<T>::iterator
BidirectionalIterator: a ForwardIterator that can be moved in both directions
- list<T>::iterator
RandomAccessIterator: a BidirectionalIterator that can be moved to point to any element in constant time.
- string::iterator, vector<T>::iterator, deque<T>::iterator, array<T,N>::iterator
Iterator categories
To know the category of an iterator of a container, consult its type alias member iterator_category.
using vec_iter = std::vector<int>::iterator;
using category = vec_iter::iterator_category;
Put your mouse on category, and the IDE will tell you what it is.
It is one of the following tags: std::forward_iterator_tag, std::bidirectional_iterator_tag, std::random_access_iterator_tag.
Note: Two other categories InputIterator and OutputIterator will be discussed in recitations.
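Instead of hovering in an IDE, the category can also be checked at compile time. The sketch below goes through std::iterator_traits, which additionally works for raw pointers; the alias category_of and the variable template is_random_access are hypothetical names.

```cpp
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical alias: the category tag of the iterator type `Iter`.
template <typename Iter>
using category_of = typename std::iterator_traits<Iter>::iterator_category;

// Hypothetical helper: true if `Iter` is tagged as a random-access iterator.
template <typename Iter>
constexpr bool is_random_access =
    std::is_same_v<category_of<Iter>, std::random_access_iterator_tag>;

static_assert(is_random_access<std::vector<int>::iterator>);
static_assert(is_random_access<int *>);  // pointers are random-access iterators
static_assert(std::is_same_v<category_of<std::list<int>::iterator>,
                             std::bidirectional_iterator_tag>);
static_assert(std::is_same_v<category_of<std::forward_list<int>::iterator>,
                             std::forward_iterator_tag>);
```

If any of these assertions were wrong, the program would fail to compile.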
Constructors of containers
All sequence containers except array can be constructed in the following ways:
- Container c(b, e), where [b, e) is an iterator range.
  - Copies elements from the iterator range [b, e).
- Container c(n, x), where n is a nonnegative integer and x is a value.
  - Initializes the container with n copies of x.
- Container c(n), where n is a nonnegative integer.
  - Initializes the container with n elements. All elements are value-initialized.
  - This is not supported by string. (Why?) Because it is meaningless to have n value-initialized chars (all of them will be '\0')!
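A sketch of these three constructors on a few containers. The function name constructor_demo is hypothetical. The last two lines show a common pitfall: for a vector of ints, braces select the initializer-list constructor rather than the (n, x) one.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <string>
#include <vector>

void constructor_demo() {
  std::vector<int> v{1, 2, 3, 4};

  std::list<int> l(v.begin(), v.end());  // (b, e): copy of {1, 2, 3, 4}
  assert(l.size() == 4 && l.front() == 1 && l.back() == 4);

  std::deque<int> d(v.begin() + 1, v.end());  // (b, e): {2, 3, 4}
  assert(d.size() == 3 && d.front() == 2);

  std::string s(3, 'x');  // (n, x): "xxx"
  assert(s == "xxx");

  std::vector<double> zeros(3);  // (n): three value-initialized doubles
  assert(zeros.size() == 3 && zeros[0] == 0.0 && zeros[2] == 0.0);

  std::vector<int> a(3, 7);  // parentheses: {7, 7, 7}
  std::vector<int> b{3, 7};  // braces: {3, 7}
  assert(a.size() == 3 && b.size() == 2);
}
```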
Associative containers
Motivation: set
Represent a "set": - Quick insertion, lookup and deletion of elements. - Order does not matter.
Sequence containers do not suffice:
- Lookup of elements is \(O(n)\).
- Quick insertion/deletion only happens at certain positions for some containers.
- e.g. vector
only supports quick insertion/deletion at the end.
- The order of elements is preserved, which is not important.
You will learn the appropriate data structures in CS101.
std::set
Defined in <set>.
- std::set<T> is a set whose elements are of type T. operator<(const T, const T) should be supported, because set is usually implemented as a red-black tree.
- std::set<T, Cmp> is also available. x < y will be replaced with cmp(x, y), where cmp is a function object of type Cmp.
std::set<int> s1; // An empty set of ints
std::set<std::string> s2{"hello", "world"}; // A set of strings,
// initialized with two elements
struct Student { std::string name; int id; };
std::set<Student> s3; // No operator< for Student is available.
// This line alone does not cause error, but you cannot
// insert elements into it.
s3.insert(Student{"Alice", 42}); // Error: No operator< available.
std::set
struct Student { std::string name; int id; };
struct CmpStudentByName {
  bool operator()(const Student &a, const Student &b) const {
    return a.name < b.name;
  }
};
std::set<Student, CmpStudentByName> students; // OK
students.insert(Student{"Alice", 42}); // OK
std::set
Constructors
std::set<Type> s1{a, b, c, ...};
std::set<Type> s2(begin, end); // An iterator range [begin, end)
C++17 CTAD (Class Template Argument Deduction) also applies:
std::set s1{a, b, c, ...}; // Element type is deduced according to the list
std::set s2(begin, end); // Element type is deduced according to
// the type of elements pointed by `begin` and `end`.
Besides, std::set is copy-constructible, copy-assignable, move-constructible and move-assignable, just as the sequence containers we have learned.
std::set does not contain duplicate elements. These constructors will ignore duplicate elements.
std::set: operations
Common operations: s.empty(), s.size(), s.clear().
Insertion: insert and emplace. Duplicate elements will not be inserted.
- s.insert(x), s.insert({a, b, ...}), s.insert(begin, end).
std::set s{3, 2, 5, 5, 1}; // {1, 2, 3, 5}. The duplicate 5 is removed.
std::cout << s.size() << std::endl; // 4
s.insert(42); // {1, 2, 3, 5, 42}
s.insert(42); // Nothing is inserted. (No errors.)
int a[]{10, 20, 30};
s.insert(a, a + 3); // An iterator range.
// s now contains {1, 2, 3, 5, 10, 20, 30, 42}.
s.insert({11, 12}); // {1, 2, 3, 5, 10, 11, 12, 20, 30, 42}.
std::set: insertion
Insertion: insert and emplace. Duplicate elements will not be inserted.
- s.emplace(args...). Forwards the arguments args... to the constructor of the element type, and constructs the element in-place.
std::set<std::string> s;
s.emplace(10, 'c'); // inserts a string "cccccccccc"
s.insert(x) and s.emplace(args...) return std::pair<iterator, bool>:
- On success, .first is an iterator pointing to the inserted element, and .second is true.
- On failure, .first is an iterator pointing to the element that prevented the insertion, and .second is false.
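A minimal sketch of inspecting that return value. The function name insert_result_demo is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

void insert_result_demo() {
  std::set<std::string> s;

  auto r1 = s.insert("hello");  // success
  assert(r1.second && *r1.first == "hello");

  auto r2 = s.insert("hello");  // failure: the element already exists
  assert(!r2.second);
  assert(r2.first == r1.first);  // points to the element that blocked it

  // With C++17 structured bindings the pair can be unpacked directly.
  auto [it, inserted] = s.emplace(3, 'a');  // constructs "aaa" in place
  assert(inserted && *it == "aaa");
  assert(s.size() == 2);
}
```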
std::set: iterators
s.begin(), s.end(): Begin and off-the-end iterators.
The iterator of std::set is a BidirectionalIterator:
- Supports *it, it->mem, ++it, it++, --it, it--, it1 == it2, it1 != it2.
The elements are in ascending order: The following assertion always succeeds (if both tmp and iter are dereferenceable).
auto tmp = iter;
++iter;
assert(*tmp < *iter);
std::set: iterators
Elements in a set cannot be modified directly: *iter returns a reference-to-const.
- The elements are stored in specific positions in the red-black tree, according to their values.
- You cannot change their values arbitrarily.
std::set: traversal
Range-for still works!
std::set<int> s{5, 5, 7, 3, 20, 12, 42};
for (auto x : s)
  std::cout << x << ' ';
std::cout << std::endl;
Output: 3 5 7 12 20 42. The elements are in ascending order, and the duplicate 5 appears only once.
Equivalent way: Use iterators
for (auto it = s.begin(); it != s.end(); ++it)
  std::cout << *it << ' ';
std::cout << std::endl;
std::set: deletion
Delete elements: erase
- s.erase(x), s.erase(pos), s.erase(begin, end), where pos is an iterator pointing to some element in s, and [begin, end) is an iterator range in s.
- s.erase(x) removes the element that is equivalent to x, if any.
  - returns 0 or 1, indicating the number of elements removed.
std::set<int> s{5, 5, 7, 3, 20, 12, 42};
std::cout << s.erase(42) << std::endl; // 42 is removed. output: 1
// s is now {3, 5, 7, 12, 20}.
s.erase(++++s.begin()); // 7 is removed.
std::set: element lookup
s.find(x), s.count(x), and some other functions.
s.find(x) returns an iterator pointing to the element equivalent to x (if found), or s.end() (if not found).
std::set<int> s = someValues();
if (s.find(x) != s.end()) { // x is found
  // ...
}
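A sketch of both lookup functions. Since a std::set holds no duplicates, count can only return 0 or 1. The names has_element and lookup_set_demo are hypothetical.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: membership test written with find.
bool has_element(const std::set<int> &s, int x) {
  return s.find(x) != s.end();
}

void lookup_set_demo() {
  std::set<int> s{3, 5, 7};

  auto it = s.find(5);
  assert(it != s.end() && *it == 5);  // found: the iterator points to 5
  assert(s.find(4) == s.end());       // not found: the off-the-end iterator

  assert(s.count(5) == 1);
  assert(s.count(4) == 0);

  assert(has_element(s, 7) && !has_element(s, 8));
}
```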
std::set: pros and cons
The time complexity of insertion, deletion, and lookup of elements in a std::set: logarithmic in the size of the container. (\(O(\log n)\))
- Compared to sequence containers, where lookup is \(O(n)\), this is a huge improvement in most cases.
Elements are sorted automatically.
Fast random access like v[i] is not supported.
Other kinds of sets:
Sets based on red-black trees:
- std::set
- std::multiset: allows duplicate elements
Sets based on hash tables: (since C++11)
- std::unordered_set: hash-table version of std::set
- std::unordered_multiset: allows duplicate elements
Sets based on hash tables provide (average-case) \(O(1)\) time operations, but require the data to be hashable.
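A brief sketch of two of these variants. The function name other_sets_demo is hypothetical. The iteration order of an unordered_set is unspecified, so the example only checks sizes and membership.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_set>

void other_sets_demo() {
  std::multiset<int> ms{5, 5, 7, 3};  // duplicates are kept
  assert(ms.size() == 4);
  assert(ms.count(5) == 2);
  assert(*ms.begin() == 3);  // still sorted in ascending order

  // std::string is hashable out of the box (std::hash<std::string>).
  std::unordered_set<std::string> us{"hello", "world", "hello"};
  assert(us.size() == 2);  // duplicates are ignored, as in std::set
  assert(us.count("world") == 1);
  assert(us.find("bye") == us.end());
}
```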
Motivation: map
Represent a map: \(f:S\to T\).
- For sequence containers Container<Type>: \(S=\{0,1,2,\cdots,N-1\}\) (index), \(T\) is the set of values of type Type.
- For std::set<Type>: \(T=\{\text{exist}, \text{not-exist}\}\), \(S\) is the set of values of type Type.
std::map<Key, Value>: defined in <map>
- Key is the type of elements in \(S\), and Value is the type of elements in \(T\).
- Stores "key-value" pairs.
Motivation: map
Example: Count the occurrences of strings.
std::map<std::string, int> counter; // maps every string to an integer
std::string word;
while (std::cin >> word)
  ++counter[word]; // !!
Now for any string str, counter[str] is an integer indicating how many times str has occurred.
std::map: comparison with std::set
std::map<Key, Value> has two template parameters: Key and Value.
- If we ignore Value, it is a std::set<Key>.
  - Duplicate keys are not allowed.
  - operator<(const Key, const Key) is required.
  - Elements are stored in ascending order of keys.
  - Keys cannot be modified directly.
- The element type of std::map<Key, Value> is std::pair<const Key, Value>.
  - *iter returns std::pair<const Key, Value> &.
std::map: comparison with std::set
Constructors:
- std::map<Key, Value> m{{key1, value1}, {key2, value2}, ...};
- std::map<Key, Value> m(begin, end), but the elements should be pairs:
std::vector<std::pair<int, int>> v{{1, 2}, {3, 4}};
std::map<int, int> m(v.begin(), v.end());
Insertion:
- m.insert({key, value})
- m.insert({{key1, value1}, {key2, value2}, ...})
- m.insert(begin, end)
std::map: comparison with std::set
Deletion:
- m.erase(pos), m.erase(begin, end): same as std::set<T>::erase.
- m.erase(key): Removes the element whose key is key.
Iterators: BidirectionalIterator, pointing to std::pair<const Key, Value>.
std::map<std::string, int> counter = someValues();
for (auto it = counter.begin(); it != counter.end(); ++it)
  std::cout << it->first << " occurred " << it->second << " times.\n";
std::map: traversal
Use range-for:
for (const auto &kvpair : counter)
  std::cout << kvpair.first << " occurred " << kvpair.second << " times.\n";
It's so annoying to deal with the pair stuff...
C++17 structured binding kills the game!
for (const auto &[word, occ] : counter)
  std::cout << word << " occurred " << occ << " times.\n";
(Looks very much like Python unpacking.)
std::map-specific: operator[]
m[key] finds the key-value pair whose key is equivalent to key.
- If such a key does not exist, inserts {key, Value{}}: the value is value-initialized.
- Then, returns a reference to the value.
std::map<std::string, int> counter;
std::string word;
while (std::cin >> word)
  ++counter[word]; // If `word` does not exist in `counter`,
                   // a pair {word, 0} is inserted first.
std::map: element lookup
m.find(key), m.count(key), and some other member functions.
Note: m.find(key) does not insert elements. m[key] will insert an element if that key does not exist.
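The difference can be observed through the size of the map. This is a minimal sketch, and the function name lookup_map_demo is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

void lookup_map_demo() {
  std::map<std::string, int> m{{"a", 1}};

  // find and count never modify the map.
  assert(m.find("b") == m.end());
  assert(m.count("b") == 0);
  assert(m.size() == 1);

  // operator[] inserts {"b", 0} because the key "b" does not exist.
  int value = m["b"];
  assert(value == 0 && m.size() == 2);

  // Since operator[] may insert, it cannot be used on a const map;
  // use find there instead.
  const auto &cm = m;
  auto it = cm.find("a");
  assert(it != cm.end() && it->second == 1);
}
```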
Other kinds of maps:
Maps based on red-black trees:
- std::map
- std::multimap: allows duplicate keys
Maps based on hash tables: (since C++11)
- std::unordered_map: hash-table version of std::map
- std::unordered_multimap: allows duplicate keys
Maps based on hash tables provide (average-case) \(O(1)\) time operations, but require the key to be hashable.
Summary
Sequence containers
std::vector<T>: dynamic contiguous array (which we are quite familiar with)
std::deque<T>: double-ended queue (often pronounced as "deck")
std::array<T, N>: same as T[N], but it is a container
std::list<T>: doubly-linked list
std::forward_list<T>: singly-linked list
Summary
Associative containers
std::set<T>: A finite set \(\{e_1,e_2,\cdots,e_n\}\) where elements are of type T.
std::map<Key, Value>: A map \(f:S\to T\), where \(S\) and \(T\) are the sets of values of type Key and Value respectively.
std::set and std::map are ordered: T and Key need to have an ordering, either in the form of operator< or some user-supplied comparator.
std::unordered_set and std::unordered_map are unordered and hash-based.