Hi everyone 
In this tutorial we will talk about Set
So, What is Set actually?
Set is an associative container of C++ STL that stores only unique elements. Each element inside a set is a key — there are no associated values (that is the job of map). The elements are always kept sorted in ascending order automatically.
Think of set as map without the values — just the keys. Or think of it as a sorted container that silently drops duplicates for us.
Since set stores only unique keys, it does not have the [ ] subscript operator or the at() method. Set is not about looking up values by keys — it is about answering the question “is this element present?” or iterating through the elements in sorted order.
Set’s underlying data structure is a Self-Balancing Binary Search Tree (usually a Red-Black Tree). That is why all its major operations — insertion, deletion, search — take O(log n) time, and that is also why the elements stay sorted automatically.
Header File Inclusion for set
#include <set>
Declaration & Initialization
Declaring an empty set —
set <int> numbers;
The set is empty right now.
Declaring a set with values —
From C++11 onwards, we can initialize a set with values using curly braces —
set <int> numbers = { 50, 20, 40, 10, 30 };
After Initialization numbers will look like this —
| 10 | 20 | 30 | 40 | 50 |
Notice that even though we inserted the values in the order 50, 20, 40, 10, 30, set has sorted them in ascending order.
Declaring a set with duplicate values —
If we try to initialize a set with duplicate values, the duplicates are silently dropped —
set <int> numbers = { 10, 20, 30, 20, 40, 10, 50 };
After Initialization numbers will look like this —
| 10 | 20 | 30 | 40 | 50 |
The duplicate 10 and 20 were ignored. This is one of the most common uses of set — to deduplicate a collection of values.
Declaring sets of different types —
Element type can be anything that supports the < operator —
set <double> scores;
set <string> nameList;
set <char> letters;
set <pair<int, int> > points;
Declaring a set in descending order —
By default, set stores elements in ascending order. If we want elements in descending order, we can pass greater<> as the second template parameter —
set <int, greater<int> > reverseSet = { 10, 20, 30, 40, 50 };
After Initialization reverseSet will look like this —
| 50 | 40 | 30 | 20 | 10 |
Insertion
Insertion using insert() method
The main way to add elements to a set is the insert() method —
set <int> numbers;
numbers.insert(20);
numbers.insert(50);
numbers.insert(10);
numbers.insert(40);
numbers.insert(30);
After Insertion numbers will look like this —
| 10 | 20 | 30 | 40 | 50 |
Inserting a duplicate value
If we try to insert a value that already exists, the insertion is silently ignored —
numbers.insert(30);
numbers will still look like this —
| 10 | 20 | 30 | 40 | 50 |
💡 Quick tip on insert() return value — insert() actually returns a
pair<iterator, bool>. The bool tells us whether the insertion actually happened (true) or the value was already there (false). This is useful when we want to check for duplicates during insertion.auto result = numbers.insert(30); if (result.second) { cout << "Inserted successfully" << endl; } else { cout << "Already exists" << endl; }
Insertion using emplace() method
emplace() constructs the element in place, avoiding an extra copy —
numbers.emplace(60);
numbers.emplace(70);
After this, numbers will look like this —
| 10 | 20 | 30 | 40 | 50 | 60 | 70 |
Insertion using insert() and emplace_hint() with a position hint
Both insert() and emplace() have overloads that take an iterator as a position hint — a suggestion for where the new element should go —
set <int> numbers = { 10, 20, 40, 50 };
auto it = numbers.find(40); // points to 40
numbers.insert(it, 30); // hint: 30 belongs just before 40
numbers.emplace_hint(it, 45); // same idea, constructed in place
After this, numbers will look like this —
| 10 | 20 | 30 | 40 | 45 | 50 |
💡 Note on hints — The hint is only a suggestion, and the element still lands in its correct sorted position regardless. But if the hint points to the element just after where the new one belongs, the insertion drops from O(log n) to amortized O(1). This is genuinely useful when inserting many already-sorted elements one by one — passing the previous iterator as the hint each time keeps every insert cheap.
Inserting a range of values
We can also insert a range of values from another container —
vector <int> extra = { 80, 90, 100 };
numbers.insert(extra.begin(), extra.end());
After this, numbers will look like this —
| 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
Accessing the element
Set does not support [ ] or at(). Access happens through iterators returned by find(), or by iterating through the set directly.
find() method
find() returns an iterator to the element if it exists, otherwise it returns end() —
auto it = numbers.find(30);
if (it != numbers.end())
{
cout << "Found " << *it << endl;
}
else
{
cout << "Not found" << endl;
}
Output will be – Found 30
Checking if an element exists
count() method
Since set elements are unique, count() returns either 0 (not present) or 1 (present) —
if (numbers.count(30))
{
cout << "30 is in the set" << endl;
}
contains() method (C++20)
From C++20 onwards, we have a cleaner way —
if (numbers.contains(30))
{
cout << "30 is in the set" << endl;
}
equal_range() method
equal_range() returns a pair of iterators marking the range of elements equal to a value. Since a set holds each value at most once, this range is either empty (value absent) or contains exactly one element —
set <int> numbers = { 10, 20, 30, 40, 50 };
auto range = numbers.equal_range(30);
if (range.first != range.second)
{
cout << "Found " << *range.first << endl;
}
Output will be – Found 30
💡 Note on equal_range() in set vs multiset — In a set, equal_range() can cover at most one element, so find() is usually the simpler choice. It becomes genuinely useful in multiset, where the same value can appear many times and the range can hold all of them at once.
Iterating through a set
Range-based for loop
The cleanest way to iterate through a set is using a range-based for loop —
for (auto &value : numbers)
{
cout << value << " ";
}
Output will be – 10 20 30 40 50 60 70 80 90 100
Using iterator
We can also iterate using iterators directly —
set <int>::iterator it;
for (it = numbers.begin(); it != numbers.end(); it++)
{
cout << *it << " ";
}
💡 Note on iteration order — No matter the order in which we inserted the values, set always iterates them in sorted order (ascending by default). That is one of the most powerful features of set.
Deletion
Deletion using erase() with value
We can delete a specific element by passing the value directly to erase() —
numbers.erase(30);
After deletion, the element 30 will be removed from the set.
Deletion using erase() with iterator
We can also delete using an iterator —
auto it = numbers.find(40);
if (it != numbers.end())
{
numbers.erase(it);
}
Deletion using erase() with range
We can delete a range of elements by passing the start and end iterators —
// removing all elements from 60 onwards
numbers.erase(numbers.find(60), numbers.end());
erase_if() method (C++20)
From C++20 onwards, erase_if() is a non-member helper that removes all elements satisfying a predicate in a single pass, and returns how many were removed —
set <int> numbers = { 10, 20, 30, 40, 50 };
// remove all elements greater than 25
int removed = erase_if(numbers, [](int value) {
return value > 25;
});
cout << removed << " elements removed" << endl;
Output will be – 3 elements removed
After this, numbers will look like this —
| 10 | 20 |
clear() method
If we want to remove all elements of the set, we can use clear() —
numbers.clear();
After clear(), numbers will be empty.
Size and Empty
cout << numbers.size() << endl;
if (numbers.empty())
{
cout << "Set is empty" << endl;
}
size() returns the number of elements, and empty() returns true if the set has no elements.
Swapping
swap() method
STL set has a method named swap(), which swaps all the elements between two sets —
set <int> setOne = { 1, 2, 3 };
set <int> setTwo = { 10, 20, 30, 40, 50 };
setOne will look like this —
| 1 | 2 | 3 |
setTwo will look like this —
| 10 | 20 | 30 | 40 | 50 |
Now, let’s swap the elements of two sets —
// swapping values between two sets
setOne.swap(setTwo);
After Swapping setOne will look like this —
| 10 | 20 | 30 | 40 | 50 |
After Swapping setTwo will look like this —
| 1 | 2 | 3 |
Node Handle Operations (C++17)
From C++17 onwards, set supports node handle operations — extract() and merge(). These let us move elements between containers without copying them, and without allocating memory for them again.
extract() method
extract() removes an element from the set and returns its node handle — a small object that owns the underlying node. The node can later be inserted into another compatible container.
set <int> numbers = { 10, 20, 30, 40 };
auto node = numbers.extract(30);
if (!node.empty())
{
cout << "Extracted " << node.value() << endl;
}
Output will be – Extracted 30
After extract(), 30 is no longer in numbers. extract() comes in two flavors — pass a value to extract that element, or pass an iterator to extract the exact element the iterator points to.
💡 Why extract() is special — A node handle lets us change an element and re-insert it without the cost of allocating a brand-new node. For set and multiset the handle exposes the element through
node.value()(for map and multimap it is split intonode.key()andnode.mapped()instead). This is the only safe way to modify a key in place, since we normally cannot edit elements of a set directly.
merge() method
merge() takes another set (with the same comparator) and moves its elements into the calling set. Any element whose value already exists in the destination is left behind in the source — set never allows duplicates —
set <int> setOne = { 10, 20, 30 };
set <int> setTwo = { 30, 40, 50 };
setOne.merge(setTwo);
After merge(), setOne will look like this —
| 10 | 20 | 30 | 40 | 50 |
The 40 and 50 moved over, but the 30 — which setOne already had — stays behind in setTwo. So after the merge, setTwo still contains a single 30. This is the key difference from multiset’s merge, which would have kept the duplicate 30.
Set with pair as element
We briefly saw set< pair<int, int> > earlier in the declaration section. It is worth a closer look, because a set of pairs is genuinely useful — for example, storing unique (x, y) coordinates, or (score, id) records — and it works right out of the box with no extra setup —
set < pair<int, int> > points;
points.insert({ 3, 4 });
points.insert({ 1, 2 });
points.insert({ 1, 5 });
points.insert({ 1, 2 }); // duplicate point — silently dropped
After insertion, points is sorted and will look like this —
| (1, 2) | (1, 5) | (3, 4) |
Two things are happening here, and both come straight from set’s normal behaviour applied to pairs.
1. The pairs are sorted lexicographically. A set keeps its elements sorted, and for pairs “sorted” means lexicographic order — compare the .first values, and only when those tie, compare the .second values. That is why (1, 2) comes before (1, 5) (same first, smaller second), and both come before (3, 4) (larger first).
2. The duplicate pair is dropped. Just like a set of ints, a set of pairs keeps each element unique. The second {1, 2} is silently ignored, so points holds three pairs, not four. Two pairs are considered equal only when both their members match — so (1, 2) and (1, 5) are different elements, but (1, 2) and (1, 2) are the same.
Why no custom comparator is needed —
This is the key point. An ordered container like set only needs a way to compare two elements with <, and pair already ships with a built-in < operator (the lexicographic one described above). The same is true for tuple and string. So they all slot into a set with zero extra work.
💡 Contrast with unordered_set — This is exactly where set and unordered_set part ways. An unordered_set needs a hash function for its element type, and STL provides no default hash for
pair— sounordered_set< pair<int,int> >will not compile until we supply our own hash. A plain set has no such problem, because it sorts (needs<) instead of hashing (needs a hash). When in doubt with pair or tuple keys, set just works.
Accessing the members of a pair —
When we iterate, each element is a pair, so we reach its parts with .first and .second —
for (auto &p : points)
{
cout << "(" << p.first << ", " << p.second << ")" << " ";
}
Output will be – (1, 2) (1, 5) (3, 4)
Searching for a specific pair —
find() and count() work just as they do for ints — we just pass a whole pair —
if (points.count({ 1, 5 }))
{
cout << "(1, 5) is present" << endl;
}
auto it = points.find({ 3, 4 });
if (it != points.end())
{
cout << "Found (" << it->first << ", " << it->second << ")" << endl;
}
Output will be —
(1, 5) is present
Found (3, 4)
💡 A common use — sorting records by more than one field — Because pairs sort by first and then second, a set of pairs gives us automatic multi-level ordering for free. For example, a
set< pair<int, string> >of (age, name) keeps people ordered by age, and alphabetically by name within the same age. If we ever want a different rule — say, sort by the second member first — we can still pass a custom comparator as the second template parameter, exactly as we did withgreater<>for a descending set.
lower_bound() and upper_bound()
Since set elements are stored in sorted order, we can use lower_bound() and upper_bound() just like we do for sorted vectors.
lower_bound(value) — returns an iterator to the first element that is not less than the given value.
upper_bound(value) — returns an iterator to the first element that is strictly greater than the given value.
set <int> numbers = { 10, 20, 30, 40, 50 };
auto lb = numbers.lower_bound(30);
auto ub = numbers.upper_bound(30);
cout << *lb << endl;
cout << *ub << endl;
Output will be —
30
40
These are handy for range queries — for example, finding all values in the set between two bounds.
Accessing the smallest and largest element
Since set is always sorted, the smallest element is at begin() and the largest element is at prev(end()) or rbegin() —
set <int> numbers = { 40, 10, 50, 20, 30 };
// smallest element
cout << *numbers.begin() << endl;
// largest element
cout << *numbers.rbegin() << endl;
Output will be —
10
50
💡 Quick tip on min/max access — Unlike priority_queue (which only gives us the top element), set gives us O(log n) access to both the smallest and largest elements. This makes set a great choice when we need to repeatedly query both extremes of a collection.
Observers
key_comp() and value_comp()
These return the comparison object that the set uses to order its elements —
auto cmp = numbers.key_comp();
cout << cmp(10, 20) << endl; // is 10 ordered before 20?
Output will be – 1
For set, key_comp() and value_comp() return the same comparator, because the element is the key — there is no separate value the way there is in a map. By default this comparator is less<T>, which is exactly why elements come out in ascending order, and it is this same object we replace when we pass greater<> for a descending set.
💡 When do we actually use these? — Rarely in everyday code. key_comp() and value_comp() are mostly useful when writing generic code that should work with any ordered container, or when we want to reuse the container’s own ordering rule elsewhere instead of hard-coding it. They are the ordered-container counterpart to unordered_set’s
hash_function()andkey_eq().
Set vs unordered_set
Just like map has an unordered_map cousin, set has an unordered_set cousin. This is a common confusion, so it is worth a quick note —
| Property | set | unordered_set |
|---|---|---|
| Underlying data structure | Red-Black Tree | Hash Table |
| Order of elements | Sorted | No specific order |
| Insertion / Deletion / Search | O(log n) | O(1) average, O(n) worst |
| lower_bound() / upper_bound() | Available | Not available |
| Use when | We need sorted elements or ordered traversal | We just need fast lookup, order does not matter |
Time Complexity
| Operation | Description | Time Complexity |
|---|---|---|
| insert() | Insert an element | O(log n) |
| emplace() | Construct and insert an element in place | O(log n) |
| insert() with hint | Insert an element with a position hint | O(log n), amortized O(1) with a good hint |
| emplace_hint() | Construct and insert with a position hint | O(log n), amortized O(1) with a good hint |
| erase() | Remove an element by value or iterator | O(log n) |
| erase_if() | Remove all elements matching a predicate (C++20) | O(n) |
| extract() | Remove and return a node handle | O(log n) by value, O(1) amortized by iterator |
| merge() | Move elements from another set | O(n log n) |
| find() | Find an element | O(log n) |
| count() | Check if an element exists (returns 0 or 1) | O(log n) |
| contains() | Check if an element exists (C++20) | O(log n) |
| equal_range() | Returns a pair of iterators covering the value (0 or 1 element) | O(log n) |
| lower_bound() | Iterator to first value not less than given value | O(log n) |
| upper_bound() | Iterator to first value greater than given value | O(log n) |
| begin() / end() | Iterators to the beginning and end | O(1) |
| rbegin() / rend() | Reverse iterators (access largest, smallest) | O(1) |
| size() | Returns the number of elements | O(1) |
| empty() | Returns true if set is empty | O(1) |
| clear() | Removes all elements | O(n) |
| swap() | Swaps contents of two sets | O(1) |
| key_comp() / value_comp() | Returns the comparison object | O(1) |
Applications
-
-
-
- Removing duplicates from a collection of values
- Checking if an element has been seen before (visited-set in BFS/DFS)
- Maintaining a sorted collection that supports fast insertions and deletions
- Finding the k-th smallest or largest element via iterator advance
- Range queries with
lower_bound()andupper_bound() - Sliding window problems where we need both min and max of the current window
- Implementing set operations like union, intersection, and difference (with std::set_union, std::set_intersection, std::set_difference)
- Storing unique identifiers like user IDs, words in a dictionary, or distinct IP addresses
-
-
In the next tutorial we will talk about another useful STL container.
Until then
Happy Coding