
Bucket Sort is a sorting algorithm that is especially useful when the data is uniformly distributed across a known range. Instead of comparing elements repeatedly, it assigns values to “buckets” based on their range and sorts within those smaller groups.
While comparisons are still used inside each bucket, the algorithm reduces overall work by limiting sorting to localized subsets of the data. For this reason, Bucket Sort is best understood as a distribution-based sorting technique.
Its efficiency depends heavily on how evenly the input data is distributed. When values are spread uniformly, it can achieve an average time complexity of O(n + k), where n is the number of elements and k is the number of buckets. Uneven distributions can lead to imbalanced buckets and reduced performance.
In this article, we will explore how Bucket Sort works, implement it in Java, and discuss when it is most effective.
What Is Bucket Sort? 🧺
Bucket Sort is a sorting algorithm that divides input elements into a number of buckets, each representing a specific value range. A bucket is simply a temporary container (such as a list or array) used to group elements that fall within the same range.
Once all elements are distributed into their respective buckets, each bucket is sorted individually using a suitable sorting method. Finally, the sorted buckets are combined (concatenated) to produce the fully sorted output.

This approach is especially efficient when the input values are evenly distributed across a known range, because each bucket contains only a small number of elements, making them quick to sort. However, if the data is unevenly distributed, some buckets may become overloaded while others remain nearly empty, which can reduce overall efficiency.
When Should You Use Bucket Sort? 🧐
Bucket Sort is most effective when input values are evenly distributed over a known range. In such cases, elements are spread fairly evenly across buckets, keeping each bucket small and ensuring efficient local sorting.
Good example: Sorting 1 million floating-point numbers uniformly distributed between 0 and 1. Each bucket receives roughly the same number of elements, keeping the workload balanced.
Bad example: Sorting values where 99% of the data lies between 0 and 0.1, while the rest is spread across 0.1 to 1.0. One bucket becomes overloaded, and performance degrades significantly.
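To make the contrast concrete, here is a small sketch (the class and method names are my own, not from any library) that counts how many elements land in each bucket for a uniform versus a skewed input, using the common `index = (int)(value * n)` mapping:

```java
import java.util.*;

public class BucketBalanceDemo {
    // Count how many values land in each of n buckets for index = (int)(value * n).
    static int[] occupancy(double[] values, int n) {
        int[] counts = new int[n];
        for (double v : values) {
            // Math.min guards against v == 1.0, which would map to index n
            counts[Math.min((int) (v * n), n - 1)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 10;
        Random rnd = new Random(42);
        double[] uniform = new double[1000];
        double[] skewed = new double[1000];
        for (int i = 0; i < 1000; i++) {
            uniform[i] = rnd.nextDouble();        // evenly spread over [0, 1)
            skewed[i] = rnd.nextDouble() * 0.1;   // all values squeezed into [0, 0.1)
        }
        System.out.println("Uniform: " + Arrays.toString(occupancy(uniform, n)));
        System.out.println("Skewed:  " + Arrays.toString(occupancy(skewed, n)));
        // The skewed run puts all 1000 elements into bucket 0.
    }
}
```

With ten buckets, the uniform input gives each bucket roughly 100 elements, while the skewed input piles all 1000 into bucket 0, which is exactly the bottleneck described above.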
From Theory to Practice 🛠️
To better understand how Bucket Sort works in practice, let’s look at a Java implementation that demonstrates the full process.
1. Core Implementation
import java.util.*;

public class Main {
    public static void bucketSort(double[] arr) {
        int n = arr.length;
        if (n == 0) return;

        // Step 1: Create one bucket per element
        List<List<Double>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }

        // Step 2: Distribute elements into buckets by value range.
        // Math.min guards against value == 1.0, which would map to index n.
        for (double value : arr) {
            int index = Math.min((int) (value * n), n - 1);
            buckets.get(index).add(value);
        }

        // Step 3: Sort each bucket using Java's built-in sort()
        for (List<Double> bucket : buckets) {
            Collections.sort(bucket);
        }

        // Step 4: Concatenate buckets back into the original array
        int idx = 0;
        for (List<Double> bucket : buckets) {
            for (double value : bucket) {
                arr[idx++] = value;
            }
        }
    }

    public static void main(String[] args) {
        double[] arr = {0.42, 0.32, 0.23, 0.52, 0.25, 0.47};
        bucketSort(arr);
        for (double num : arr) {
            System.out.print(num + " ");
        }
    }
}
This may look similar to divide-and-conquer algorithms, but the key difference is how data is grouped before sorting even begins. Instead of repeatedly splitting the array, Bucket Sort organizes values by range, allowing each group to be handled efficiently.
2. Why Bucket Sort Can Be So Fast
The strength of Bucket Sort comes from reducing unnecessary work. When the input is uniformly distributed, elements are spread evenly across buckets, meaning each bucket contains only a few items.
In this ideal case, sorting within each bucket becomes trivial, and overall performance approaches O(n).
Unlike Quick Sort or Merge Sort, performance here is not driven by comparisons, but by how evenly the data is distributed.
3. The Trade-Off: Memory and Distribution
This performance gain comes at the cost of additional memory for storing buckets, making it less space-efficient than in-place algorithms.
More importantly, its efficiency depends heavily on how evenly data is distributed across buckets.
If many values fall into the same bucket, that bucket becomes a bottleneck, and the algorithm effectively degrades to the performance of the sorting method used inside it.
4. Flexibility in Implementation
One of the most interesting aspects of Bucket Sort is its flexibility. The algorithm does not require a fixed sorting method inside each bucket — the choice can be adapted based on bucket size and data characteristics.
For small buckets, simple algorithms like Insertion Sort are often ideal due to their low overhead and strong performance on nearly sorted data.
For larger buckets, more efficient algorithms like Quick Sort or Merge Sort are typically better choices, as they scale more effectively.
In practice, many implementations use a hybrid approach — for example, using Insertion Sort for small buckets and switching to a more advanced algorithm when a bucket exceeds a size threshold (e.g., 10–20 elements).
This makes Bucket Sort more of a framework than a fixed algorithm, allowing it to be tuned for different scenarios and datasets.
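The hybrid strategy described above can be sketched as follows. The `THRESHOLD` value and all names here are illustrative choices on my part, not a standard API; small buckets get a hand-written Insertion Sort, larger ones fall back to `Collections.sort`:

```java
import java.util.*;

public class HybridBucketSort {
    static final int THRESHOLD = 16; // illustrative cutoff; tune per dataset

    public static void bucketSort(double[] arr) {
        int n = arr.length;
        if (n <= 1) return;

        List<List<Double>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) buckets.add(new ArrayList<>());

        // Distribute; Math.min guards against value == 1.0
        for (double v : arr) {
            buckets.get(Math.min((int) (v * n), n - 1)).add(v);
        }

        int idx = 0;
        for (List<Double> bucket : buckets) {
            if (bucket.size() <= THRESHOLD) {
                insertionSort(bucket);     // low overhead for tiny buckets
            } else {
                Collections.sort(bucket);  // O(m log m) fallback for large buckets
            }
            for (double v : bucket) arr[idx++] = v;
        }
    }

    // Simple insertion sort; fast when the list is small or nearly sorted.
    static void insertionSort(List<Double> list) {
        for (int i = 1; i < list.size(); i++) {
            double key = list.get(i);
            int j = i - 1;
            while (j >= 0 && list.get(j) > key) {
                list.set(j + 1, list.get(j));
                j--;
            }
            list.set(j + 1, key);
        }
    }

    public static void main(String[] args) {
        double[] arr = {0.42, 0.32, 0.23, 0.52, 0.25, 0.47};
        bucketSort(arr);
        System.out.println(Arrays.toString(arr));
    }
}
```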
5. Bucket Sort with Objects
Bucket Sort can also be applied to objects by mapping them into buckets based on a key value.
Here is an example using objects sorted by a normalized score:
import java.util.*;

class Student {
    String name;
    double score; // between 0 and 1

    Student(String name, double score) {
        this.name = name;
        this.score = score;
    }

    @Override
    public String toString() {
        return name + " (" + score + ")";
    }
}

public class Main {
    public static void main(String[] args) {
        List<Student> students = List.of(
            new Student("Alice", 0.78),
            new Student("Bob", 0.65),
            new Student("Charlie", 0.92),
            new Student("David", 0.70)
        );
        System.out.println("Before sorting: " + students);
        List<Student> sorted = bucketSort(students);
        System.out.println("After sorting: " + sorted);
    }

    public static List<Student> bucketSort(List<Student> list) {
        int n = list.size();
        List<List<Student>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }

        // Distribute by score; Math.min guards against score == 1.0
        for (Student s : list) {
            int index = Math.min((int) (s.score * n), n - 1);
            buckets.get(index).add(s);
        }

        // Sort each bucket by score
        for (List<Student> bucket : buckets) {
            bucket.sort(Comparator.comparingDouble(s -> s.score));
        }

        // Concatenate buckets in order
        List<Student> result = new ArrayList<>();
        for (List<Student> bucket : buckets) {
            result.addAll(bucket);
        }
        return result;
    }
}
Each student is placed into a bucket based on their score, and each bucket is sorted using a comparator.
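The same idea generalizes beyond `Student`. Here is a sketch of a generic bucket sort that accepts any key function mapping elements into [0, 1); the class name and the first-letter key used in `main` are hypothetical illustrations, not part of any standard library:

```java
import java.util.*;
import java.util.function.ToDoubleFunction;

public class GenericBucketSort {
    // Bucket-sorts any list given a key function that maps elements into [0, 1).
    public static <T> List<T> bucketSort(List<T> list, ToDoubleFunction<T> key) {
        int n = list.size();
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) buckets.add(new ArrayList<>());

        // Distribute; Math.min guards against a key of exactly 1.0
        for (T item : list) {
            int index = Math.min((int) (key.applyAsDouble(item) * n), n - 1);
            buckets.get(index).add(item);
        }

        // Sort each bucket by key, then concatenate
        List<T> result = new ArrayList<>(n);
        for (List<T> bucket : buckets) {
            bucket.sort(Comparator.comparingDouble(key));
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = List.of("pear", "fig", "apple", "kiwi");
        // Hypothetical key: normalize the first letter ('a' -> 0.0 ... 'z' -> ~0.96)
        List<String> sorted = bucketSort(words, w -> (w.charAt(0) - 'a') / 26.0);
        System.out.println(sorted); // sorted by first letter
    }
}
```

Passing the key function in this way keeps the distribution step and the per-bucket comparator consistent with each other, which is the main correctness requirement when bucketing objects.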
Where Bucket Sort Becomes Useful 💡
Bucket Sort is especially powerful in systems where data is naturally partitionable by range or key, and where processing can be split across multiple workers. Its main strength is not just sorting speed, but the ability to divide data into independent chunks that can be processed in parallel and later combined into a final result.
1. Distributed Log Processing and Analytics
In large-scale distributed systems such as logging platforms, monitoring tools, or observability pipelines, enormous volumes of data are continuously generated across many services. A single system might receive millions of latency measurements per minute from different servers.
Instead of sorting all this data in one place, a more scalable approach is to group values into buckets based on ranges, such as response time intervals. For example, latencies might be divided into 0–100ms, 100–200ms, 200–500ms, and 500ms+. This immediately transforms a large, unstructured dataset into smaller, structured groups that can be processed independently.
Once the data is partitioned, each bucket can be processed independently — for example, sorted locally — after which the buckets can be combined to produce a globally ordered result. In a real distributed system, these buckets could even be handled in parallel by different threads or machines.
Here is a simplified Java example that simulates this kind of distributed-style log processing:
import java.util.*;

public class Main {
    public static void main(String[] args) {
        int[] latencies = {
            12, 45, 120, 180, 250, 320, 410, 480,
            55, 75, 95, 110, 190, 210, 330, 490
        };

        List<List<Integer>> buckets = List.of(
            new ArrayList<>(), // 0–100
            new ArrayList<>(), // 100–200
            new ArrayList<>(), // 200–500
            new ArrayList<>()  // 500+
        );

        // Step 1: Distribute into buckets
        for (int value : latencies) {
            if (value <= 100) buckets.get(0).add(value);
            else if (value <= 200) buckets.get(1).add(value);
            else if (value <= 500) buckets.get(2).add(value);
            else buckets.get(3).add(value);
        }

        // Step 2: Sort each bucket
        for (List<Integer> bucket : buckets) {
            Collections.sort(bucket);
        }

        // Step 3: Concatenate buckets
        List<Integer> sorted = new ArrayList<>();
        for (List<Integer> bucket : buckets) {
            sorted.addAll(bucket);
        }
        System.out.println(sorted);
    }
}
// Output:
// [12, 45, 55, 75, 95, 110, 120, 180, 190, 210, 250, 320, 330, 410, 480, 490]
What this demonstrates is a simplified version of how distributed observability systems can work internally: data is first grouped by range, then each group is processed (and sorted) independently, and finally the results are combined into a globally ordered dataset.
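Since each bucket is an independent list, the per-bucket sorting step really can run on separate worker threads. Here is a rough single-process sketch of that idea using Java's `parallelStream` (the class and method names are my own); it is thread-safe here because each task sorts a distinct list:

```java
import java.util.*;
import java.util.stream.*;

public class ParallelBucketDemo {
    public static List<Integer> sortByRanges(int[] latencies) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < 4; i++) buckets.add(new ArrayList<>());

        // Step 1: distribute sequentially (0–100, 100–200, 200–500, 500+)
        for (int v : latencies) {
            if (v <= 100) buckets.get(0).add(v);
            else if (v <= 200) buckets.get(1).add(v);
            else if (v <= 500) buckets.get(2).add(v);
            else buckets.get(3).add(v);
        }

        // Step 2: sort each bucket on a separate worker thread
        buckets.parallelStream().forEach(Collections::sort);

        // Step 3: concatenate in bucket order for a globally sorted result
        return buckets.stream().flatMap(List::stream).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[] latencies = {12, 45, 120, 180, 250, 320, 410, 480,
                           55, 75, 95, 110, 190, 210, 330, 490};
        System.out.println(sortByRanges(latencies));
    }
}
```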
2. Distributed Data Sharding by Key Range
A very similar idea appears in distributed databases and large-scale storage systems, but instead of analytics values, the system partitions keys or identifiers across multiple nodes.
Imagine a system where user IDs range from 0 to 1 billion. Storing and processing all users on a single machine would be inefficient and unscalable. Instead, the system divides the key space into fixed ranges and assigns each range to a different node. One node might handle IDs from 0–250 million, another from 250–500 million, and so on.
Each node is responsible for only its own portion of the data. This means queries, updates, and even internal sorting operations happen locally within that shard. When a request comes in, the system can directly route it to the correct node based on the ID range, avoiding unnecessary lookups elsewhere.
In its basic form, this approach focuses on partitioning rather than sorting. However, when each shard is sorted locally and the results are merged, it closely mirrors the behavior of Bucket Sort.
Here is a simplified Java example that simulates how data might be distributed across shards in such a system:
import java.util.*;

public class Main {
    static class User {
        int id;
        String name;

        User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public String toString() {
            return "(" + id + ", " + name + ")";
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User(50, "Alice"),
            new User(300_000_000, "Bob"),
            new User(600_000_000, "Charlie"),
            new User(900_000_000, "David"),
            new User(120, "Eve")
        );

        List<List<User>> shards = List.of(
            new ArrayList<>(), // 0–250M
            new ArrayList<>(), // 250M–500M
            new ArrayList<>(), // 500M–750M
            new ArrayList<>()  // 750M–1B
        );

        // Step 1: Distribute users into shards by ID range
        for (User u : users) {
            if (u.id <= 250_000_000) shards.get(0).add(u);
            else if (u.id <= 500_000_000) shards.get(1).add(u);
            else if (u.id <= 750_000_000) shards.get(2).add(u);
            else shards.get(3).add(u);
        }

        // Step 2: Sort each shard locally
        for (List<User> shard : shards) {
            shard.sort(Comparator.comparingInt(u -> u.id));
        }

        // Step 3: Merge shards in range order
        List<User> sorted = new ArrayList<>();
        for (List<User> shard : shards) {
            sorted.addAll(shard);
        }
        System.out.println(sorted);
    }
}
// Output:
// [(50, Alice), (120, Eve), (300000000, Bob), (600000000, Charlie), (900000000, David)]
This reflects how real distributed systems scale: instead of processing everything globally, they divide the dataset into ranges and let each node handle its own segment independently. When combined with local sorting and ordered merging, this approach effectively mirrors Bucket Sort at scale.
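The routing step itself can be sketched in a few lines. `SHARD_SIZE` and the class name are illustrative, and note that this division-based scheme places exact boundary IDs (e.g. 250,000,000) slightly differently than the `<=` checks in the example above:

```java
public class ShardRouter {
    static final long SHARD_SIZE = 250_000_000L; // 4 shards covering IDs 0 to 1 billion

    // Routes a user ID directly to its shard index without consulting other nodes.
    static int shardFor(long userId) {
        return (int) Math.min(userId / SHARD_SIZE, 3);
    }

    public static void main(String[] args) {
        System.out.println(shardFor(50));            // shard 0
        System.out.println(shardFor(300_000_000L));  // shard 1
        System.out.println(shardFor(900_000_000L));  // shard 3
    }
}
```

Because the shard index is computed directly from the key, a request never needs a global lookup; this constant-time routing is the same property that makes the bucket-distribution step of Bucket Sort linear.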
Conclusion 📣
Bucket Sort stands out not because it eliminates comparisons entirely, but because it changes where the work happens. By distributing elements into value-based buckets first, it reduces the complexity of sorting from a global problem into a set of smaller, more manageable local ones.
When data is uniformly distributed, this approach can be extremely efficient, achieving near-linear performance and outperforming many comparison-based algorithms. However, this strength is also its limitation — performance depends heavily on how well the data fits the chosen bucket ranges.
Beyond academic examples, Bucket Sort offers an important perspective for real-world system design. Its core idea — partition first, process locally, then combine results — appears in distributed systems, data pipelines, and large-scale storage architectures. In these contexts, it becomes less about sorting itself and more about scalability, parallelism, and efficient data organization.
Ultimately, Bucket Sort is best viewed not just as a single algorithm, but as a strategy. When applied in the right context, it can transform both how data is sorted and how systems are designed to handle it.
Happy coding! 😃
🙏 Thanks for reading to the end! If you have any questions, feel free to drop a comment below.
If you enjoyed this article, follow me on Medium or social media — I’d love to connect and follow you back.