Partition a collection into smaller collections. This article describes how to partition a collections into a given number of smaller collections.
1. Partition a Collection
Sometimes, you may want to split a collection into smaller collections for easier processing or analysis. This technique, known as partitioning, can be particularly useful in various scenarios, such as data processing, load balancing, or when implementing algorithms that require segmented data.
The following example demonstrates how to partition a collection in Java. The test code also calculates the number of elements that should go into each bucket when a fixed number of buckets is desired.
2. Partition Collection in Java
2.1. Implementation
Create a Java project named com.vogella.algorithms.partitioncollection
.
Create the following program.
package com.vogella.algorithms.partitioncollection;
import java.util.AbstractList;
import java.util.List;
public class MyPartition {
/**
* Returns consecutive {@linkplain List#subList(int, int) sublists} of a list,
* each of the same size (the final list may be smaller). For example,
* partitioning a list containing {@code [a, b, c, d, e]} with a partition
* size of 3 yields {@code [[a, b, c], [d, e]]} -- an outer list containing
* two inner lists of three and two elements, all in the original order.
*
* <p>The outer list is unmodifiable but reflects the latest state of the
* source list. The inner lists are sublist views of the original list,
* produced on demand using {@link List#subList(int, int)}, and are subject
* to all the usual caveats about modification as explained in that API.
*
* * Adapted from http://code.google.com/p/google-collections/
*
* @param list the list to return consecutive sublists of
* @param size the desired size of each sublist (the last may be
* smaller)
* @return a list of consecutive sublists
* @throws IllegalArgumentException if {@code partitionSize} is nonpositive
*/
public static <T> List<List<T>> partition(List<T> list, int size) {
if (list == null)
throw new NullPointerException("'list' must not be null");
if (!(size > 0))
throw new IllegalArgumentException("'size' must be greater than 0");
return new Partition<T>(list, size);
}
private static class Partition<T> extends AbstractList<List<T>> {
final List<T> list;
final int size;
Partition(List<T> list, int size) {
this.list = list;
this.size = size;
}
@Override
public List<T> get(int index) {
int listSize = size();
if (listSize < 0)
throw new IllegalArgumentException("negative size: " + listSize);
if (index < 0)
throw new IndexOutOfBoundsException("index " + index + " must not be negative");
if (index >= listSize)
throw new IndexOutOfBoundsException("index " + index + " must be less than size " + listSize);
int start = index * size;
int end = Math.min(start + size, list.size());
return list.subList(start, end);
}
@Override
public int size() {
return (list.size() + size - 1) / size;
}
@Override
public boolean isEmpty() {
return list.isEmpty();
}
}
}
2.2. Test
You can use the following JUnit 5 test to validate the result.
package com.vogella.algorithms.partitioncollection;
import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class MyPartitionTest {
@Test
public void partitionTest() {
List<String> list = new ArrayList<>();
list.add("one");
list.add("two");
list.add("three");
list.add("four");
list.add("five");
list.add("six");
list.add("seven");
list.add("eight");
list.add("nine");
list.add("ten");
list.add("eleven");
List<List<String>> partition = MyPartition.partition(list, 1);
assertEquals(11, partition.size());
assertEquals(1, partition.get(0).size());
partition = MyPartition.partition(list, 7);
assertEquals(2, partition.size());
assertEquals(7, partition.get(0).size());
assertEquals(4, partition.get(1).size());
// Determine how many elements should be placed in each bucket for a fixed number of buckets
int buckets = 3;
int divide = list.size() / buckets;
if (list.size() % buckets != 0) {
divide++;
}
System.out.println("Max. number of elements in each bucket: " + divide);
partition = MyPartition.partition(list, divide);
assertEquals(buckets, partition.size());
}
}
2.3. Performance Considerations
When partitioning collections, the time complexity is primarily O(N), where N is the number of elements in the original list. This is because the implementation iterates through the list to create sublists. Keep in mind that performance can vary based on the size of the collection and the partition size. For very large collections, consider the memory implications of creating multiple sublists.
2.4. Real-World Use Cases
Partitioning collections can be useful in several real-world scenarios, such as:
-
Data Processing: When handling large datasets, it can be efficient to process data in smaller batches.
-
Load Balancing: In distributed systems, partitioning data can help evenly distribute workloads across multiple servers.
-
Algorithm Implementation: Certain algorithms may require data to be divided into segments, allowing for more efficient processing or analysis.
By understanding how to partition collections, you can better manage and utilize your data in various applications.
3. Links and Literature
Nothing listed.
3.1. vogella Java example code
If you need more assistance we offer Online Training and Onsite training as well as consulting