Understanding the Problem
The problem statement is as follows: Given an array of words, return the k most frequent words. If there are multiple answers, return answers in their lexicographical order
The Code
We’ll break down the code into sections to understand how it works.
Custom Data Structure: Frequency
First, we define a custom class called Frequency
to hold each word’s frequency and the word itself. This class allows us to associate a word with its frequency in a clean and organized way.
class Frequency {
int frequency;
String word;
Frequency(String w, int f) {
frequency = f;
word = w;
}
}
Custom Comparator: CustomComparator
Next, we create a custom comparator called CustomComparator
. This comparator will help us define how the elements in our priority queue (heap) should be compared and ordered. In this case, we want to prioritize words with higher frequencies. If two words have the same frequency, we order them lexicographically (alphabetically).
class CustomComparator implements Comparator<Frequency> {
public int compare(Frequency a, Frequency b) {
if (a.frequency != b.frequency) {
return b.frequency - a.frequency;
}
return a.word.compareTo(b.word);
}
}
The topKFrequent
Method
The core of the solution lies in the topKFrequent
method. Here’s a step-by-step breakdown of how it works:
- We initialize a HashMap (
map
) to store each word’s frequency. - We iterate through the input array of words and populate the
map
. If a word is already in themap
, we increment its frequency; otherwise, we add it to themap
with a frequency of 1. - We create a PriorityQueue (
queue
) to storeFrequency
objects. We use the custom comparatorCustomComparator
to order the elements in the queue. This will ensure that words with higher frequencies are at the front of the queue. - We iterate through the entries in the
map
and add eachFrequency
object to thequeue
. - Finally, we create a list (
ans
) to store the top k frequent words. We repeatedly poll thequeue
(which always contains the most frequent word at the front) and add the word to theans
list until we have retrieved k words.
Here’s the complete code:
class Frequency {
int frequency;
String word;
Frequency(String w, int f) {
frequency = f;
word = w;
}
}
class CustomComparator implements Comparator<Frequency> {
public int compare(Frequency a, Frequency b)
{
if (a.frequency != b.frequency){
return b.frequency - a.frequency;
}
return a.word.compareTo(b.word);
}
}
class Solution {
public List<String> topKFrequent(String[] words, int k) {
HashMap<String, Integer> map = new HashMap<String, Integer>();
for(int i=0;i<words.length;i++) {
if(map.containsKey(words[i])) {
map.put(words[i], map.get(words[i]) + 1);
} else {
map.put(words[i], 1);
}
}
PriorityQueue<Frequency> queue = new PriorityQueue<Frequency>(new CustomComparator());
for (Map.Entry<String,Integer> entry : map.entrySet()) {
queue.add(new Frequency(entry.getKey(), entry.getValue()));
}
List<String> ans = new ArrayList<String>();
for(int i=0;i<k;i++) {
ans.add(queue.poll().word);
}
return ans;
}
}
Conclusion
In this blog, we’ve explored a Java solution to the “Top K Frequent Words” problem using a combination of data structures, including HashMap and PriorityQueue. By creating a custom comparator, we’ve ensured that the most frequent words are extracted efficiently. This solution can be a valuable tool in various natural language processing tasks and can help you gain insights into textual data by identifying the most relevant and frequently occurring words.
Problem Statement: 692. Top K Frequent Words