A while back, as part of an interview, I was asked to explain what a given code sample did and how it could be improved. I am posting it here for fun.

What is findNeedles() and what does it do?

The findNeedles() method below counts how many times a word appears in a string of words. The user provides an array of words (needles) as well as the string (haystack). The method then loops through the array to see how many times each word in the array appears in the string.

Here is the code:

public static void findNeedles(String haystack, String[] needles) {
    if (needles.length > 5) {
        System.out.println("Too many words!");
    } else {
        int[] countArray = new int[needles.length];
        for (int i = 0; i < needles.length; i++) {
            String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
            for (int j = 0; j < words.length; j++) {
                if (words[j].compareTo(needles[i]) == 0) {
                    countArray[i]++;
                }
            }
        }
        for (int j = 0; j < needles.length; j++) {
            System.out.println(needles[j] + ": " + countArray[j]);
        }
    }
}

The method does the following:

  1. Checks that the user is providing no more than 5 needles. The test is needles.length > 5, so exactly 5 needles are allowed. If there are more, it prints "Too many words!" and does nothing else.
  2. Creates a new array of integers (countArray) whose length equals the length of needles. This creates a counter for each element in our needles array.
  3. Splits the haystack string at every space, double quote, single quote, tab, newline, backspace, form feed, and carriage return, and assigns the result to a new String array (words), which holds each word in the original haystack. Note that \b inside a Java string literal is the backspace character, not the regex word-boundary anchor.
  4. Compares each element in the needles array to each element of the words array. If a “needle” is found in the words array, the corresponding counter in countArray is incremented.
  5. Prints out each element in needles along with the number of times it occurred in haystack.
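
Step 3 is the easiest one to misread, so here is a minimal sketch of what the split produces. The class name SplitDemo, the helper splitWords, and the sample sentence are illustrative assumptions; only the delimiter pattern comes from the code above.

```java
import java.util.Arrays;

class SplitDemo {
    // Same delimiter set as findNeedles(): space, double quote, single quote,
    // tab, newline, backspace, form feed, and carriage return.
    static String[] splitWords(String haystack) {
        return haystack.split("[ \"\'\t\n\b\f\r]", 0);
    }

    public static void main(String[] args) {
        // Note the tab and the two consecutive spaces in the sample input.
        String[] words = splitWords("the cat\tsat  on the mat");
        System.out.println(Arrays.toString(words)); // prints "[the, cat, sat, , on, the, mat]"
    }
}
```

Two consequences are worth noticing: consecutive delimiters produce empty strings in words, and punctuation such as a period is not a delimiter, so "mat." would not match the needle "mat".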

How can this code be improved?

There are several ways to improve the code. First, this method is very difficult to test because it returns nothing. If you returned countArray instead of simply printing it out, it would be easier to test.

public static int[] findNeedles(String haystack, String[] needles) {
    // One counter per needle; returned so callers and tests can inspect it.
    int[] countArray = new int[needles.length];
    if (needles.length > 5) {
        System.err.println("Too many words!");
    } else {
        for (int i = 0; i < needles.length; i++) {
            String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
            for (int j = 0; j < words.length; j++) {
                if (words[j].compareTo(needles[i]) == 0) {
                    countArray[i]++;
                }
            }
        }
        for (int j = 0; j < needles.length; j++) {
            System.out.println(needles[j] + ": " + countArray[j]);
        }
    }
    return countArray;
}
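
Because this version returns the counts, a caller can check them directly instead of reading console output. Here is a minimal sketch of such a check, assuming a hypothetical wrapper class NeedleCounter and a made-up sample sentence (neither is from the original exercise). The printing loop is left out to keep it short.

```java
import java.util.Arrays;

class NeedleCounter {
    // Same counting logic as the version above, minus the printing loop.
    static int[] findNeedles(String haystack, String[] needles) {
        int[] countArray = new int[needles.length];
        if (needles.length > 5) {
            System.err.println("Too many words!");
        } else {
            for (int i = 0; i < needles.length; i++) {
                String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
                for (int j = 0; j < words.length; j++) {
                    if (words[j].compareTo(needles[i]) == 0) {
                        countArray[i]++;
                    }
                }
            }
        }
        return countArray;
    }

    public static void main(String[] args) {
        int[] counts = findNeedles("the cat sat on the mat",
                new String[] {"the", "cat", "dog"});
        // Compare the returned counts against the expected values.
        if (!Arrays.equals(counts, new int[] {2, 1, 0})) {
            throw new AssertionError("unexpected counts: " + Arrays.toString(counts));
        }
        System.out.println(Arrays.toString(counts)); // prints "[2, 1, 0]"
    }
}
```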

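You should also throw an exception instead of printing out a message when too many needles are passed. That way the caller can detect and handle the failure instead of receiving an array of zeros.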

public static int[] findNeedles(String haystack, String[] needles) {
    int[] countArray = new int[needles.length];
    if (needles.length > 5) {
        // Fail loudly so the caller cannot mistake this for a valid result.
        throw new RuntimeException("Must provide no more than 5 needles.");
    } else {
        for (int i = 0; i < needles.length; i++) {
            String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
            for (int j = 0; j < words.length; j++) {
                if (words[j].compareTo(needles[i]) == 0) {
                    countArray[i]++;
                }
            }
        }
        for (int j = 0; j < needles.length; j++) {
            System.out.println(needles[j] + ": " + countArray[j]);
        }
    }
    return countArray;
}
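
One benefit of throwing is that the caller can observe the failure in code. The sketch below isolates just the size check in a hypothetical helper, checkNeedleCount, inside a hypothetical class GuardDemo. Both names and the exception message are assumptions made for illustration. Since the test is needles.length > 5, five needles pass and six are rejected.

```java
class GuardDemo {
    // Hypothetical helper: only the size check from findNeedles() is shown.
    static void checkNeedleCount(String[] needles) {
        if (needles.length > 5) {
            throw new RuntimeException("Must provide no more than 5 needles.");
        }
    }

    // Returns true if the guard rejects the given needles.
    static boolean isRejected(String[] needles) {
        try {
            checkNeedleCount(needles);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(new String[] {"a", "b", "c", "d", "e"}));      // prints "false"
        System.out.println(isRejected(new String[] {"a", "b", "c", "d", "e", "f"})); // prints "true"
    }
}
```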

Additionally, you can split the string (haystack) outside of the loop. This way, the split occurs once instead of once per needle, so the method no longer repeats the same work or allocates a new words array on every iteration.

public static int[] findNeedles(String haystack, String[] needles) {
    int[] countArray = new int[needles.length];
    if (needles.length > 5) {
        throw new RuntimeException("Must provide no more than 5 needles.");
    } else {
        // Split once, before the loop, and reuse the result for every needle.
        String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
        for (int i = 0; i < needles.length; i++) {
            for (int j = 0; j < words.length; j++) {
                if (words[j].compareTo(needles[i]) == 0) {
                    countArray[i]++;
                }
            }
        }
        for (int j = 0; j < needles.length; j++) {
            System.out.println(needles[j] + ": " + countArray[j]);
        }
    }
    return countArray;
}

Finally, instead of looping over the entire haystack once per needle, a more efficient approach is to store the needles as keys in a HashMap and run through the haystack once, incrementing the value associated with the key (in this case, the needle) each time an instance is found.

// Imports needed at the top of the file:
import java.util.HashMap;
import java.util.Map;

public static Map<String, Integer> findNeedles(String haystack, String[] needles) {
    if (needles.length > 5) {
        throw new RuntimeException("Must provide no more than 5 needles.");
    } else {
        Map<String, Integer> countHash = new HashMap<String, Integer>();
        String[] words = haystack.split("[ \"\'\t\n\b\f\r]", 0);
        // Seed the map with every needle so absent needles report 0.
        for (int i = 0; i < needles.length; i++) {
            countHash.put(needles[i], 0);
        }
        // Single pass over the haystack: count only words that are needles.
        for (String word : words) {
            if (countHash.containsKey(word)) {
                countHash.put(word, countHash.get(word) + 1);
            }
        }
        // Print each needle and its count once (HashMap order is unspecified).
        for (Map.Entry<String, Integer> entry : countHash.entrySet()) {
            String key = entry.getKey();
            int wordCount = entry.getValue();
            System.out.println(key + ": " + wordCount);
        }
        return countHash;
    }
}
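
As a variation on the map-based version, here is a compact sketch that uses Map.merge for the increment and a LinkedHashMap so the results come back in the order the needles were given. The class name NeedleMap and the sample inputs are assumptions, and swapping HashMap for LinkedHashMap is a choice made for this sketch, not part of the original exercise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NeedleMap {
    static Map<String, Integer> findNeedles(String haystack, String[] needles) {
        if (needles.length > 5) {
            throw new RuntimeException("Must provide no more than 5 needles.");
        }
        // LinkedHashMap keeps the needles in insertion order (an assumption of this sketch).
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String needle : needles) {
            counts.put(needle, 0);
        }
        // Single pass over the haystack; only words that are needles are counted.
        for (String word : haystack.split("[ \"\'\t\n\b\f\r]", 0)) {
            if (counts.containsKey(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                findNeedles("the cat sat on the mat", new String[] {"the", "cat", "dog"});
        // Prints "the: 2", "cat: 1", "dog: 0", one per line.
        counts.forEach((needle, count) -> System.out.println(needle + ": " + count));
    }
}
```

With n needles and m words, the nested loops in the earlier versions do roughly n × m comparisons, while this approach does about n map insertions plus m lookups.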