I often find myself in a position where I have a collection of objects and I need to determine if a particular element exists in the list. In the past (versions of C# prior to 2.0) whenever I had a list I knew I was going to have to search through I would make use of a Dictionary object. The Dictionary object provides built-in very fast searching. In my pre-.Net life I was a VB6 developer and was always forced to write my own quick sort and binary search routine for such a task. Fortunately for us Microsoft has seen the value and need of making this available in the framework.
Aren’t Dictionaries Overkill?
The difference between a generic List however, and a Dictionary are that the Dictionary requires a name and a value for each item in your collection. What if you only want a list of values? I have to admit I have made up some “names” just to appease the requirements of the Dictionary object so I can add my values and take advantage of the great searching capabilities already provided by the dictionary object. Whenever I do this though it never quite sits right with me. No, I’m not worried about memory overhead. I haven’t even checked memory usage but that’s because there really wouldn’t be any point with the amount of memory available on even relatively older machines compared to the amount of memory a few thousand integers would take. No, the nagging feeling has always been that it’s a work around. I’m using a screw driver to do the job of a hammer in some ways and it doesn’t sit right with me.
Let’s do it right
One day I decided enough was enough. I was going to figure out a way to use built-in framework methods to do searching on a generic list. Looking at the intellisense for an instantiated List<string> showed there were a few methods that might do exactly what I needed. (In retrospect I do believe these are methods I had seen and tried to use before but without any luck. You may see here shortly why I gave up the first time I tried, it is certainly not implemented in an intuitive way). Those methods are Find, FindIndex, and FindAll. That might not be a complete list but these three all work the same and so any others may very well as well. I’m also not going to show how to do this on all three methods since, as stated earlier, they work the same.
Searching for the answer
My quest to begin using these methods started with the intellisense. I looked at the parameters for all of the overloads on the Find method. I noticed a recurring parameter it expected was a “predicate” which excuse my ignorance but I had no idea what that was. Since each of the overloads required a predicate, and I didn’t know what one was, I knew my searching through the intellisense quest was over. I needed an example of how to use this. So I did a Google search, something like [Generic List Find] (without the brackets). The first result was to this MSDN article http://msdn2.microsoft.com/en-us/library/x0b5b5bc.aspx . Below is the relevant code excerpt from the article.
[code lang="c#"]
using System;
using System.Collections.Generic;
public class Example
{
public static void Main()
{
List dinosaurs = new List();
dinosaurs.Add("Compsognathus");
dinosaurs.Add("Amargasaurus");
dinosaurs.Add("Oviraptor");
dinosaurs.Add("Velociraptor");
dinosaurs.Add("Deinonychus");
dinosaurs.Add("Dilophosaurus");
dinosaurs.Add("Gallimimus");
dinosaurs.Add("Triceratops");
Console.WriteLine();
foreach(string dinosaur in dinosaurs)
{
Console.WriteLine(dinosaur);
}
Console.WriteLine("nTrueForAll(EndsWithSaurus): {0}",
dinosaurs.TrueForAll(EndsWithSaurus));
Console.WriteLine("nFind(EndsWithSaurus): {0}",
dinosaurs.Find(EndsWithSaurus));
Console.WriteLine("nFindLast(EndsWithSaurus): {0}",
dinosaurs.FindLast(EndsWithSaurus));
Console.WriteLine("nFindAll(EndsWithSaurus):");
List sublist = dinosaurs.FindAll(EndsWithSaurus);
foreach(string dinosaur in sublist)
{
Console.WriteLine(dinosaur);
}
Console.WriteLine(
"n{0} elements removed by RemoveAll(EndsWithSaurus).",
dinosaurs.RemoveAll(EndsWithSaurus));
Console.WriteLine("nList now contains:");
foreach(string dinosaur in dinosaurs)
{
Console.WriteLine(dinosaur);
}
Console.WriteLine("nExists(EndsWithSaurus): {0}",
dinosaurs.Exists(EndsWithSaurus));
}
// Search predicate returns true if a string ends in "saurus".
private static bool EndsWithSaurus(String s)
{
if ((s.Length &gt; 5) &amp;&amp;
(s.Substring(s.Length - 6).ToLower() == "saurus"))
{
return true;
}
else
{
return false;
}
}
}
[/code]
I won’t go into detail on how this code works. Please visit the article directly for that. But I will point out one big problem with this example. The method that tells Find how to determine if it has ‘found’ what it is looking for has a hard-coded string in it. It is searching for a hard-coded value. This method is the “EndsWithSaurus” method. Notice how it is doing a string comparison on “saurus”.
This doesn’t help me at all! Unless I know at design time what I would be searching for. But what if I have two lists. One list is a master list of countries in Asia and the other list is a supposed subset of that list. Let’s say it is a list provided from user input. I want to validate the list the user has provided contains only countries that exist in Asia. So I need to iterate through each item in the user-provided list and for each iteration of the loop determine if the current item exists in the Asia master list. Something like this:
[code lang='csharp']
private void validateUserInput(List listWithValuesEnteredByUser)
{
List asiaList = MethodThatReturnsAllCountriesInAsia();
foreach (string current in listWithValuesEnteredByUser)
{
// determine here if the value exists
}
}
[/code]
Now I’m suddenly in a sport where I can’t hardcode the search routine that Find, FindIndex or FindAll might use. We all know that hard coding is a bad idea anyway so we shouldn’t get away with it even if we can but you see my point. And since we all know hard coding is bad, we all also know making use of a wider scoped variable in this case would be bad as well right? We shouldn’t even look twice at that temptation. It’s a hack and a work around as seen here:
[code lang='csharp']
// ugly work around hack
string _widelyScopedVariable = string.Empty;
private void validateUserInput(List listWithValuesEnteredByUser)
{
List asiaList = MethodThatReturnsAllCountriesInAsia();
foreach (string current in listWithValuesEnteredByUser)
{
// ugly work around hack
_widelyScopedVariable = current;
if (string.IsNullOrEmpty(asiaList.Find(findCountry)))
{
// country not found. user input found to be invalid
}
}
}
private bool findCountry(string input)
{
// ugly work around hack
return _widelyScopedVariable == input;
}
[/code]
The Answer
So what is the correct way of doing the above? Anonymous delegates. With anonymous delegates we can embed a method that will share the scope of our loop. This means it will have access to “current” which is what we are searching for. Here is some sample code:
[code lang='csharp']
private void validateUserInput(List listWithValuesEnteredByUser)
{
List asiaList = MethodThatReturnsAllCountriesInAsia();
foreach (string current in listWithValuesEnteredByUser)
{
if (string.IsNullOrEmpty(asiaList.Find(delegate(string input)
{
return (input == current);
})))
{
// country not found. user input found to be invalid
}
}
}
[/code]
Wrapup
Not only does this significantly shorten the code we have to write (a definite plus) but we also don’t have to use any ugly hacks.
It is important to note that “input” is defined as a string because the list I am searching through is a list of strings (List). If for instance you have a generic list defined like: List<Animal> then the parameter changes from “string input” to “Animal input”.
It is also important to say that since it cannot be assumed that your generic list is sorted and in some cases can’t be sorted it is not possible for this built-in routine to search your list using a binary search and therefore is likely to be less efficient than the dictionary search.