"To one" or "To many" that's the question.
In the last chapter, we have learnt about different generic list-based containers. Those are great for maintaining simple lists. However, they are not the best for storing associative relationships among different entities.
For example:
You might want to know how many electronic gadgets are available from
amazon.com
that are below $20A football fanatic friend of mine wants to keep a count of how many times Real Madrid won over other teams in recent times
Your friend at the local electronics store dreams about a system that can auto-complete electronic part names
These are some of the examples you can think of, where you would need a mechanism to associate one variable with another. The fun factor is that all these situations can be represented conceptually using the same structure—the same structure where some fields will be accessible very quickly using some other unique variable. This structure is known as Dictionary in C#. Using Generics, you can build a common structure that can be used in different applications.
Reading this chapter and following the exercises, you shall learn the following:
Types of Generic Associative Structures and how to use them
How to create your own custom generic associative structure
There are two types of associative containers/structures available in .NET Generics. They are as follows:
Key-value pairs can be of two types. One type allows duplicate keys. These are represented by a collection of KeyValuePair<TKey,TValue>
objects. The other type doesn't allow duplicate keys. These are represented by Dictionary<TKey,TValue>
and SortedDictionary<TKey,TValue>
. SortedDictionary<TKey,TValue>
keeps the keys in sorted order unlike Dictionary<TKey,TValue>
, where the keys are not stored in any particular order.
Tuples are a new inclusion to the .NET 4.0 Framework and like KeyValuePair<TKey,TValue>
they also allow duplicate keys. However, KeyValuePair<TKey,TValue>
can be viewed as a special case of a Tuple. A Tuple allows you to store the relationship between n different variables (they may be of different types). However a Key-value pair is what the name suggests. It's a pair. Only one variable is associated with the other. If you need to store a relationship between three or more variables, you need a Tuple.
Tag cloud is a fascinating way to visualize content. Tag clouds, generated from text sources, give an impression about the text. It speaks for the text.
Here is a tag cloud that I generated pointing my program to Apple's iPad page. It's almost evident from the tag cloud that the Apple iPad will arrive at the store from April and you can pre-order. All this without even rolling our eyeballs over the page:
Create a console application. Call it TagCloud
.
Add the following in the Main()
method of Program.cs
:
Dictionary<string, int> wordHistogram = new Dictionary<string, int>(); wordHistogram.Add("Apple",33); wordHistogram.Add("Orange",85); wordHistogram.Add("Strawberry",20); wordHistogram.Add("Watermelon",150); wordHistogram.Add("Guava", 52); wordHistogram.Add("Grape", 80); ShowCloud(wordHistogram);
Copy the Sample Code
from the following location and save it as TagCloudData.txt
in the in the C:
drive:
http://visapi-gadgets.googlecode.com/svn/trunk/termcloud/doc.html
Replace the highlighted part of this file (as shown in the following screenshot) with a place holder !DATA!
:
Delete the highlighted lines and replace these with !DATA!
.
So, after replacement it will look something similar to the following screenshot:
Add the following method in Program.cs
:
private static void ShowCloud(Dictionary<string,int> wordHistogram) { //Creating the data to replace the placeholder StringBuilder cloudBuilder = new StringBuilder(); cloudBuilder.AppendLine("data.addRows(" + wordHistogram.Keys.Count.ToString() + ")"); int i = 0; foreach (string word in wordHistogram.Keys) { cloudBuilder.AppendLine("data.setValue(" + i.ToString() + ",0,'" + word + "'),"); cloudBuilder.AppendLine("data.setValue(" + i.ToString() + ",1," + wordHistogram[word] + ");"); i++; } //Replacing the placeholder //If you can't put the file in C: drive. Put the file //anywhere else where you have access StreamReader sr = new StreamReader("C:\TagCloudData.txt"); string total = sr.ReadToEnd().Replace("!DATA!", cloudBuilder.ToString()); sr.Close(); //Writing the content to a temporary local file StreamWriter sw = new StreamWriter("C:\Cloud.html"); sw.Write(total); sw.Close(); //Showing the generated cloud. System.Diagnostics.Process.Start("C:\Cloud.html"); }
Compile and run the application. You will need to be connected to the internet while using this application as it uses the Google Visualization API.
As you execute the program, it will launch your default browser and will show the following tag cloud:
Watermelon is the biggest tag, because it has the highest frequency assigned (150). A word histogram is nothing but a tabular view of a tag cloud. Every word has a weight. In the real world, the weight can be the frequency of occurrence in a text.
The declaration: Dictionary<string, int> wordHistogram = new Dictionary<string, int>();
creates a dictionary, where the keys are of type string and the value is of type integer. This dictionary is used to store the histogram of the words. As you see, Strawberry has the lowest weight and Watermelon has the largest. So, they appear smallest and largest in the tag cloud respectively.
We are using the Google Visualization API for generating the tag cloud. So, we somehow need to replace the hardcoded values from the sample HTML file. That's being done in the method ShowCloud()
.
All the keys can be obtained by the
Keys
property of the Dictionary
class. It returns a KeyCollection
object. Dictionary
—similar to SortedList
in the previous chapter—implements the IDictionary
interface and thus, we can find the value associated with a key by indexing, using the key.
Thus, wordHistogram[word]
will return the weight associated with the word in the dictionary wordHistogram
.
We have now created a console application to generate tag clouds from a single source. Can you extend the program, such that it can generate tag clouds from several text sources?
Hint: You can use a dictionary of dictionaries as follows:
Dictionary<string, Dictionary<string,int>>
The first string in the preceding dictionary is the type of key of the outer dictionary which represents the source file paths, and the inner dictionary stores the frequency of each word appearing in those files; or in other words histogram for those files.
Which structure would you use when mapping country versus capital?
Dictionary<string,string>
Dictionary<string,List<string>>
Which structure would you use when mapping country versus state-wise per-capita income? Say, we represent per-capita income in decimal.
Dictionary<string,decimal>
Dictionary<string, Dictionary<string,decimal>>
Dictionary<string,List<decimal>>
How would you refer to the stock price of MSFT in the month of June 2011, if they are stored like this in a dictionary:
Dictionary<string,Dictionary<string,double>> stockPrices = new Dictionary<string,Dictionary<string,double>>(); stockPrices.Add("MSFT",new Dictionary<string,double>()); stockPrices["MSFT"].Add("May2011",24.22);//Hypothetical stock price. stockPrices["MSFT"].Add("June2011",34.22);//Hypothetical stock price.
Sometimes we all get bored and need an easy break from boredom. Bubble wrap popping is one of the most popular games. It doesn't require any brain power, but it is still a lot of fun. I have seen people popping virtual bubbles for several reasons. It acts very well as a stress buster too. My wife loves this game.
This is a paradise for any bubble popper. It launches with a screen full of bubbles. As you click on a bubble it emits a realistic bubble pop sound and changes the look of the pattern such that it looks as if the bubble you just clicked is punctured. So it looks somewhat similar to the following screenshot. I have minimized it:
This is a simple game and we can make it using C# Dictionary. Let's get into action.
Create a Windows application.
Add the following code in Form1.cs
:
Dictionary<string, bool> bubbleStatus = new Dictionary<string,bool>(); private void Form1_Load(object sender, EventArgs e) { //Lets find the width of the form int totalWidth = this.Width; //Lets find the height of the form int totalHeight = this.Height; //Lets see how many bubbles we can accommodate per row int perRow = totalWidth / 45; //Lets see how many bubbles we can accommodate per column int perCol = totalHeight / 45; //Bubbles will be images on picturebox controls //This is the first bubble. PictureBox pic = new PictureBox(); pic.Name = "pic0"; pic.Cursor = System.Windows.Forms.Cursors.Hand; pic.Width = 45; pic.Height = 45; //Bubbles have to match the background image //of the form for a realistic look pic.BackgroundImage = Image.FromFile("C:\bubbleBackGround.jpg"); //Loading the normal bubble image pic.ImageLocation = "C:\bubble.jpg"; //Attaching a click event to handle the click. //When user clicks a bubble it should play the pop sound //and change the image to a popped bubble image to //create a more realistic special effect pic.Click += new EventHandler(pic_Click); //adding the first bubble to the form's control this.Controls.Add(pic); //Remembering where we painted the first bubble on the form Point lastLocation = pic.Location; //Making a copy, we are going to need this Point firstLocation = pic.Location ; //Adding this bubble to the dictionary. //Right now nobody popped it //So the status is false. bubbleStatus.Add(pic.Name, false); for (int r = 1; r <= perCol; r++) { firstLocation = pic.Location; for (int c = 1; c <= perRow; c++) { //Creating bubbles on the fly pic = new PictureBox(); pic.BackgroundImage = Image.FromFile("C:\bubbleBackGround.jpg"); pic.Name = "pic" + (r*10+c).ToString (); pic.Cursor = System.Windows.Forms.Cursors.Hand; pic.Width = 45; pic.Height = 45; pic.ImageLocation = "C:\bubble.jpg"; pic.Click += new EventHandler(pic_Click); //Checking if There is already a bubble //in the dictionary with this ID if (!bubbleStatus.ContainsKey(pic.Name)) bubbleStatus.Add(pic.Name, false); else { //Change the ID arbitrarily. It doesn't //have to be sequential //We are just going to need a way to //access the status //of a bubble given its ID pic.Name = pic.Name + c.ToString(); //add this new bubble to the dictionary bubbleStatus.Add(pic.Name, false); } //Are we at the edge? //If not we can still go on adding on the same column if (c % perRow != 0) pic.Location = new Point(lastLocation.X + pic.Width,lastLocation.Y); else //OOPs! we fell off the edge, //we ran out of space to render any more //bubbles in the current row. So lets go back to the second //row where we started. pic.Location = new Point(firstLocation.X , firstLocation.Y + pic.Height); //add the current bubble to the controls of the form this.Controls.Add(pic); //Remember where you added this last bubble. lastLocation = pic.Location; } } } void pic_Click(object sender, EventArgs e) { PictureBox pic = (PictureBox)sender; if (bubbleStatus[pic.Name] == false) { //This bubble is Popped!! bubbleStatus[pic.Name] = true; //Play bubble wrap pop sound System.Media.SoundPlayer p = new System.Media.SoundPlayer("C:\BubblePop.wav"); p.Play(); //Change the image to give an impression that it //actually popped. pic.ImageLocation = "C:\bubblePopped.jpg"; } }
See, every bubble can either be normal or already popped. So the status of a bubble in the game board is binary. Thus, it is best to describe their status using a Boolean field. Now, how do we identify one bubble from the other? Well, every bubble has to have an ID. As we don't know how many bubbles we shall have, depending on the screen size the number may vary, so generating the ID at runtime seems an obvious choice.
We could also have done it using a list. However, if we did it using a list, identifying which bubble has just popped would have been time consuming and difficult as we would have needed a sequential search. Dictionary, on the other hand, offers very fast access as it keeps the entries indexed by the key. In this case, Key
is the Name
of the picture boxes that show the bubble images.
The dictionary: Dictionary<string, bool> bubbleStatus = new Dictionary<string, bool>();
stores the status of each bubble on the board. When a new bubble is created and gets its Name
, then that is naturally not popped. At this point, the bubble is added to the dictionary as follows:
bubbleStatus.Add(pic.Name, false);
A false
status means that the bubble has still not been popped. When the bubble is clicked, a check is made to see whether that bubble has already popped or not—by de-referencing the bubble status by its name—using the following code:
if (bubbleStatus[pic.Name] == false)
However, if this condition evaluates to be true, then the program changes the status of the bubble from false
to true
or conceptually from normal to popped if you assign the value true
to the status using the following code snippet:
bubbleStatus[pic.Name] = true;
I thought it would be a good idea to see how the dictionary stores the elements in runtime, here is a snapshot of the middle of a game:
Check that the first, third, fifth, and sixth bubbles are popped. Check their status from the dictionary. The first entry is highlighted. |
You can see the game in action at http://sudipta.posterous.com/bubble-wrap-popper-in-c.
We live in an interesting time and remembering everything exactly could be very challenging. For example, imagine yourself in the following scenarios:
Coding in C# without intelligence support.
Trying to find an Integrated Circuit (IC) from the local electronics store for your school project, but you can't remember the exact number of the IC.
You work in a drugstore and a customer only knows first few letters of a drug. There are hundreds that start with those and you can't guess.
You work for a travel website and customers are expecting intelligent drop-downs, where city names get completed as they type.
All these situations are example of autocomplete feature that completes the entry for the user. It is easy to create a generic structure using the generic C# dictionary that can support autocomplete.
The objective of the design is that it has to be user driven. Once written, we should be able to support the autocomplete facility for any custom list of values.
Follow the given steps:
Create a class library project. Call it MyDictionary
.
Rename Class1.cs
to MultiDictionary.cs
.
Modify the MultiDictionary
class header as follows:
public class MultiDictionary<Tkey,TValue> where Tkey : IComparable
Add the following variable and the constructor:
Dictionary<Tkey, List<TValue>> innerDS; public MultiDictionary() { innerDS = new Dictionary<Tkey, List<TValue>>(); }
Add the following method to add a single entry to the dictionary:
public void Add(Tkey key, TValue val) { if (!innerDS.ContainsKey(key)) { List<TValue> values = new List<TValue>(); values.Add(val); innerDS.Add(key, values); } else { innerDS[key].Add(val); } }
Add the following method:
/// <summary> /// Find all the values for a given key /// </summary> /// <param name="key">The given key</param> /// <returns>Values associated with this key</returns> public IEnumerable<TValue> Values(Tkey key) { List<TValue> values = new List<TValue>(); innerDS.TryGetValue(key, out values); return values; }
Add the reference of MyDictionary
to this console application project and add the following lines to the Program.cs
:
using MyDictionary;
And in the Main()
method
, add the following code:
static void Main(string[] args) { MultiDictionary<string, string> authorBookMap = new MultiDictionary<string, string>(); authorBookMap.Add("Sudipta", "Data Structure using C"); authorBookMap.Add("Sudipta", ".NET Generics Beginners Guide"); var booksBySudipta = authorBookMap.Values("Sudipta"); foreach (string bookTitle in booksBySudipta) Console.WriteLine(bookTitle); Console.ReadLine(); }
Compile and run the console application project.
As you execute the program, you will see the following output:
Data Structure using C .NET Generics Beginners Guide
You might be thinking what's the point in discussing this in the autocomplete app? The reason is, for autocomplete we need to be able to map multiple entries to be associated with a single tag. In other words, we should be able to have duplicate keys in the structure.
Till now, we have used dictionaries to map one variable with another single variable. That's one-to-one mapping to be precise. However, for autocomplete, we need a one-to-many mapping capability. So, we should be able to map each key of the dictionary with a list of values. That's exactly what I did in this class MultiDictionary
.
Pictorially MultiDictionary
will look similar to the following diagram:
In the Add()
method of the MultiDictionary
class, it checks whether the Key
is already present. If it is already present, we just add the value to the associated list of that Key
. However, if it is not present, we create a blank list, add the new value to that list and then associate that list with the given key. However, the consumers of MultiDictionary
get an impression that they can have duplicate keys.
Let's create an application showing how MultiDictionary
can be used to offer an autocomplete feature.
Follow the given steps:
Stay in the class library project where we created the MultiDictionary
project. Add a class called AutoComplete.cs
.
Mark the class as public and add using System.IO;
in the using
directive list:
public class AutoComplete
Add the following variables and the constructor:
List<string> values = new List<string>(); MultiDictionary<string, string> autoCompleteMap; string sourceFile; int minimumLength; public AutoComplete(string file, int min) { autoCompleteMap = new MultiDictionary<string, string>(); sourceFile = file; minimumLength = min; StreamReader reader = new StreamReader(sourceFile); string line = string.Empty; while ((line = reader.ReadLine()) != null) { if (line.Length > minimumLength) autoCompleteMap.Add(line.Substring(0, minimumLength), line); } reader.Close(); }
Add the following method:
public List<string> GetSuggestions(string initial) { if (initial.Length == minimumLength) { values.Clear(); List<string> vals = autoCompleteMap.Values(initial).ToList(); values.AddRange(vals); } if (initial.Length > minimumLength) { List<string> currentItems = new List<string>(); foreach (object s in values) currentItems.Add(s.ToString()); values.Clear(); foreach (string k in currentItems) if (k.StartsWith(initial)) values.Add(k); } return values; }
Now, create a Windows application to test this. Add a textbox (textBox1
) as follows:
Now, add these two variables and add a reference to MyDictionary
, and add the following using
directive also:
using MyDictionary; ListBox suggestionBox = new ListBox(); AutoComplete suggester;
Add code for the form_Load
event
as follows:
private void Form1_Load(object sender, EventArgs e) { suggestionBox.Font = textBox1.Font; suggestionBox.DoubleClick+=new EventHandler (suggestionBox_DoubleClick); suggestionBox.KeyDown += new KeyEventHandler(suggestionBox_KeyDown); suggester = new AutoComplete("UK_Cities.txt", 2); }
Add the following event:
private void textBox1_TextChanged(object sender, EventArgs e) { if (textBox1.Text.Length < 2) { suggestionBox.Visible = false; suggestionBox.Items.Clear(); } else { try { List<string> values = suggester.GetSuggestions(textBox1.Text); suggestionBox.Items.Clear(); foreach (string value in values) suggestionBox.Items.Add(value); ShowListBox(); } catch (Exception ex) { return;//Don't do anything. } } }
Add the following method:
private void ShowListBox() { suggestionBox.Location = new Point(textBox1.Location.X, textBox1.Location.Y + textBox1.Height); suggestionBox.Width = textBox1.Width; suggestionBox.Height = 10 * textBox1.Height; suggestionBox.Visible = true; this.Controls.Add(suggestionBox); suggestionBox.BringToFront(); }
Add the following events:
void suggestionBox_KeyDown(object sender, KeyEventArgs e) { if (e.KeyData == Keys.Enter) { textBox1.Text = suggestionBox.Text; suggestionBox.Visible = false; } } private void suggestionBox_DoubleClick(object sender, EventArgs e) { textBox1.Text = suggestionBox.Text; suggestionBox.Visible = false; } private void textBox1_KeyDown(object sender, KeyEventArgs e) { if (e.KeyData == Keys.Down) { suggestionBox.Focus(); if (suggestionBox.Visible) suggestionBox.SelectedIndex = 0; } }
Compile and run the program.
As you execute the program, you will see the following output:
As you start typing, after two characters you shall received autocomplete suggestions, as shown in the following screenshot:
You can find a video of this output in my blog at: http://sudipta.posterous.com/auto-complete-feature-demo-video-written-in-c#.
The code described in steps 9 and 10 is for making the autocomplete feature more useful. As you press the down arrow key, it will set the focus on the listbox so that you can scroll for the right one in the suggestions. If you press Enter while one entry in the listbox is highlighted, that entry will be copied to the textbox and the same will happen if you double-click any item in the listbox. These make the program more real-world ready.
The heart and soul of this program is the following code snippet:
suggester = new AutoComplete("UK_Cities.txt", 2);
This creates an AutoComplete
object. This tells the program that it must populate the entries from the file Uk_Cities.txt
and it should offer suggestions as soon as the first two characters are typed in the textbox.
In the constructor of the AutoComplete
class, the following code snippet:
autoCompleteMap.Add(line.Substring(0, minimumLength), line);
stores the words in a MultiDictionary
where Keys
of the MultiDictionary
are prefixes of length minimumLength
. One entry in this MultiDictionary
is as follows:
The most interesting method is the GetSuggestion()
method as it calculates the list of suggestions with every changing few initial letters. If the length of the initial text is more than the minimum length for offering a suggestion, then the already suggested strings are filtered depending on whether it starts with the same pattern as the input string.
For example, as soon as you type Gl in the box, you will see Glasgow and Gloucester in the list as they both start with Gl. However, if you keep typing, then the input string of Gla will only match Glasgow and show that in the suggestion box.
While we deal with dictionaries, we should try to avoid the biggest possible error that might crash a program at runtime. Let's see how.
The most common pitfall is what is known as a hole problem. It means you are trying to access the value for a key from a dictionary which doesn't exist. This throws a KeyNotFoundException
. In order to avoid this, you can use the TryGetValue()
method that first checks whether the key is present and then tries to de-reference that.
Here is a typical use:
Dictionary<string, int> histogram = new Dictionary<string, int>(); histogram.Add("a", 1); histogram.Add("b", 2); //The following call will throw KeyNotFoundException int x = histogram["c"];int y; //The following call will set y to default value for int, which is 0 histogram.TryGetValue("c", out y);
The piano is one of the closest analogies of a C# Dictionary (at least conceptually) in the real world. Every key of the piano is associated with a particular note. In this example, we shall create a table-top piano with 12 keys (A, B, C, D, E, F, G, B#, C#, E#, F#, G#) as shown in the following screenshot:
This would have the following functionalities:
Users should be able to play it by clicking on the keys
Users should be able to play it by pressing keys on the keyboard
Users should be able to customize the key settings to their preference
It should be able to record a note and play it back
It should show labels for each key on demand (as shown in the preceding screenshot)
The preceding screenshot might look very fancy, but it is composed with only winform buttons and labels. I have skipped the details for laying out the controls and their event handlers. However, you can find all that from the website including the background image and the .wav
files used.
Create a Windows form. Place the controls as shown in this video: http://sudipta.posterous.com/private/eweuvoFJwb. Eventually, your layout should look like the one shown in the preceding screenshot. As it is a complex layout, I have created the video to help you.
Add the following variables to Form1.cs
:
bool startRecording = false;
This will be set to true
when the app starts recording keystrokes.
Add the following field to the Form1.cs
:
Add the following methods:
/// <summary> /// Plays the .WAV file associated with the pressed key /// </summary> /// <param name="pianoKey">The piano key that is pressed</param> private void PlayPiano(string pianoKey) { try { System.Media.SoundPlayer p = new System.Media.SoundPlayer(pianoNotes[pianoKey]); p.Play(); } catch (KeyNotFoundException ex) { return; } } /// <summary> /// This method fills the key settings /// with recommended values by yours truly! /// Every time the program starts these values will be loaded. /// So even if the users make any customization, for now, that /// will not be sticky. /// </summary> private void FactoryReset() { keySettings.Add("A", 'A'), keySettings.Add("B", 'B'), keySettings.Add("C", 'C'), keySettings.Add("D", 'D'), keySettings.Add("E", 'E'), keySettings.Add("F", 'F'), keySettings.Add("G", 'G'), keySettings.Add("B#", 'Q'), keySettings.Add("C#", 'W'), keySettings.Add("E#", 'R'), keySettings.Add("F#", 'T'), keySettings.Add("G#", 'H'), } /// <summary> /// Populates the drop down boxes with the recommended keys /// </summary> private void PopulateKeys() { cmbA.Text = alphabet[alphabet.IndexOf(keySettings["A"])].ToString(); cmbB.Text = alphabet[alphabet.IndexOf(keySettings["B"])].ToString(); cmbC.Text = alphabet[alphabet.IndexOf(keySettings["C"])].ToString(); cmbD.Text = alphabet[alphabet.IndexOf(keySettings["D"])].ToString(); cmbE.Text = alphabet[alphabet.IndexOf(keySettings["E"])].ToString(); cmbF.Text = alphabet[alphabet.IndexOf(keySettings["F"])].ToString(); cmbG.Text = alphabet[alphabet.IndexOf(keySettings["G"])].ToString(); cmbBSharp.Text = alphabet[alphabet.IndexOf(keySettings["B#"])].ToString(); cmbCSharp.Text = alphabet[alphabet.IndexOf(keySettings["C#"])].ToString(); cmbESharp.Text = alphabet[alphabet.IndexOf(keySettings["E#"])].ToString(); cmbFSharp.Text = alphabet[alphabet.IndexOf(keySettings["F#"])].ToString(); cmbGSharp.Text = alphabet[alphabet.IndexOf(keySettings["G#"])].ToString(); }
Add the following code in Form_Load
:
private void Form1_Load(object sender, EventArgs e) { //Associating the .WAV files with the keystrokes pianoNotes.Add("A", "A.wav"); pianoNotes.Add("B", "B.wav"); pianoNotes.Add("B#", "B#.wav"); pianoNotes.Add("C", "C.wav"); pianoNotes.Add("C#", "C#.wav"); pianoNotes.Add("D", "D.wav"); pianoNotes.Add("E", "E.wav"); pianoNotes.Add("E#", "E#.wav"); pianoNotes.Add("F", "F.wav"); pianoNotes.Add("F#", "F#.wav"); pianoNotes.Add("G", "G.wav"); pianoNotes.Add("G#", "G#.wav"); FactoryReset(); PopulateKeys(); }
Add event handlers for all the buttons (representing piano keys) as follows:
private void btnA_Click(object sender, EventArgs e) { //Showing what key is pressed. lblKeyStroke.Text = "A"; PlayPiano("A"); //If user intended to start recording, //let's remember this key stroke. if (startRecording) keyPressRecord.Add (new KeyValuePair<string, DateTime>("A", DateTime.Now)); }
The helper button's name is btnShowLables
and add the following event handler for that:
private void btnShowLables_Click(object sender, EventArgs e) { if (btnShowLables.Text == "Help") { keyLabelA.Visible = true; keyLabelB.Visible = true; keyLabelC.Visible = true; keyLabelD.Visible = true; keyLabelE.Visible = true; keyLabelF.Visible = true; keyLabelG.Visible = true; keyLabelBSharp.Visible = true; keyLabelCSharp.Visible = true; keyLabelESharp.Visible = true; keyLableFSharp.Visible = true; keyLabelGSharp.Visible = true; grpKeySettings.Visible = true; btnShowLables.Text = "Helping"; } else { keyLabelA.Visible = false; keyLabelB.Visible = false; keyLabelC.Visible = false; keyLabelD.Visible = false; keyLabelE.Visible = false; keyLabelF.Visible = false; keyLabelG.Visible = false; keyLabelBSharp.Visible = false; keyLabelCSharp.Visible = false; keyLabelESharp.Visible = false; keyLableFSharp.Visible = false; keyLabelGSharp.Visible = false; grpKeySettings.Visible = false ; btnShowLables.Text = "Help"; } }
Once you compile and run the application, it will show the piano interface. You can click on each key to play.
Now, let's see what our code can do so far.
These two dictionaries: Dictionary<string, string> pianoNotes = new Dictionary<string, string>();
and Dictionary<string, char> keySettings = new Dictionary<string, char>();
are the heart and soul of this application.
The first one, pianoNotes
, associates a piano key to a given .wav
file. So, as you click on that button, that associated media file (.wav
) could be played.
The second one, maps the keyboard keys to the piano keys. For example, there is no key on the keyboard to directly show "C#". So the key W is mapped to "C#". All these are being done in the FactoryReset()
method.
As soon as you click on any button, the PlayPiano()
method plays the associated media file.
This is done using key-based indexing of Dictionaries. For example, if you click on btnA
then pianoNotes[pianoKey]
will return the value of the key A in the dictionary pianoNotes
, which is A.wav
.
Now, suppose you want to play the "C#" note. Which keyboard key should you click on? The answer is you should click btnW
or press W on the keyboard; which is the associated keyboard key for piano key "C#".
This type of single unique key and value association is known as one-to-one mapping. And the type of dictionary we created for autocomplete is called one-to-many mapping.
Well, we are using List<KeyValuePair<string, DateTime>> keyPressRecord = new List<KeyValuePair<string,DateTime>>();
for that. As the same piano note can be played many times during a session, we can't use a dictionary to record keystrokes, because the dictionary can't have duplicate keys. So, the solution is a list of KeyValuePair<TKey,TValue>
class objects that allows us to store KeyValuePair
with the same key. The value type is DateTime
, because we need to remember exactly when which key was pressed, if we need to play it back. The duration between each such key press will be determined by finding the difference between the values of consecutive DateTime items.
When I played and recorded my keystrokes, for a while it was stored in the list of key-value pairs as follows:
So you see, I waited two seconds after I hit the first key A. So while playing it back, we must play the second note (which is B in this case) after two seconds of playing the first note.
Follow the given steps:
Add the following method in Form1.cs
:
/// <summary> /// Switch on recording. /// </summary> private void btnRecord_Click(object sender, EventArgs e) { if (btnRecord.Text.Equals("Record")) { //let's remember this key stroke. keyPressRecord.Clear(); //We need to start recording. startRecording = true; btnRecord.Text = "Stop"; } else { startRecording = false ; btnRecord.Text = "Record"; } }
Add the following method and event in Form1.cs
:
private void SleepInBetween(TimeSpan span) { System.Threading.Thread.Sleep(span.Minutes); System.Threading.Thread.Sleep(span.Seconds); System.Threading.Thread.Sleep(span.Milliseconds); } /// <summary> /// Plays the recorded keystrokes /// </summary> private void btnPlay_Click(object sender, EventArgs e) { try { int i; TimeSpan span; for (i = 0; i < keyPressRecord.Count - 1; i++) { PlayPiano(keyPressRecord[i].Key); span = keyPressRecord[i + 1].Value .Subtract(keyPressRecord[i].Value); //Lets wait till the timespan between //these two key-strokes are spent. SleepInBetween(span); } span = keyPressRecord[i].Value.Subtract(keyPressRecord[i - 1].Value); SleepInBetween(span); PlayPiano(keyPressRecord[i].Key); } catch (Exception ex) { //Let's go back return; } }
We are iterating over the recorded keystrokes using this loop: for (i = 0; i < keyPressRecord.Count - 1; i++)
. keyPressRecord[i].Key
gives the ith key pressed by the user from the start.
keyPressRecord[i + 1].Value.Subtract(keyPressRecord[i].Value);
returns the time difference between the two consecutive recorded keystrokes. The program needs to sleep during this time. Once we reach out of this loop, we need to play the last recorded keystroke. You can access the key
and value
of a KeyValuePair
by public property Key
and Value
.
You can see the piano being played at http://sudipta.posterous.com/my-table-top-piano-12-keys.
Now that you are aware of the KeyValuePair
, let's see another interesting situation where this structure can be used.
Pattern recognition is an emerging science. K Nearest Neighbor (popularly known as KNN) is a supervised learning algorithm that can be used in binary classification problems very efficiently.
A binary classification problem is just what it states. It is a classification problem. Typically, classification of several entries are previously known and depending on that a new entry has to be classified in either of the two categories. Thus it is called binary.
Suppose we have records of several patients who are diagnosed with either malignant (class 'M' cancerous) or benign (class 'B' not cancerous) cases. Now, we gather details from the tests for a new patient. Depending on what we know previously from past patient records, can we classify the new patient's case as either a malignant case or a benign case? That's where binary classification helps in real life.
KNN uses a voting mechanism to solve this problem. It considers test results as a vector. If two vectors, representing two patient's test records, are near then they are probably suffering from the same type of case (either benign or malignant). Now this mechanism can lead to confusing results sometimes, if only the nearest vector is considered. So, K nearest vectors are considered. Thus the name KNN.
In this example, we shall see how C# Dictionary-based data structures can be used to solve this problem.
Create a class library project.
Add a class to the project. Call it Entry
. Add the following code:
public class Entry { public string Id { get; set; } public string Tag { get; set; } public List<double> ParamValues { get; set; } }
Add another class to this project. Call it KNN
.
Add the following variable and the constructor:
/// <summary> /// Adds a patient record to the list /// </summary> /// <param name="entry"></param> public void Add(Entry entry) { map.Add(new KeyValuePair<string,List<double>>(entry.Tag, entry.ParamValues)); } /// <summary> /// Adds an entry to the list /// </summary> /// <param name="tag">Class of the new entry</param> /// <param name="entries">Patient Records</param> public void Add(string tag, List<double> entries) { map.Add(new KeyValuePair<string, List<double>>(tag, entries)); } /// <summary> /// Calculates the Euclidean Distance between two entries /// </summary> /// <param name="index1">Index of the first entry in the /// list</param> /// <param name="index2">Index of the second entry in the /// list</param>/// <returns> /// To know more about Euclidean Distance please refer/wiki/Euclidean_distance /// </returns> private double distance(int index1,int index2) { double sum = 0; for (int i = 0; i < map[index1].Value.Count; i++) { sum += Math.Pow(Math.Round(map[index1].Value[i] – map[index2].Value[i],2),2); } return Math.Round(Math.Sqrt(sum),2); }
/// <summary> /// Predicts the suspected class of the input data /// </summary> /// <param name="entries">Values. In this case value of several /// parameters</param> /// <param name="class1">The first of the binary classes this data /// set can belong to. In this example, either "B" or "M" /// </param> /// <param name="class2">The second of the binary classes this /// data set can belong to. In this example either "B" or "M" /// </param> /// <param name="k">Number of nearest neighbor for we need to /// consider to conclude a class.</param> /// <remarks>For more information visit http://en.wikipedia.org//// wiki/K-nearest_neighbor_algorithm</remarks> public string Predict(List<double> entries, string class1, string class2, int k) { //Dictionary to keep the entries indexed and sorted by their //distance from the new values SortedDictionary<double, string> distanceMap = new SortedDictionary<double, string>(); int count = 0; this.Add("X", entries);//Right now it is unknown int lastIndex = map.Count - 1; int firstIndex = 0; //Lets start counting at 0 double minDistance = distance(firstIndex, lastIndex); for (int i = firstIndex + 1; i < lastIndex - 1; i++) { double currentDistance = distance(i, lastIndex); if(!distanceMap.ContainsKey(currentDistance)) distanceMap.Add( currentDistance,map[i].Key); } map.RemoveAt(map.Count - 1);//lets remove the last entry Dictionary<double, string> kVals = new Dictionary<double, string>(); foreach (double key in distanceMap.Keys) { if (count == k) break; kVals.Add(key, distanceMap[key]); } //Finding Class of the new entry "class1" int class1Count = 0; int class2Count = 0; foreach (double key in kVals.Keys) { if (kVals[key] == class1) class1Count++; else class2Count++; } return class1Count > class2Count ? class1 : class2; }
You can compile the project but can't run it yet. It doesn't have any data. We shall get to that in a while. Let's see what we created.
The class Entry
represents a particular patient's record. Records from different tests are obtained and we are using numeric record values.
The class KNN
implements the KNN
algorithm. The inner data structure of the KNN
class is: List<KeyValuePair<string, List<double>>>
.
Let's see why. We have a list of patient records, where the patients might be suffering from either of the two cases and there could be multiple patients suffering from the same class of disease. So a plain dictionary is ruled out as it can't support duplicate keys. We can't use a MultiDictionary
—described earlier in the chapter—because we would need integer indexing. So, we need a key-value pair list where the key
is the class of disease and the value
is the list of values of different test results.
So pictorially, we are using the structure shown in the following screenshot:
The Predict()
method is doing all the magic. It predicts the possible class of the new patient entry. It uses a sorted dictionary to keep a list of entries in the ascending order of their distance from the new entry:
SortedDictionary<double, string> distanceMap = new SortedDictionary<double, string>();
This dictionary keeps the class tags and the distance of the patient's record data entries from the newly-added unidentified one, ordered by the distance because tags will be duplicate, however, distance can't be in most instances. And if they are, we don't need to add that entry to the dictionary as it actually is a duplicate entry.
So the entries in this dictionary will look similar to the following:
The index on the left is the value for k
. So the first entry k=1
is the closest to an entry which is of type M.
Once we have this dictionary in the Predict()
method, we need to consider only the first k
entries of the SortedDictionary distanceMap
. If the number of entries among these first k
elements having tag value class1
is more than that of entries having tag value class2
, then the new unidentified entry will be marked as class1
else it will be marked as class2
. That's how the voting works.
Download patient records from the following website:
http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data..
If you are interested to know what are these values check out: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.names.
Copy about one-third of the entries to another file and delete them from this one. Save the new file as wdbctest.txt
and the left over entries in the original file as wdbc.txt
. You can get these files from the book website.
Stay in the class library project and add a class called TextReader.cs
:
public class TextReadHelper { private string fileName;//Delimited source file that has // patient records private int tagIndex; //Index of the class of disease this // patient is suffering private int idIndex;//Index of the ID of the patient in // the delimited file string delim; //delimiter public TextReadHelper(string fileName,int tagIndex,int idIndex,string delim) { //id,tag,values //tag,values this.fileName = fileName; this.tagIndex = tagIndex; this.idIndex = idIndex; this.delim = delim; } /// <summary> /// Returns the list of patient records /// </summary> public List<Entry> GetEntries() { //Include System.IO for this line of code StreamReader sr = new StreamReader(fileName); string line = string.Empty; List<Entry> entries = new List<Entry>(); while ((line = sr.ReadLine()) != null) { string[] tokens = line.Split(new string[] {delim}, StringSplitOptions.RemoveEmptyEntries); Entry current = new Entry(); if(idIndex!=-1) current.Id = tokens[idIndex]; current.Tag = tokens[tagIndex]; current.ParamValues = new List<double>(); for (int i = 2; i < tokens.Length; i++) current.ParamValues.Add(Convert.ToDouble(tokens[i])); entries.Add(current); } sr.Close(); return entries; } }
Create a console application and add a reference of the class library we built earlier.
Add the following using
directive at the beginning:
using ClassLibrary1;
Add the following code in the Main()
method of the console application:
static void Main(string[] args) { TextReadHelper helper = new TextReadHelper("C:\wdata.txt",1,0,","); KNN knnHelper = new KNN(); List<Entry> entries = helper.GetEntries(); foreach (Entry current in entries) knnHelper.Add(current); int k = 3; string tag = knnHelper.Predict(new List<double>(){21.56, 22.39, 142, 1479, 0.111, 0.1159, 0.2439, 0.1389, 0.1726, 0.05623, 1.176, 1.256, 7.673, 158.7, 0.0103, 0.02891, 0.05198, 0.02454,0.01114, 0.004239, 25.45, 26.4, 166.1, 2027, 0.141, 0.2113, 0.4107, 0.2216, 0.206,0.07115 },"M", "B", k); Console.WriteLine("Suspected diagnoses " + tag); Console.ReadLine(); }
As you execute the program, it will print the following output:
Suspected diagnoses M
The call to the
Predict()
method takes only a new patient's test data record, saves the patient ID and tag
. k
is set to 3. So in this case, we are considering three nearest neighbors for reaching a conclusion about the possible disease class the new patient is suffering from.
I thought it would be a good idea to see the final dictionary of distance mapped entries. So, here it is for the call we made from Main()
:
Tuples are a new inclusion in the .NET 4.0 framework and they are a perfect fit for many situations, some of them are as follows:
To represent a database table row in the code.
To replace all those dummy classes that have no methods and are just acting as placeholders for several types of objects.
To bring association between more than a couple of types. Have you ever found yourself using Dictionary of Dictionaries? Then you shall find Tuples really useful.
Tuples can be used to refactor a nested branching statement. In this example, we shall create a number rearranging game that I used to play in my childhood.
Follow the given steps:
Create a Windows application. Call it TilO
.
Add nine button controls. Also add one label
control and one pictureBox
control, as shown in the following screenshot. Add large borders to buttons ("10"), make them flat, and name them as follows:
Add the following variables in the Form1.cs
:
bool mute = false; int totalMoves = 0; List<Tuple<int, int, int>> board = new List<Tuple<int, int, int>>(); Dictionary<int, Button> buttons = new Dictionary<int, Button>();
Add the following methods to draw the board
properly. You can customize the colors:
private void DrawBoard() { int emptyIsAt = board.First(t => t.Item3 == 0).Item1; DrawEmpty(emptyIsAt); //Draw Rest for (int i = 0; i < board.Count; i++) { if (board[i].Item3 != 0) { DrawOthers(board[i].Item1, board[i].Item3); } } } private void DrawOthers(int index,int number) { buttons[index].Text = number.ToString(); buttons[index].BackColor = Color.White; } private void DrawEmpty(int index) { buttons[index].BackColor = Color.Black; buttons[index].Text = string.Empty; }
Add the following methods to check whether the tiles the players want to swap can actually move or not. If so, the second method DoMove()
handles the movement:
private bool CanMove(int first, int second) { if (first != 0 && second != 0) return false; int firstIndex = board.First(tuple => tuple.Item3 == first).Item1; int secondIndex = board.First(tuple => tuple.Item3 == second).Item1; int absDiff = Math.Abs(firstIndex - secondIndex); //Handle Edge cases if (absDiff == 1 && Math.Min(firstIndex, secondIndex) % 3 == 0) return false; if (absDiff == 1 || absDiff == 3) return true; else return false; } private void DoMove(string moveCommand, bool silent) { string[] moveThese = moveCommand.Split(new char[] { '=' }, StringSplitOptions.RemoveEmptyEntries); //If some player just clicks the Empty box then it will //generate //an invalid command "0". And in this time moveThese will be //of length 1. if (moveThese.Length == 2) { int first = Convert.ToInt16(moveThese[0]); int second = Convert.ToInt16(moveThese[1]); bool CanTheseBeMoved = CanMove(first, second); if (CanTheseBeMoved) { //Move int firstIndex = board.First(t => t.Item3 == first).Item1; int secondIndex = board.First(t => t.Item3 == second).Item1; int expectedFirst = board.First(t => t.Item3 == first).Item2; int expectedSecond = board.First(t => t.Item3 == second).Item2; Tuple<int, int, int> newFirstTuple = new Tuple<int, int, int>(firstIndex, expectedFirst, second); Tuple<int, int, int> newSecondTuple = new Tuple<int, int, int>(secondIndex, expectedSecond, first); board.RemoveAt(firstIndex - 1); board.Insert(firstIndex - 1, newFirstTuple); board.RemoveAt(secondIndex - 1); board.Insert(secondIndex - 1, newSecondTuple); if (!mute) { System.Media.SoundPlayer player = new System.Media.SoundPlayer("Blip.wav"); player.Play(); } totalMoves++; lblMoves.Text = String.Format("You have made {0} moves so far.", totalMoves); } else { if (!silent && !mute) { System.Media.SoundPlayer player = new System.Media.SoundPlayer("Error.wav"); player.Play(); } } } }
Add the following methods to initialize board
and buttons
:
private void InitializeStartUpBoard() { board.Add(new Tuple<int, int, int>(1, 1, 8)); board.Add(new Tuple<int, int, int>(2, 2, 7)); board.Add(new Tuple<int, int, int>(3, 3, 6)); board.Add(new Tuple<int, int, int>(4, 4, 5)); board.Add(new Tuple<int, int, int>(5, 5, 4)); board.Add(new Tuple<int, int, int>(6, 6, 3)); board.Add(new Tuple<int, int, int>(7, 7, 2)); board.Add(new Tuple<int, int, int>(8, 8, 1)); board.Add(new Tuple<int, int, int>(9, 0, 0)); } private void InitializeNumberButtonMap() { buttons.Add(1, btn1); buttons.Add(2, btn2); buttons.Add(3, btn3); buttons.Add(4, btn4); buttons.Add(5, btn5); buttons.Add(6, btn6); buttons.Add(7, btn7); buttons.Add(8, btn8); buttons.Add(9, btn9); }
Add these event handlers to handle the events of button Click
and volume control pictureBox Click
:
void but_Click(object sender, EventArgs e) { Button button = (Button)sender; DoMove(String.Format("{0}=0", button.Text), false); DrawBoard(); GameOverYet(); } //This is for volume toggle. You can skip it. private void pictureBox1_Click(object sender, EventArgs e) { if (mute) pictureBox1.ImageLocation = @"Volume_img.jpg"; else pictureBox1.ImageLocation = @"Mute_img.JPG"; //Switch the state mute = !mute; }
Add the following code to randomize the board. Getting the same initial board every time is not what we want:
private void RandomizeBoard() { //generate random move commands and move them whenever possible //finally return the modified board string randomMoveCommand = string.Empty; int total = new Random().Next(100); for (int i = 0; i < total; i++) { try { randomMoveCommand = new Random().Next(9).ToString() + "=0"; System.Threading.Thread.Sleep(10); DoMove(randomMoveCommand, true); } catch { //This randomly generated move is illegal. Don't worry, just go. continue; } } }
Add the following code for the Form1_Load
event:
private void Form1_Load(object sender, EventArgs e) { //Attach the event handler for al the buttons this.Controls.OfType<Button>().ToList() ForEach(but => but.Click+=new EventHandler(but_Click)); //Show the Volume Control image pictureBox1.ImageLocation = @"Volume_img.JPG"; buttons.Clear(); board.Clear(); InitializeNumberButtonMap(); InitializeStartUpBoard(); RandomizeBoard(); lblMoves.Text = "How many moves do you think ?"; //When we were randomizing the board, there were some moves //(probably) //But we want to give the player a fresh start. //So set the total Moves count to Zero. //totalMoves = 0; //Draw the initial Board DrawBoard(); }
Add the following method:
private void GameOverYet() { if (board.Where(tuple => tuple.Item2 != tuple.Item3).Count() == 0) { MessageBox.Show("Game Over! Congratulations!"); } }
As you execute the program, you will see something similar to the following screenshot:
If you click on any of the numbered tiles, if it is adjacent to the black empty tile, it will swap its position with the empty tile. For example, in this case 5, 7, 4, and 1 are only eligible to swap their positions. As a tile moves, it will emit a sound. If you don't want sound, click on the speaker icon at the bottom-right.
You can take a look at the completed game demo video at my blog: http://sudipta.posterous.com/number-re-arrange-game-an-application-of-tupl.
The board at any time can be represented as a list of three different integers attached together:
The index of the tile
The expected value of the tile in order for the game to end
The actual value of the tile in that index right now
Tips!
Tuples are a great way of representing database tables. There are eight overloaded versions of Tuple constructors. You can use one of them or use the static Create()
method of the Tuple
class. The last parameter for the last overload is another Tuple itself. So, you can plug in another Tuple if you need more than seven parameters.
For example, the board we have now and our goal (to finish the game) board are as follows:
If we had to represent this without Tuples, we would have had to create a dump placeholder class with three values and create a list of such class objects. But the construct: List<Tuple<int, int, int>> board = new List<Tuple<int, int, int>>();
helps remove that special need. A list of Tuples represents the game board
in this case. The first item of the Tuple is the index of the tile, the second is the expected numeric value on that tile, the third is the actual numeric value on that tile right now.
The InitializeStartUpBoard()
method initializes the board
with some default values. The next RandomizeBoard()
method
randomizes the board
using some randomly generated moves. Some of these random moves might/will fail and that's ok.
Here is how the board looks after initialization is complete and randomization is yet to start:
One disadvantage some people find using Tuples is that they reduce the readability of the code, because we can't name the items.
The method DoMove()
moves two tiles visually. Actually, it swaps those two Tuples with changed entries. The following code:
int firstIndex = board.First(t => t.Item3 == first).Item1;
finds the index of the first integer in the command passed. So, if the command passed is 3=0
then first is 3
and firstIndex
will store the index of tile with text 3
in the board.
First()
is an extension method. We shall discuss this in detail in the next chapter, but for now just use it as it makes the code much cleaner.
The following lines:
Tuple<int, int, int> newFirstTuple = new Tuple<int, int, int>(firstIndex, expectedFirst, second); Tuple<int, int, int> newSecondTuple = new Tuple<int, int, int>(secondIndex, expectedSecond, first);
create two Tuples with swapped actual values. Now, the older Tuples are deleted and replaced with these new Tuples in their positions by the following lines:
board.RemoveAt(firstIndex - 1); //removes the old tuple board.Insert(firstIndex - 1, newFirstTuple); //Add the new tuple board.RemoveAt(secondIndex - 1); board.Insert(secondIndex - 1, newSecondTuple);
Inserting at the correct position is very important because the DoMove()
method relies on it.
The game will be over if Item2
and Item3
for each Tuple on the board become the same. Because Item2
is the expected value of the tile and Item3
is the current value, if there is no such tile that has a mismatch between Item2
and Item3
, then we can conclude that the game is over.
board.Where(tuple => tuple.Item2 != tuple.Item3)
returns all those Tuples as an IEnumerable<Tuple<int,int,int>>
where Item2
and Item3
are not the same. If the count of such entries is 0
then we can conclude that the game is over.
Now that you know list-based containers and dictionaries, can you add a recording feature to the number rearrange game. This should enable players to record their moves from the initial board positions. Being able to save and retrieve a saved game would be great. Remember that there can be several players starting from the same board.
Hint: You might want to use this structure to store and replay games for users:
List<Tuple<string, List<Tuple<int, int, int>>, List<string>>> games
The first string
in the Tuple
will represent the name of the player, the inner Tuple
one will represent the game board and the last List
of string
will represent the moves the player used to solve the puzzle or wherever he/she saved the game last.
Tuples have been around for a long time in functional programming languages. Recently, Microsoft research created a new programming language called F#, which has great support for Tuples. Although it is not related to Generics, it is a great thing to know how C# is borrowing from other languages and growing.
We learned a lot in this chapter about .NET Generics dictionary-based containers.
Specifically, we covered:
Dictionary<TKey,TValue>
SortedDictionary<TKey,TValue>
KeyValuePair<TKey,TValue>
Tuples
How to design dictionaries to support one-to-many mapping
We also discussed how to create a custom generic dictionary using the inbuilt generic containers. Now that we've learned about all the lists and dictionary-based containers, we're ready to learn more about how to query them better using LINQ. That's the topic of the next chapter.
18.226.187.194