<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Gio Carlo Cielo</title>
	<atom:link href="http://www.giocc.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.giocc.com</link>
	<description>a personal discourse</description>
	<lastBuildDate>Mon, 13 May 2013 01:32:54 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Concise Implementation of Minimax through Higher-Order Functions</title>
		<link>http://www.giocc.com/concise-implementation-of-minimax-through-higher-order-functions.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=concise-implementation-of-minimax-through-higher-order-functions</link>
		<comments>http://www.giocc.com/concise-implementation-of-minimax-through-higher-order-functions.html#comments</comments>
		<pubDate>Mon, 13 May 2013 01:27:44 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Ingenuity]]></category>
		<category><![CDATA[ai]]></category>
		<category><![CDATA[chess]]></category>
		<category><![CDATA[function]]></category>
		<category><![CDATA[higher-order]]></category>
		<category><![CDATA[minimax]]></category>
		<category><![CDATA[othello]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1562</guid>
		<description><![CDATA[The Minimax algorithm is the core of several game-playing AI for making decisions on the best move. This algorithm finds the best move for an AI on a two-player, combinatorial game state on games such as Checkers, Chess or Othello. In this post, I assume that the reader is familiar with the algorithm and its [...]]]></description>
			<content:encoded><![CDATA[<p>The Minimax algorithm is at the core of many game-playing AIs for deciding on the best move. It finds the best move for a player in two-player, combinatorial games such as Checkers, Chess or Othello.</p>
<p>In this post, I assume that the reader is familiar with the algorithm and its inherent code size due to its frequent implementation with three distinct functions. I will begin by briefly describing a standard implementation of Minimax and then I will introduce a concise implementation using higher-order functions. Note that we will use Python and Haskell as pseudo-code.</p>
<p><span id="more-1562"></span></p>
<h2>Two-Player, Combinatorial Game Representation</h2>
<p>We will begin with a simple description of two-player, combinatorial games abstractly. Games that can be categorized in this way include Checkers, Chess and Othello. However, we will focus on the abstraction of these games by representing their game state with a minimal interface necessary such that the Minimax algorithm can be applied to it. That is, the game should</p>
<ul>
<li>Provide a set of available moves given a specified game state,</li>
<li>Obtain the next state of a game given a current state and a move,</li>
<li>And determine whether the game is over.</li>
</ul>
<blockquote><div class="icon idea"></div>
<p>Although it may be more coherent to use Python pseudo-code, Python does not have a syntax for interfaces; consequently, we will outline the interface using Haskell&#8217;s syntax for a typeclass.</p></blockquote>
<pre><code class="haskell">class Game a where
	get_available_moves :: a -> [Move]
	next_state :: a -> Move -> a
	is_gameover :: a -> Bool</code></pre>
<p>Now that we know methods associated with a particular game state, we may access them as a standard method call in Python: <code>game_state.get_available_moves()</code>, <code>game_state.next_state(move)</code> and <code>game_state.is_gameover()</code>.</p>
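<p>Python can approximate the same interface with an abstract base class from the standard <code>abc</code> module. The sketch below is only one possible rendering; the class names <code>GameState</code>, <code>FinishedGame</code> and <code>Incomplete</code> are hypothetical and appear nowhere else in this post.</p>

```python
from abc import ABC, abstractmethod


class GameState(ABC):
    """Hypothetical Python counterpart of the Game typeclass."""

    @abstractmethod
    def get_available_moves(self):
        """Return a list of the moves that are legal in this state."""

    @abstractmethod
    def next_state(self, move):
        """Return the state reached by playing `move`."""

    @abstractmethod
    def is_gameover(self):
        """Return True when no further play is possible."""


class FinishedGame(GameState):
    """Trivial implementation for illustration: a game that is already over."""

    def get_available_moves(self):
        return []

    def next_state(self, move):
        raise ValueError("no moves remain")

    def is_gameover(self):
        return True


class Incomplete(GameState):
    """Omits two of the three methods, so it cannot be instantiated."""

    def is_gameover(self):
        return True


rejected = False
try:
    Incomplete()
except TypeError:
    # abc refuses to build an object with unimplemented abstract methods.
    rejected = True

print(FinishedGame().is_gameover(), rejected)
```

<p>Nothing in Minimax requires this base class, since Python only needs the three methods to exist at call time; the class merely documents the contract and fails early when a method is missing.</p>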
<h2>Standard Implementation of Minimax</h2>
<p>The standard implementation of the Minimax algorithm frequently includes three functions: <code>minimax(game_state)</code>, <code>min_play(game_state)</code> and <code>max_play(game_state)</code>. We will begin with the <code>minimax(game_state)</code> declaration. Note that I use Python here as working pseudo-code.</p>
<pre><code class="python">def minimax(game_state):
	moves = game_state.get_available_moves()
	best_move = moves[0]
	best_score = float('-inf')
	for move in moves:
		clone = game_state.next_state(move)
		score = min_play(clone)
		if score > best_score:
			best_move = move
			best_score = score
	return best_move
</code></pre>
<p>To summarize, Minimax is given a game state, obtains a set of valid moves from the game state, simulates all valid moves on clones of the game state, evaluates each game state which follows a valid move and finally returns the best move. </p>
<p>The following two helper functions simulate play between both the opposing player and the current player through the <code>min_play</code> and <code>max_play</code> procedures respectively. With the aid of these two helper functions, the entire game tree is traversed recursively given the current state of the game.</p>
<pre><code class="python">def min_play(game_state):
	if game_state.is_gameover():
		return eval(game_state)
	moves = game_state.get_available_moves()
	best_score = float('inf')
	for move in moves:
		clone = game_state.next_state(move)
		score = max_play(clone)
		if score < best_score:
			best_move = move
			best_score = score
	return best_score

def max_play(game_state):
	if game_state.is_gameover():
		return eval(game_state)
	moves = game_state.get_available_moves()
	best_score = float('-inf')
	for move in moves:
		clone = game_state.next_state(move)
		score = min_play(clone)
		if score > best_score:
			best_move = move
			best_score = score
	return best_score
</code></pre>
<p>In particular, the opponent intends to minimize the current player&#8217;s score and the current player intends to maximize their own score. Note that the helper functions short-circuit and return early if the game is over.</p>
<blockquote><div class="icon idea"></div>
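<p>Notice that the scores are calculated through the <code>eval(game_state)</code> procedure. The implementation is omitted because it depends on the game itself; however, by convention, we say that the current player wins if the score is <code>INF</code> and loses if the score is <code>-INF</code>. In runnable Python the procedure needs a name other than <code>eval</code>, which would otherwise resolve to the built-in of the same name; the final listing calls it <code>evaluate</code>.</p></blockquote>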
<p>There are 35 lines (sans the blank newlines) in the current implementation of our algorithm. We will now reduce the number of lines by a factor of two using higher-order functions.</p>
<h2>Concise Implementation of Minimax</h2>
<p>Minimax finds the maximum of a set of scores for the current player and the minimum of a set of scores for the opposing player. Hence, it is natural to reach for the built-in <code>max()</code> and <code>min()</code> procedures, which function exactly as we need them to.</p>
<p>Let us begin by modifying the <code>min_play(game_state)</code> procedure.</p>
<p>First, the opposing player must check if the game is over and evaluate the game state if necessary.</p>
<pre><code class="python">def min_play(game_state):
	if game_state.is_gameover():
		return eval(game_state)</code></pre>
<p>Second, the opposing player wants to return the minimum score over all game states that follow valid moves.</p>
<pre><code class="python">def min_play(game_state):
	if game_state.is_gameover():
		return eval(game_state)
	return min(scores) # Incomplete</code></pre>
<p>We know how to obtain the set of valid moves and we know how to obtain the next game state given a valid move; however, we want to return the set of scores associated with the game states which follow valid moves. Consequently, we must <code>map()</code> each valid move to the evaluation (or score) of the game state that follows it, which gives a set of scores that can be minimized.</p>
<pre><code class="python">def min_play(game_state):
	if game_state.is_gameover():
		return eval(game_state)
	return min(
		map(lambda move: max_play(game_state.next_state(move)),
			game_state.get_available_moves()))</code></pre>
<p>This procedure is now complete; however, I will briefly review a few key points. The lambda takes a move and returns the evaluation of the game state that follows it. We then <code>map()</code> that lambda over the set of valid moves to obtain a set of evaluations that can be minimized.</p>
<blockquote><div class="icon idea"></div>
<p>The higher-order <code>map(fn, list)</code> function applies a function from type \(A\) to type \(B\) to every element of a list of type \(A\), producing a list of type \(B\):</p>
<p><center>$$fn : A \to B$$</center></p>
<p>In our example, we map a list of moves to a list of evaluations of the game states that follow them.</p></blockquote>
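<p>As a small, self-contained illustration of <code>map()</code> feeding <code>min()</code>, consider the sketch below. The dictionary <code>scores_by_move</code> is a made-up stand-in for the recursive call <code>max_play(game_state.next_state(move))</code>.</p>

```python
# Hypothetical scores; in Minimax these would come from the recursive call.
scores_by_move = {"a": 3, "b": -1, "c": 2}
moves = ["a", "b", "c"]

# In Python 3, map() returns a lazy iterator; min() consumes it directly.
scores = map(lambda move: scores_by_move[move], moves)
lowest = min(scores)

print(lowest)  # -1
```

<p>Because the iterator is consumed by <code>min()</code>, it cannot be reused afterwards; wrap it in <code>list()</code> if the scores are needed more than once.</p>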
<p>Without loss of generality, the <code>max_play()</code> is similarly defined except that it uses the <code>max()</code> function for maximization. The further difficulty lies in the <code>minimax()</code> procedure which has the additional requirement of returning a valid move rather than a score alone.</p>
<p>To account for the additional requirement in the <code>minimax()</code> procedure, we modify the lambda of <code>map()</code> and add a key to <code>max()</code>. That is, we define the lambda to take a <code>move</code> and return a tuple containing <code>(move, score)</code>. Now that we have a tuple, we must decide which component to maximize over; we specify it using the <code>key</code> keyword argument of <code>max()</code>. Note that the procedure returns the whole <code>(move, score)</code> tuple, so the move itself is its first element. The final procedure is defined below:</p>
<pre><code class="python">def minimax(game_state):
	return max(
		map(lambda move: (move, min_play(game_state.next_state(move))), 
			game_state.get_available_moves()), 
		key = lambda x: x[1])</code></pre>
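<p>To see what the <code>key</code> argument does in isolation, here is a minimal sketch with made-up moves and scores. The names in <code>scored_moves</code> are purely illustrative.</p>

```python
# Hypothetical (move, score) pairs, as produced by the lambda in minimax().
scored_moves = [("left", 1), ("up", 3), ("right", 2)]

# key selects the component to compare; max() still returns the whole tuple.
best_move, best_score = max(scored_moves, key=lambda pair: pair[1])

print(best_move, best_score)  # up 3
```

<p>Without the key, tuples would be compared element by element starting with the move, which is not what we want.</p>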
<p>Now that all of the procedures have been redefined, we will see the final code:</p>
<pre><code class="python">def minimax(game_state):
	return max(
		map(lambda move: (move, min_play(game_state.next_state(move))), 
			game_state.get_available_moves()), 
		key = lambda x: x[1])

def min_play(game_state):
	if game_state.is_gameover():
		return evaluate(game_state)
	return min(
		map(lambda move: max_play(game_state.next_state(move)),
			game_state.get_available_moves()))

def max_play(game_state):
	if game_state.is_gameover():
		return evaluate(game_state)
	return max(
		map(lambda move: min_play(game_state.next_state(move)),
			game_state.get_available_moves()))</code></pre>
<p>The total line count (sans the blank newlines) is 17. Without expanding the higher-order functions across several lines, there are 10 lines. In both accounts, the number of lines of code has been reduced by at least two-fold. For a working algorithm on an implementation of a game, see my <a href="https://github.com/Hydrotoast/Hexapawn" title="Hexapawn">Hexapawn GitHub repository</a>.</p>
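<p>To try the concise procedures without a full game, the sketch below pairs them with a toy game that I assume purely for illustration: players alternately take one or two stones from a pile, and whoever takes the last stone wins. The class <code>NimState</code>, its <code>maximizer_to_move</code> flag and this particular <code>evaluate()</code> are hypothetical; they are not taken from the Hexapawn repository.</p>

```python
class NimState:
    """Toy game (an assumption): take 1 or 2 stones; last stone wins."""

    def __init__(self, stones, maximizer_to_move=True):
        self.stones = stones
        # True when it is the turn of the player minimax() chooses for.
        self.maximizer_to_move = maximizer_to_move

    def get_available_moves(self):
        return [n for n in (1, 2) if n <= self.stones]

    def next_state(self, move):
        return NimState(self.stones - move, not self.maximizer_to_move)

    def is_gameover(self):
        return self.stones == 0


def evaluate(game_state):
    # At a terminal state, whoever moved last took the final stone and won.
    # If the maximizer is now to move, the opponent made that last move.
    return float('-inf') if game_state.maximizer_to_move else float('inf')


def min_play(game_state):
    if game_state.is_gameover():
        return evaluate(game_state)
    return min(map(lambda move: max_play(game_state.next_state(move)),
                   game_state.get_available_moves()))


def max_play(game_state):
    if game_state.is_gameover():
        return evaluate(game_state)
    return max(map(lambda move: min_play(game_state.next_state(move)),
                   game_state.get_available_moves()))


def minimax(game_state):
    return max(map(lambda move: (move, min_play(game_state.next_state(move))),
                   game_state.get_available_moves()),
               key=lambda x: x[1])


move, score = minimax(NimState(4))
print(move, score)  # 1 inf
```

<p>With four stones, taking one leaves the opponent with three, a losing pile in this game, so the search reports move <code>1</code> with a winning score.</p>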
<h2>Conclusion</h2>
<p>It is easy to see that this implementation with higher-order functions is concise while maintaining readability of code (unlike our Perl friends in Code Golf). In general, several algorithms fit the problem that higher-order functions solve: composition of operations on larger data sets.</p>
<p>Focusing on the atomic components that an algorithm operates on, such as the elements of a set, will not reduce the asymptotic lower bound on the amount of code written. Higher-level abstractions over such atomic elements are needed to lower that bound. Sorting offers an analogy: any algorithm that sorts purely by comparing and swapping pairs of elements needs \(\Omega(n \lg n)\) comparisons in the worst case; however, non-comparison sorts such as counting sort and radix sort run in linear time on bounded keys, and Bucket sort does so on average for uniformly distributed input, precisely because they use operations outside of simple comparison and swap.</p>
<p>This particular implementation of Minimax could be reduced further into a single line of code using a fact that the Negamax variant of Minimax highlights for us:</p>
<p><center>$$\max(a, b) = -\min(-a, -b)$$</center></p>
<p>I leave this as an exercise for the ambitious.</p>
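<p>For readers who want a hint, here is one possible shape of the idea, sketched over an explicit game tree rather than the game-state interface used above. The tree encoding is my assumption: an inner node is a list of children, and a leaf is a number giving the score from the viewpoint of the player who would move at that leaf.</p>

```python
def negamax(node):
    """Score `node` from the viewpoint of the player about to move."""
    if not isinstance(node, list):
        return node  # leaf: already scored for the player to move
    # A child's score belongs to the opponent, so negate it before maximizing.
    return max(-negamax(child) for child in node)


# Two plies: we move, the opponent replies, then the leaves are scored for us.
tree = [[3, 5], [-2, 9]]
print(negamax(tree))  # 3
```

<p>The result agrees with the two-function formulation, \(\max(\min(3, 5), \min(-2, 9)) = 3\), while needing only one recursive procedure.</p>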
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/concise-implementation-of-minimax-through-higher-order-functions.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Designing a Linux Resource Manager in C++</title>
		<link>http://www.giocc.com/designing-a-linux-resource-manager-in-c.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=designing-a-linux-resource-manager-in-c</link>
		<comments>http://www.giocc.com/designing-a-linux-resource-manager-in-c.html#comments</comments>
		<pubDate>Fri, 22 Mar 2013 07:01:31 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Inspiration]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[crtp]]></category>
		<category><![CDATA[curiously recursive template pattern]]></category>
		<category><![CDATA[getrlimit]]></category>
		<category><![CDATA[getrusage]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[system call]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1543</guid>
		<description><![CDATA[By the fervor of Linus Torvalds, there does not exist any immediate C++ or OOP interfaces to operating system services. Consequently, it is sometimes necessary to wrap a logical set of Linux system calls in a C++ wrapper. In this post, I will demonstrate a standard process of wrapping the resource limitation and usage system [...]]]></description>
			<content:encoded><![CDATA[<p>By the fervor of Linus Torvalds, there do not exist any immediate C++ or OOP interfaces to operating system services. Consequently, it is sometimes necessary to wrap a logical set of Linux system calls in a C++ wrapper. In this post, I will demonstrate a standard process of wrapping the resource limitation and usage system calls in a <code>ResourceManager</code> singleton service while utilizing some nifty C++ tricks such as CRTP.</p>
<p><span id="more-1543"></span></p>
<h2>Motivation</h2>
<p>A procedural style offers the same behavior as an OOP style would; however, OOP makes good use of abstractions by providing common design patterns. Consequently, later refactoring becomes easier.</p>
<p>Because this tutorial primarily regards the design of the resource manager, I will not provide implementation details beyond the header declarations; however, the full source can be found on my <a href="https://github.com/Hydrotoast/ResourceManager" title="ResourceManager">GitHub repository</a>.</p>
<h2>Linux Resource Limitation and Usage Specifications</h2>
<p>Before designing a C++ wrapper, we must begin by noting some of the nuances in the Linux specification. In particular, the system calls associated with resource limitations and usage can be found in the Linux manual pages as <code>getrlimit()</code>, <code>setrlimit()</code> and <code>getrusage()</code>. The manual pages describe them as follows:</p>
<blockquote><div class="icon quote"></div>
<p>The <code>getrlimit()</code> and <code>setrlimit()</code> system calls get and set resource limits respectively.  Each resource has an associated soft and hard limit, as defined by the <code>rlimit</code> structure.</p>
<p><code>getrusage()</code> returns resource usage measures for <code>who</code>, which can be one of the following: <code>RUSAGE_SELF</code>, <code>RUSAGE_CHILDREN</code>, <code>RUSAGE_THREAD</code>. The resource usages are returned in the structure pointed to by <code>usage</code>.
</p></blockquote>
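<p>Before wrapping these calls in C++, it can help to poke at them interactively. Python's standard <code>resource</code> module (Unix only) exposes the same three calls, so the following sketch, which is an aside and not part of the C++ design, shows what they return. Note that Python reports the CPU times as floating-point seconds rather than as <code>timeval</code> structures.</p>

```python
import resource

# getrlimit returns the (soft, hard) pair for one resource.
soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
print("CPU limit (soft, hard):", soft, hard)

# Re-applying the current values is a harmless way to exercise setrlimit.
resource.setrlimit(resource.RLIMIT_CPU, (soft, hard))

# getrusage returns the usage figures for the chosen `who`.
usage = resource.getrusage(resource.RUSAGE_SELF)
print("user CPU seconds:", usage.ru_utime)
print("system CPU seconds:", usage.ru_stime)
print("maximum resident set size:", usage.ru_maxrss)
```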
<p>Although these system calls are relatively straightforward, they also require a few constants and structures: <code>rlimit</code> and <code>rusage</code>. The constants such as <code>RLIMIT_CPU</code> simply define the type of resource to limit; these constants can be looked up in the manual. The structures are defined as follows:</p>
<pre><code class="c">struct rlimit {
	rlim_t rlim_cur;  /* Soft limit */
	rlim_t rlim_max;  /* Hard limit (ceiling for rlim_cur) */
};

struct rusage {
	struct timeval ru_utime; /* user CPU time used */
	struct timeval ru_stime; /* system CPU time used */
	long   ru_maxrss;        /* maximum resident set size */
	long   ru_ixrss;         /* integral shared memory size */
	long   ru_idrss;         /* integral unshared data size */
	long   ru_isrss;         /* integral unshared stack size */
	long   ru_minflt;        /* page reclaims (soft page faults) */
	long   ru_majflt;        /* page faults (hard page faults) */
	long   ru_nswap;         /* swaps */
	long   ru_inblock;       /* block input operations */
	long   ru_oublock;       /* block output operations */
	long   ru_msgsnd;        /* IPC messages sent */
	long   ru_msgrcv;        /* IPC messages received */
	long   ru_nsignals;      /* signals received */
	long   ru_nvcsw;         /* voluntary context switches */
	long   ru_nivcsw;        /* involuntary context switches */
};

// necessary for struct rusage
struct timeval {
	time_t         tv_sec;     /* seconds */
	suseconds_t    tv_usec;    /* microseconds */
};
</code></pre>
<p>These structures can easily be transformed into classes. However, before we design the resource manager, let us describe the difference in programming styles that we expect.</p>
<h2>Programming Styles</h2>
<p>These structures, along with the aforementioned system calls, entail a procedural style of resource management. Consider the following procedural code:</p>
<pre><code class="c">struct rlimit cpu;
cpu.rlim_cur = 3;
cpu.rlim_max = 3;
setrlimit(RLIMIT_CPU, &#038;cpu);</code></pre>
<p>This takes quite a few lines, which an OOP style can often reduce to a single line:</p>
<pre><code class="cpp">ResourceManager::instance().set_resource_limit(ResourceLimit(RLIMIT_CPU, 3, 3));</code></pre>
<p>Although obviously longer as a single line, this does not logically conflict with an OOP programming paradigm. There is no performance gain from changing the style to OOP; however, placing the code into a wrapper enables us to handle it as a <em>cross-cutting concern</em> and therefore enables refactoring when the time comes. Recall that the primary benefit is making use of the <em>design patterns</em> that OOP offers.</p>
<h2>The Structural Design</h2>
<p>Before we implement this wrapper, we must consider the high-level design. Allow us to begin with a driver program as our test case which will outline how we would like to use the interface. For simplicity, we will use <code>main</code> as our driver.</p>
<pre><code class="cpp">int main(int argc, char* argv[])
{
	// Acquire a single instance of the manager
	ResourceManager&#038; manager = ResourceManager::instance();

	// Set resource limits
	manager.set_resource_limit(ResourceLimit(RLIMIT_CPU, 3, 3));
	manager.set_resource_limit(ResourceLimit(RLIMIT_RTTIME, 3, 3));
	manager.apply_limits();

	// Do some computation?
	fib(31);

	// Obtain a resource usage report
	ResourceUsage usage = manager.get_resource_usage(RUSAGE_SELF);

	// Print the resource usage report
	cout << "User Time (sec): " << usage.utime().seconds << endl;
	cout << "User Time (usec): " << usage.utime().microseconds << endl;
	cout << "System Time (sec): " << usage.stime().seconds << endl;
	cout << "System Time (usec): " << usage.stime().microseconds << endl;

	return 0;
}</code></pre>
<p>Alright, so we have shown how we would like to use this interface. We will now consider the objects in use.</p>
<dl>
<dt><code>ResourceManager</code></dt>
<dd>Service that applies and tracks our resource limits.</dd>
<dt><code>ResourceLimit</code></dt>
<dd>Descriptor of a resource limit.</dd>
<dt><code>ResourceUsage</code></dt>
<dd>Descriptor of a resource usage report.</dd>
</dl>
<p>We must now consider a few nuances to our objects. First, we will look at the <code>ResourceManager</code>. In particular, it is only logical to have one, global instance of a <code>ResourceManager</code> because it applies to the entire process. Consequently, we should use the <em>Singleton</em> pattern. Further, because the manager tracks the currently applied resource limits and there exists a finite (and uniquely identifiable) set of resource limits, we will use a <code>std::set</code> data structure provided by STL.</p>
<blockquote><div class="icon idea"></div>
<p>The singleton design pattern ensures that only one instance of a class is instantiated throughout the lifetime of a program.</p></blockquote>
<p>Since <code>ResourceLimit</code> and <code>ResourceUsage</code> are simply descriptors, it is easy enough to implement them as straightforward classes. Operationally, <code>ResourceLimit</code> should be able to <code>apply()</code> itself. <code>ResourceUsage</code> should be immutable: the underlying usage figures change continuously as the process runs, so each object is best treated as a fixed snapshot taken at one moment.</p>
<p>Now that we have outlined the structure, we will begin implementing the entire component.</p>
<h2>Descriptor Design</h2>
<p>We will begin with the most straightforward part, the descriptors. Specifically, we will begin with the <code>ResourceLimit</code> implementation. The header file should proceed as follows:</p>
<pre><code class="cpp">#include &lt;sys/resource.h&gt;

class ResourceLimit
{
public:
	// constructors
	ResourceLimit(int);
	ResourceLimit(int, rlim_t, rlim_t);
	ResourceLimit(int, rlimit&#038;);

	// accessors
	rlim_t soft_limit() const;
	rlim_t hard_limit() const;

	// native accessors
	const rlimit&#038; to_rlimit() const;

	// operational functions
	void get_limit();
	void apply();

	// comparisons
	bool operator==(const ResourceLimit&#038;) const;
	bool operator<(const ResourceLimit&#038;) const;
private:
	int resource_id_;
	rlimit limit_;

	bool applied_;
};</code></pre>
<p>As we can see, the necessary accessors are provided and the system-provided <code>rlimit</code> structure is hidden within the class. Further, the comparison operators are provided for when the class is used in the <code>ResourceManager</code>'s set. </p>
<p>Before we begin designing the <code>ResourceUsage</code> class, we must define a structure to replace the system-provided <code>timeval</code> structure.</p>
<pre><code class="cpp">/**
 * An alternative representation of timevals
 */
class TimeParts
{
public:
	long seconds;
	long microseconds;

	// constructors
	TimeParts(long seconds, long microseconds) :
		seconds(seconds), microseconds(microseconds) {}

	TimeParts(timeval tv) :
		seconds(tv.tv_sec), microseconds(tv.tv_usec) {}
};</code></pre>
<p>Now that we have redefined a class for <code>timeval</code>, we begin the <code>ResourceUsage</code> class which is, of course, <em>immutable</em>.</p>
<blockquote><div class="icon idea"></div>
<p>Immutable objects are non-modifiable after creation. They are necessary for implementing systems in which side effects are circumvented.</p></blockquote>
<pre><code class="cpp">#include &lt;sys/resource.h&gt;
#include &lt;sys/time.h&gt;

/**
 * A descriptor of the resource usage of the
 * current process. This object is immutable
 * after construction.
 */
class ResourceUsage
{
public:
	// constructors
	ResourceUsage(int);
	
	// time accessors
	TimeParts utime() const;
	TimeParts stime() const;

	// accessors
	long maxrss() const;
	long ixrss() const;
	long idrss() const;
	long isrss() const;
	long minflt() const;
	long majflt() const;
	long nswap() const;
	long inblock() const;
	long oublock() const;
	long msgsnd() const;
	long msgrcv() const;
	long nsignals() const;
	long nvcsw() const;
	long nivcsw() const;
private:
	int who_;

	rusage usage_;
};</code></pre>
<p>This descriptor also directly models the structure provided by the system; however, note that this design enables absolute immutability. With C-style structures, the data fields would be modifiable which is absolutely unnecessary.</p>
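<p>The value of an immutable snapshot is easy to demonstrate outside of C++ as well. The sketch below is only an analogy in Python: <code>UsageSnapshot</code> and its <code>capture()</code> method are hypothetical names, and only three of the usage fields are copied.</p>

```python
import resource
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class UsageSnapshot:
    """Hypothetical read-only view of a few getrusage() fields."""
    utime: float
    stime: float
    maxrss: int

    @classmethod
    def capture(cls, who=resource.RUSAGE_SELF):
        # Copy the figures once; later changes in the process do not affect us.
        raw = resource.getrusage(who)
        return cls(raw.ru_utime, raw.ru_stime, raw.ru_maxrss)


snapshot = UsageSnapshot.capture()

read_only = False
try:
    snapshot.maxrss = 0  # any assignment to a frozen dataclass field fails
except FrozenInstanceError:
    read_only = True

print(snapshot.maxrss, read_only)
```

<p>As with the <code>const</code> accessors in the C++ class, callers can read every field but cannot alter the report after it has been taken.</p>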
<p>Now that we have designed the descriptors, we can proceed with the design of the manager.</p>
<h2>Resource Manager Design</h2>
<p>We begin by noting that the <code>ResourceManager</code> must have access to the <code>ResourceLimit</code> and <code>ResourceUsage</code> descriptors, so we include their headers in the source. Furthermore, we include the necessary resource management headers provided by the system and the set data structure provided by STL as we have intended.</p>
<pre><code class="cpp">#include "ResourceLimit.h"
#include "ResourceUsage.h"

#include "Singleton.h"

#include &lt;sys/resource.h&gt;
#include &lt;sys/time.h&gt;

#include &lt;set&gt;</code></pre>
<p>Next, we design the manager to provide the same operational functionality that the Linux system-calls do; however, we design it in the context of sets of resource limits.</p>
<pre><code class="cpp">class ResourceManager
{
public:
	// mutators
	bool set_resource_limit(const ResourceLimit&#038;);

	// accessors
	ResourceLimit get_resource_limit(int) const;
	ResourceUsage get_resource_usage(int) const;

	// operational functions
	void apply_limits();
private:
	std::set&lt;ResourceLimit&gt; resource_limits_;
};</code></pre>
<p>This manager enables us to defer the application of resource limits until necessary. Additionally, because we use a set data structure, the manager holds no duplicate resource limits. Now, we are only missing a single feature: the singleton.</p>
<h3>Singleton Application</h3>
<p>Although we can use the standard approach by manually embedding the operational functionality of a singleton into our <code>ResourceManager</code>, we can instead use a modern C++ idiom, the <em>curiously recurring template pattern</em> (CRTP). Specifically, the singleton will have a template parameter which will be our derived class. The singleton will then inject the singleton operational functionality into the derived class.</p>
<pre><code class="cpp">/**
 * Guarantees that only a single instance of an object will exist
 * throughout the lifetime of the program.
 */
template &lt;class Derived&gt;
class Singleton
{
public:
	Singleton(const Singleton&#038;) = delete;
	Singleton&#038; operator=(const Singleton&#038;) = delete;

	static Derived&#038; instance()
	{
		if (instance_ == nullptr)
			instance_ = new Derived();
		return *instance_;
	}
protected:
	Singleton() {}
	static Derived* instance_;
};

template &lt;class Derived&gt;
Derived* Singleton&lt;Derived&gt;::instance_ = nullptr;</code></pre>
<p>Here, notice that the singleton also uses some C++11 features where we delete the copy constructor and the copy assignment operator. Furthermore, notice that singleton operational functionality can now simply be inherited.</p>
<p>Finally, we can then inherit from our singleton,</p>
<pre><code class="cpp">class ResourceManager : public Singleton&lt;ResourceManager&gt;
{
public:
	// mutators
	bool set_resource_limit(const ResourceLimit&#038;);

	// accessors
	ResourceLimit get_resource_limit(int) const;
	ResourceUsage get_resource_usage(int) const;

	// operational functions
	void apply_limits();
private:
	std::set&lt;ResourceLimit&gt; resource_limits_;
};</code></pre>
<p>Note that it will now be even easier to add further singletons if necessary; however, this component requires no further singletons.</p>
<p>At this point, our design is complete! Do not forget that the full source code is available in my <a href="https://github.com/Hydrotoast/ResourceManager" title="ResourceManager">GitHub repository</a>.</p>
<h2>Conclusion</h2>
<p>It is easy to see that there was much duplication in the descriptor classes; however, it is a necessary evil in order to gain the further constraints that classes provide, such as immutability. Furthermore, use of the constructor enables quick initialization of objects, as we have seen in the driver.</p>
<p>Furthermore, utilizing OOP in C++ enables use of design patterns such as the singleton through CRTP. This allows us to design the resource manager as a cross-cutting concern and think of it as a logical abstraction rather than as an implementation detail.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/designing-a-linux-resource-manager-in-c.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>MinDispatch: Event-Driven Framework In Java Part 2</title>
		<link>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-2.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mindispatch-event-driven-framework-in-java-part-2</link>
		<comments>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-2.html#comments</comments>
		<pubDate>Mon, 04 Mar 2013 09:14:49 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Ingenuity]]></category>
		<category><![CDATA[chat simulation]]></category>
		<category><![CDATA[douglas schmidt]]></category>
		<category><![CDATA[event queue]]></category>
		<category><![CDATA[event-driven]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[reactor pattern]]></category>
		<category><![CDATA[separation of concerns]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1462</guid>
		<description><![CDATA[From the last post in this series, we developed a fixed, event-driven chat simulation. In this post, we will extend this example by refactoring. The objective of this tutorial is to teach effective design patterns in an event-driven model. First we will begin by designing the structure and behavior of the user and chat to [...]]]></description>
			<content:encoded><![CDATA[<p>From the last post in this series, we developed a fixed, event-driven chat simulation. In this post, we will extend this example by refactoring. The objective of this tutorial is to teach effective design patterns in an event-driven model. First we will begin by designing the structure and behavior of the user and chat to describe our application. Second, we will bind the aforementioned chat state to the event handlers to fix constant parameters. Finally, we will use an event queue for separation of concerns.</p>
<p><span id="more-1462"></span></p>
<h3>Continuation</h3>
<p>We will continue with our previous example which you can find on <a href="https://github.com/Hydrotoast/MinDispatch" title="MinDispatch">GitHub</a>. It will be easier to follow along with the source code.</p>
<p>Beware that I will no longer provide a full example of the code. It is expected that you are aware of where the highlighted code modifications occur because the code file has grown large.</p>
<h2>Structural Design</h2>
<p>We begin by specifying the structural requirements of our chat. That is, we must specify the fields associated with our two primary data structures: <code>User</code> and <code>ChatState</code>.</p>
<h3>User State Structure</h3>
<p>We begin with the atomic data structure, the <code>User</code>. A user in a chat has a name.</p>
<pre><code class="java">private static class User {
	public String name;
	public User(String name) {
		this.name = name;
	}
}</code></pre>
<p>Simple enough.</p>
<h3>Chat State Structure</h3>
<p>Next, the <code>ChatState</code> must model the configuration of a chat room: it should maintain the list of users currently in the chat.</p>
<pre><code class="java">private static class ChatState {
	private ArrayList&lt;User&gt; users;

	public ChatState() {}
}</code></pre>
<p>Now to move on to the behaviors of our design.</p>
<h2>Behavioral Design</h2>
<p>The structure of our design has been outlined. Now we must enable interactions between these structures relative to the possible events that may occur. Recall the events of a chat application:</p>
<blockquote><div class="icon quote"></div>
<dl>
<dt>User Arrival</dt>
<dd>Occurs when a user arrives to a room.</dd>
<dt>User Departure</dt>
<dd>Occurs when a user departs from a room.</dd>
<dt>User Message</dt>
<dd>Occurs when a user sends a message to a room.</dd>
</dl>
</blockquote>
<p>Both the <code>ChatState</code> and the <code>User</code> must respond to these events accordingly. The following two sections describe their behavioral implementations respectively.</p>
<h3>Chat State Behaviors</h3>
<p>Beginning with the <code>ChatState</code> this time, we must handle each of the aforementioned events. Specifically, we must support the following operations:</p>
<ul>
<li>the addition of users to the chat,</li>
<li>the removal of users from the chat,</li>
<li>and broadcasting messages to all users currently in the chat.</li>
</ul>
<p>Broadcasting must be handled in a special manner. For broadcasting, we must specify our recipients: all users should receive a copy of an event dispatched. Since we have outlined the design, consider the following interface which expresses our intent,</p>
<pre><code class="java">private static class ChatState {
	private ArrayList&lt;User&gt; users;

	public ChatState() {}

	public void broadcast(Event evt) {}

	// Mutators
	public void addUser(User user) {}
	public void removeUser(User user) {}
}</code></pre>
<p>The implementation is relatively straightforward: the <code>users</code> list maintains the current users in the chat, and <code>broadcast</code> forwards an event to each of them.</p>
<pre><code class="java">private static class ChatState {
	private ArrayList&lt;User&gt; users;

	public ChatState() {
		this.users = new ArrayList&lt;User&gt;();
	}

	public void broadcast(Event evt) {
		for (User recipient : users)
			recipient.dispatch(evt);
	}

	// Mutators
	public void addUser(User user) {
		users.add(user);
	}

	public void removeUser(User user) {
		users.remove(user);
	}
}</code></pre>
<p>The <code>ChatState</code> is responsible for broadcasting events: it forwards each event to every registered user so that each user may handle it individually. See the diagram for more information.</p>
<div id="attachment_1517" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/ChatStateBroadcasting.png"><img src="http://www.giocc.com/wp-content/uploads/ChatStateBroadcasting.png" alt="Event broadcasting" title="Event broadcasting" width="360" height="449" class="size-full wp-image-1517" /></a><p class="wp-caption-text">Event broadcasting</p></div>
<p>When we construct the event queue, the <code>ChatState</code> will act as an event forwarding mechanism for <code>UserMessage</code> events.</p>
<p>Beware that there is a dependency in the above code. The <code>broadcast</code> method dispatches events to users by calling the <code>User.dispatch</code> method, which doesn&#8217;t exist yet. So, let us continue on to the <code>User</code> behaviors.</p>
<h3>User State Behaviors</h3>
<p>We will outline the behavioral implementation of our <code>User</code> class now. Since the user is capable of receiving events, we should <em>demultiplex</em> the incoming events and handle them appropriately. Specifically, we want to know if a user received a message. Consider the implementation then:</p>
<pre><code class="java">private static class User {
	public String name;

	public User(String name) {
		this.name = name;
	}

	// Event demultiplexing
	public void dispatch(Event evt) {
		if (evt.getClass() == UserMessage.class) {
			UserMessage message = (UserMessage) evt;
			processMessage(message.user, message.message);
		}
	}

	// Event processing
	public void processMessage(User user, String userMessage) {
		// Ignore messages by me
		if (user.equals(this))
			return;
		System.out.println(
			name + " received message from " + 
			user.name
		);
	}
}</code></pre>
<p>Take a look at the <code>dispatch</code> method. The type of the event argument is compared against <code>UserMessage.class</code>. This if-ladder is an example of event demultiplexing.</p>
<div id="attachment_1499" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/ChatStateDemultiplexing.png"><img src="http://www.giocc.com/wp-content/uploads/ChatStateDemultiplexing.png" alt="Event demultiplexing" title="Event demultiplexing" width="360" height="227" class="size-full wp-image-1499" /></a><p class="wp-caption-text">Event demultiplexing</p></div>
<blockquote><div class="icon idea"></div>
<p>Event demultiplexing occurs when a stream of events is split into separate channels so that the events are processed individually rather than as a stream. This may warrant an event dispatcher within a <code>User</code>.</p></blockquote>
<p>When demultiplexing events, we route events to their respective handlers. Specifically, we route all of the <code>UserMessage</code> events to the <code>processMessage</code> handler and ignore the rest (arrival and departure are ignored). Once the events have been handled after demultiplexing, the behaviors of the data structure are complete.</p>
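<p>As the number of event types grows, the if-ladder can be replaced by a lookup table from event class to handler. The following is a minimal, self-contained sketch of that idea and not the MinDispatch implementation: the <code>Demultiplexer</code> class, its <code>route</code> method and the simplified <code>Event</code>, <code>UserMessage</code> and <code>UserArrival</code> types are hypothetical stand-ins introduced only for illustration.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, simplified stand-ins for the tutorial's event types.
interface Event {}

final class UserMessage implements Event {
    final String sender;
    final String message;

    UserMessage(String sender, String message) {
        this.sender = sender;
        this.message = message;
    }
}

final class UserArrival implements Event {
    final String user;

    UserArrival(String user) {
        this.user = user;
    }
}

// Routes each event to the handler registered for its concrete class.
final class Demultiplexer {
    private final Map<Class<? extends Event>, Consumer<Event>> routes = new HashMap<>();

    <T extends Event> void route(Class<T> type, Consumer<T> handler) {
        // The cast is safe because dispatch() looks handlers up by the exact class.
        routes.put(type, evt -> handler.accept(type.cast(evt)));
    }

    // Returns true if a handler was found; events without a route are ignored.
    boolean dispatch(Event evt) {
        Consumer<Event> handler = routes.get(evt.getClass());
        if (handler == null)
            return false;
        handler.accept(evt);
        return true;
    }
}

final class DemultiplexingSketch {
    // Collects the messages seen by a single recipient.
    static List<String> run() {
        List<String> received = new ArrayList<>();
        Demultiplexer demux = new Demultiplexer();
        demux.route(UserMessage.class, msg -> received.add(msg.sender + ": " + msg.message));

        demux.dispatch(new UserArrival("foo"));          // no route registered: ignored
        demux.dispatch(new UserMessage("foo", "hello")); // routed to the message handler
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

<p>Like the if-ladder, this sketch matches on the exact class of the event, so arrival and departure events fall through untouched unless a route is registered for them.</p>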
<h2>Binding Chat State to Event Handlers</h2>
<p>Unfortunately, now that a <code>ChatState</code> exists, we must pass the object, as a parameter, to each of the event handlers so that they may change the state of the object. Consider the event handler setup for <code>UserArrival</code>:</p>
<pre><code class="java">dispatcher.registerChannel(UserArrival.class, new Handler() {
	@Override
	public void dispatch(Event evt) {
		UserArrival arrival = (UserArrival) evt;
		arrival.state.addUser(arrival.user);

		System.out.println(
			arrival.user.name + " has entered the room."
		);
	}
});</code></pre>
<p>Passing state along with each event may cause significant code duplication as well as unnecessary runtime overhead. With the current design, each of the user events, <code>UserArrival</code>, <code>UserDeparture</code> and <code>UserMessage</code>, must carry a reference to the <code>ChatState</code> that its handler operates on.</p>
<p>There exists a solution which removes the code duplication and the runtime overhead. We can push the responsibility of maintaining state to the event handlers by binding the <code>ChatState</code> to a custom event handler. We know that this is feasible because <code>ChatState</code> is the same throughout the execution of this simulation.</p>
<h3>Application-specific Chat Handlers</h3>
<p>We will implement our own event handlers, <code>ChatHandler</code>, specifically for handling chat-specific events on a <code>ChatState</code>. Simply, this custom handler should fix the parameter common to all of our handlers, <code>ChatState</code>.</p>
<pre><code class="java">private static class ChatHandler extends Handler {
	protected ChatState state;
	public ChatHandler(ChatState state) {
		this.state = state;
	}
}</code></pre>
<p>Afterwards, we may access the state of the chat for each subsequent <code>ChatHandler</code>. So, the handler registration will be slightly different with an inherited handler.</p>
<pre><code class="java">public static void registerHandlers(EventDispatcher dispatcher, ChatState state) {
	dispatcher.registerChannel(UserArrival.class, new ChatHandler(state) {
		@Override
		public void dispatch(Event evt) {
			UserArrival arrival = (UserArrival) evt;
			state.addUser(arrival.user);

			System.out.println(
				arrival.user.name + &quot; has entered the room.&quot;
			);
		}
	});

	dispatcher.registerChannel(UserDeparture.class, new ChatHandler(state) {
		@Override
		public void dispatch(Event evt) {
			UserDeparture departure = (UserDeparture) evt;
			state.removeUser(departure.user);

			System.out.println(
				departure.user.name + &quot; has left the room.&quot;
			);
		}
	});

	dispatcher.registerChannel(UserMessage.class, new ChatHandler(state) {
		@Override
		public void dispatch(Event evt) {
			UserMessage message = (UserMessage) evt;
			String userMessage = 
				String.format(
					&quot;%s: %s&quot;, 
					message.user.name,
					message.message
				);
			System.out.println(userMessage);

			// Broadcast messages
			state.broadcast(message);
		}
	});
}</code></pre>
<blockquote><div class="icon idea"></div>
<p> In our new implementation, we will use a <code>registerHandlers</code> helper function to initialize our event handlers with a specified event dispatcher and <code>ChatState</code>.</p></blockquote>
<p>The state of the chat is updated in the above event handlers using the behavioral design that we have previously specified. Hence, we have effectively decoupled the state of the chat from the event dispatching.</p>
<h2>Using an Event Queue</h2>
<p>Next, we will utilize an event queue to <em>separate concerns</em>.</p>
<blockquote><div class="icon quote"></div>
<p>In computer science, <em>separation of concerns</em> (SoC) is the process of breaking a computer program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program.</p></blockquote>
<p>The event queue will enable us to separate the event dispatcher from the application-specific users and the chat state. That is, users should be unaware of the existence of a dispatcher especially when generating events themselves. </p>
<blockquote><div class="icon idea"></div>
<p>Additionally, when we introduce concurrency into this application (hint), the queue will also serve as a shared buffer between distributed users and the server which decouples application-independent concurrency mechanisms from our application-specific method functionality.</p></blockquote>
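<p>To make the hint about concurrency slightly more concrete, here is a minimal sketch of a queue used as a shared buffer between threads. It assumes <code>java.util.concurrent.LinkedBlockingQueue</code> as the thread-safe queue and represents events as plain strings; the <code>SharedQueueSketch</code> class and its <code>producer</code> helper are hypothetical and are not part of MinDispatch.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SharedQueueSketch {
    // Starts a thread that enqueues three messages on behalf of one user.
    static Thread producer(BlockingQueue<String> eventQueue, String user) {
        Thread thread = new Thread(() -> {
            for (int i = 0; i < 3; i++)
                eventQueue.add(user + "#" + i);
        });
        thread.start();
        return thread;
    }

    // Lets two users enqueue concurrently, then drains the shared buffer.
    static List<String> run() {
        BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
        Thread foo = producer(eventQueue, "foo");
        Thread bar = producer(eventQueue, "bar");
        try {
            foo.join();
            bar.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        List<String> drained = new ArrayList<>();
        eventQueue.drainTo(drained);
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

<p>The interleaving between the two users varies from run to run, but each user&#8217;s own messages stay in the order in which they were sent.</p>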
<p>Conceptually, the event queue acts as a multiplexed channel which interleaves events from individual users since there is no particular order in which users may send messages. The event queue&#8217;s only concern is event multiplexing.</p>
<div id="attachment_1498" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/ChatStateMultiplexing.png"><img src="http://www.giocc.com/wp-content/uploads/ChatStateMultiplexing.png" alt="Event multiplexing" title="Event multiplexing" width="360" height="449" class="size-full wp-image-1498" /></a><p class="wp-caption-text">Event multiplexing</p></div>
<p>For now, we will use a plain <code>Queue&lt;Event&gt;</code> from the Java standard library to express our intent. To instantiate it, we use a <code>java.util.LinkedList</code>.</p>
<pre><code class="java">import java.util.LinkedList;
import java.util.Queue;

// ChatState declaration here

public static void main(String[] args) {
	EventDispatcher dispatcher = new EventDispatcher();
	ChatState state = new ChatState();
	Queue&lt;Event&gt; eventQueue = new LinkedList&lt;Event&gt;();

	// Further simulation code such as event handler registration
}</code></pre>
<h3>Integration with Dispatcher</h3>
<p>Since the <code>EventDispatcher</code> is responsible for dispatching events, we should dispatch all of the events in the queue when flushing the buffer.</p>
<pre><code class="java">import java.util.LinkedList;
import java.util.Queue;

// ChatState declaration here

public static void main(String[] args) {
	EventDispatcher dispatcher = new EventDispatcher();
	ChatState state = new ChatState();
	Queue&lt;Event&gt; eventQueue = new LinkedList&lt;Event&gt;();

	// Further simulation code such as event handler registration
	// Possibly generate events beforehand

	// Dispatch all queued events
	while (!eventQueue.isEmpty()) {
		Event evt = eventQueue.remove();
		dispatcher.dispatch(evt);
	}
}</code></pre>
<p>Furthermore, notice that the event queue does not interact with the <code>ChatState</code>. This is a highlight of separated concerns because event queues are application-independent.</p>
<h3>Integration with Users</h3>
<p>Before dispatching events with users, we must connect users to the event queue. Simply, we enable each individual user to reference the event queue in the implementation.</p>
<pre><code class="java">private static class User {
	public Queue&lt;Event&gt; eventQueue;
	public String name;

	public User(Queue&lt;Event&gt; eventQueue, String name) {
		this.eventQueue = eventQueue;
		this.name = name;
	}

	// Behavioral methods
}</code></pre>
<p>Once users have a reference to the event queue, they are able to generate events. Specifically, we want to enable users to send messages to the chat, thereby sending a message to all other users currently in the chat.</p>
<pre><code class="java">private static class User {
	public Queue&lt;Event&gt; eventQueue;
	public String name;

	public User(Queue&lt;Event&gt; eventQueue, String name) {
		this.eventQueue = eventQueue;
		this.name = name;
	}

	// Event demultiplexing and handling methods

	// Event generation
	public void sendMessage(String message) {
		eventQueue.add(new UserMessage(this, message));
	}
}</code></pre>
<p>Thus, users are now capable of sending messages without being aware of the event dispatcher and the chat state. Effectively, this is a highlight of <em>modularity</em>, where modifications to a user&#8217;s capabilities in the system are independent of modifications to the chat state and event dispatcher.</p>
<h2>Testing the Simulation</h2>
<p>At this point, the complete source should be similar to my source code on <a href="https://github.com/Hydrotoast/MinDispatch/blob/master/edu/giocc/EventMachine/StatefulChatEventMachine.java">GitHub</a>. Now, we can test the simulation using the following <code>main</code> and hardcoded events:</p>
<pre><code class="java">public static void main(String[] args) {
	EventDispatcher dispatcher = new EventDispatcher();
	ChatState state = new ChatState();
	Queue&lt;Event&gt; eventQueue = new LinkedList&lt;Event&gt;();

	registerHandlers(dispatcher, state);

	// Initialize users
	User foo = new User(eventQueue, &quot;foo&quot;);
	User bar = new User(eventQueue, &quot;bar&quot;);
	dispatcher.dispatch(new UserArrival(foo));
	dispatcher.dispatch(new UserArrival(bar));

	// Enqueue events from individual users
	foo.sendMessage(&quot;hello, bar!&quot;);
	bar.sendMessage(&quot;hello, foo!&quot;);
	foo.sendMessage(&quot;goodbye, bar!&quot;);

	// Dispatch all queued events
	while (!eventQueue.isEmpty()) {
		Event evt = eventQueue.remove();
		dispatcher.dispatch(evt);
	}

	// Finish up simulation
	dispatcher.dispatch(new UserDeparture(foo));
	dispatcher.dispatch(new UserDeparture(bar));
}</code></pre>
<p>The following output should be produced:</p>
<pre><code class="no-highlight">foo has entered the room.
bar has entered the room.
foo: hello, bar!
bar received message from foo
bar: hello, foo!
foo received message from bar
foo: goodbye, bar!
bar received message from foo
foo has left the room.
bar has left the room.</code></pre>
<p>Thus, our chat simulation is complete.</p>
<h2>Conclusion</h2>
<p>Effective application design coupled with an event queue makes modification of the code far easier because we have a separation of concerns and modularity. That is, modifications to our application-specific handlers or data structures are independent of modifications to the application-independent, event-driven <a href="https://github.com/Hydrotoast/MinDispatch/tree/master/edu/giocc/MinDispatch" title="MinDispatch">MinDispatch framework on GitHub</a>.</p>
<p>It is easy to see that using the MinDispatch framework significantly simplifies the design for an event-driven application by handling the application-independent work.</p>
<h3>Further Reading</h3>
<p>I recommend reading Douglas Schmidt&#8217;s collection of papers on event handling and concurrency. Specifically, the <a href="http://www.cs.wustl.edu/~schmidt/PDF/reactor-siemens.pdf">Reactor Pattern</a> has significantly influenced the design of my framework.</p>
<h3>Another Continuation</h3>
<p>There are two paths we can take from here:</p>
<ol>
<li>Enabling concurrency and distributed computing</li>
<li>Enabling dynamic user input and developing a chat AI</li>
</ol>
<p>Decide. Comment your preference below.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-2.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Partitioning Discussion Sections for Lecture-Hall Sized Classes</title>
		<link>http://www.giocc.com/partioning-discussion-sections-for-lecture-hall-sized-classes.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=partioning-discussion-sections-for-lecture-hall-sized-classes</link>
		<comments>http://www.giocc.com/partioning-discussion-sections-for-lecture-hall-sized-classes.html#comments</comments>
		<pubDate>Thu, 21 Feb 2013 05:00:30 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Ingenuity]]></category>
		<category><![CDATA[dynamic programming]]></category>
		<category><![CDATA[np complete]]></category>
		<category><![CDATA[partioning problem]]></category>
		<category><![CDATA[power set]]></category>
		<category><![CDATA[university]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1332</guid>
		<description><![CDATA[Eric Hennigan had recently pitched a new partitioning problem to ACM: partitioning his discussion sections among two TAs such that students are equally distributed to each TA. Although the problem may be trivial to do by hand, it&#8217;s easy to decompose into discrete mathematics, and therefore, easy to analyze. Before reading further, beware: this post [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cogitolingua.net/blog/">Eric Hennigan</a> had recently pitched a new partitioning problem to ACM: partitioning his discussion sections among two TAs such that students are equally distributed to each TA. Although the problem may be trivial to do by hand, it&#8217;s easy to decompose into discrete mathematics, and therefore, easy to analyze.</p>
<p><span id="more-1332"></span></p>
<p>Before reading further, beware: this post is highly dependent on knowledge of discrete mathematics (most importantly, sets). Although the concept is simple, please refer to an authoritative encyclopedia such as <em>Wolfram Mathworld</em> regarding unfamiliar ideas. I will highlight such esoteric ideas with <em>emphasis</em>.</p>
<h2>Discussion Section Partitioning Problem</h2>
<p>Since we can rewrite this problem in terms of discrete mathematics, we will begin by decomposing the problem into sets, the fundamental structure for homogeneous data.</p>
<p>You are given a set, \(S\), of \(n\) discussion sections where each element, \(x_i\), represents the number of students in section number \(i\). Create two partitions \(A\) and \(B\) such that the difference between the sums of their elements is minimized.</p>
<p>As an example, let \(S\) be</p>
<p>\[\{ 3, 4, 5, 6, 8 \}\]</p>
<p>which can be a set of five discussion sections with three, four, five, six and eight students respectively. By staring at this list for a while, it becomes apparent that we should split the set into</p>
<p>\[A = \{5, 8\}, B = \{3, 4, 6\}\]</p>
<p>such that their sums are equal. In regards to the problem domain, TA \(A\) receives the discussion sections with five and eight students while TA \(B\) receives the <em>complementary</em> sections. Consequently, the difference between the sums of their elements is minimized.</p>
<h3>Decomposition into Optimization</h3>
<p>To define this as an optimization problem, we would like to select a subset of <strong>S</strong> in the <em>power set</em> of \(S\) that minimizes the difference between the sum of the subset&#8217;s elements and the sum of its <em>complement</em>&#8217;s elements.</p>
<p>\[<br />
\min_{X \in \mathcal{P}(S)}\left\{\left|\sum_{x \in X} x - \sum_{x \in S - X} x\right|\right\}<br />
\]</p>
<h3>Brute Force Solution</h3>
<p>Intuitively, we can simply iterate over all subsets of \(S\) and try all possibilities in</p>
<p>\[<br />
O(2^n)<br />
\]</p>
<p>where \(n\) is the number of elements of the set.</p>
<blockquote><div class="icon idea"></div>
<p>This bound follows because a set of \(n\) elements has <strong>2</strong><sup>n</sup> subsets, including the empty set (see the <a href="http://mathworld.wolfram.com/PowerSet.html">cardinality of the power set</a>). Of course, this also means that the algorithm runs in exponential time, which is ridiculously slow for large sets.</p></blockquote>
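<p>The brute-force idea can be sketched as follows. This is a minimal illustration under my own assumptions: the <code>BruteForcePartition</code> class and its <code>minDifference</code> method are hypothetical names, subsets are encoded as bitmasks, and the set must have fewer than 31 elements. Note that summing each subset adds a factor of \(n\) on top of the \(2^n\) subsets visited.</p>

```java
final class BruteForcePartition {
    // Returns the smallest |sum(X) - sum(S - X)| over all subsets X of sizes.
    static int minDifference(int[] sizes) {
        int n = sizes.length; // assumes n < 31 so that 1 << n fits in an int
        int total = 0;
        for (int size : sizes)
            total += size;

        int best = Integer.MAX_VALUE;
        // Each bitmask selects one subset: bit i set means section i is in X.
        for (int mask = 0; mask < (1 << n); mask++) {
            int subsetSum = 0;
            for (int i = 0; i < n; i++)
                if ((mask & (1 << i)) != 0)
                    subsetSum += sizes[i];
            // sum(X) - sum(S - X) = 2 * sum(X) - total
            best = Math.min(best, Math.abs(2 * subsetSum - total));
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(minDifference(new int[] {3, 4, 5, 6, 8}));
    }
}
```

<p>For the example set above, the minimum difference is zero because the subset with five and eight students sums to exactly half of the total.</p>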
<p>We can attempt to do better.</p>
<h2>Dynamic Programming Solution</h2>
<p>First, we will take a look at how we can break up this problem into subproblems. Then, we will see how these subproblems are related and exploit the relationship to develop an efficient solution. Finally, I will provide an example (in case you&#8217;ve become lost in the theory).</p>
<blockquote><div class="icon idea"></div>
<p>I have written about Dynamic Programming before:<br />
See my dynamic programming solution for the <a href="http://www.giocc.com/a-dynamic-programming-solution-to-the-josephus-problem.html">Josephus Problem</a> if you haven&#8217;t already.</p></blockquote>
<h3>Subproblem Decomposition</h3>
<p>We can begin by reformulating this problem into another familiar problem: <em>the knapsack problem</em> which seeks to find the subset of a set of items with the maximum value given a bag of finite size. To create a mapping from our problem to knapsack, we must think of this backwards.</p>
<p>We must let the sum of the elements of either partition be equivalent to the bag of finite size. Next, we must determine the maximum bag capacity (or the maximum sum) in our case. Recall that the difference reaches its absolute minimum, zero, when the sums of the elements in the two partitions are equal. That is, when a partition&#8217;s sum of elements is equal to exactly half the sum of the elements of the original set:</p>
<p>\[<br />
\frac{1}{2} \sum_{x \in S} x<br />
\]</p>
<p>Subsequently, at each bag size, we can determine the set of elements that are closest to the desired sum.</p>
<p>Of course, with every dynamic programming solution, we must define boundary cases. Thus, we let \(V[i]\) be the solution for a sum of \(i\). The maximum sum, \(i\), should be within the range:</p>
<p>\[<br />
\min_{x \in S}{x} \leq i \leq \frac{1}{2} \sum_{x \in S} x<br />
\]</p>
<p>It is trivial then that the maximum sum for the lower bound is equal to itself:</p>
<p>\[<br />
V[\min_{x \in S}{x}] = \min_{x \in S}{x}<br />
\]</p>
<p>From here, we define our <em>recurrence relationship</em> and realize that the maximum sum for a current solution, \(i\) is an element from the original set, \(x\) (that has not yet been selected), summed with a previous solution \(V[i-x]\):</p>
<p>\[<br />
V[i] = \max_{x \in S}\{x + V[i - x] : x \not\in V[i-x] \}<br />
\]</p>
<p>Thus, the best fit sum of a partition lies in \(V[max\_sum]\) where \(max\_sum\) is half the sum of the elements of the original set.</p>
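<p>For readers who prefer code, here is a minimal sketch of the same idea. Be aware that it swaps in the standard boolean subset-sum table in place of the set-valued recurrence above: <code>reachable[s]</code> records whether some subset sums to \(s\), and <code>lastIndex[s]</code> remembers which section completed that sum so that one partition can be reconstructed. The <code>SubsetSumPartition</code> class and its <code>closestHalf</code> method are hypothetical names, and section sizes are assumed to be positive integers.</p>

```java
import java.util.ArrayList;
import java.util.List;

final class SubsetSumPartition {
    // Returns one partition whose sum is as close as possible to half of the
    // total without exceeding it. Assumes every size is a positive integer.
    static List<Integer> closestHalf(int[] sizes) {
        int total = 0;
        for (int size : sizes)
            total += size;
        int half = total / 2;

        boolean[] reachable = new boolean[half + 1];
        int[] lastIndex = new int[half + 1];
        reachable[0] = true; // the empty subset sums to zero

        for (int i = 0; i < sizes.length; i++) {
            // Descend so that each section is used at most once.
            for (int s = half; s >= sizes[i]; s--) {
                if (!reachable[s] && reachable[s - sizes[i]]) {
                    reachable[s] = true;
                    lastIndex[s] = i;
                }
            }
        }

        // The largest reachable sum not exceeding half is the best fit.
        int best = half;
        while (!reachable[best])
            best--;

        // Walk back through the recorded sections to rebuild the partition.
        List<Integer> partition = new ArrayList<>();
        for (int s = best; s > 0; s -= sizes[lastIndex[s]])
            partition.add(sizes[lastIndex[s]]);
        return partition;
    }

    public static void main(String[] args) {
        System.out.println(closestHalf(new int[] {3, 4, 5, 6, 8}));
    }
}
```

<p>The table has one entry per sum up to the half-sum and is updated once per section, so the work is proportional to the half-sum multiplied by \(n\).</p>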
<h3>Example Application</h3>
<p>In case I have lost you in the theory, here is the promised example using the previously used set.</p>
<p>\[<br />
\{ 3, 4, 5, 6, 8 \}<br />
\]</p>
<p>\(max\_sum\) should be</p>
<p>\[<br />
\frac{1}{2} \sum_{x \in S} x = \frac{1}{2} (3 + 4 + 5 + 6 + 8) = 13<br />
\]</p>
<p>We apply dynamic programming between the range of three and thirteen. Starting from the bottom, we will begin applying our recurrence relationship:</p>
<p>\[<br />
\begin{aligned}<br />
V[3] &#038;= \{3\} \\<br />
V[4] &#038;= \{4\} \\<br />
V[5] &#038;= \{5\} \\<br />
V[6] &#038;= \{6\}<br />
\end{aligned}<br />
\]</p>
<p>There are no elements better than themselves up to six; subsequently, their solutions are equivalent to the set of themselves.</p>
<p>\[<br />
V[7] = \{3\} \cup V[4] = \{3, 4\}<br />
\]</p>
<p>The next subproblem has a solution which fits with no duplicates.</p>
<p>\[<br />
\begin{aligned}<br />
V[8] &#038;= \{8\} \\<br />
V[9] &#038;= \{3\} \cup V[6] = \{3, 6\} \\<br />
V[10] &#038;= \{4\} \cup V[6] = \{4, 6\} \\<br />
V[11] &#038;= \{3\} \cup V[8] = \{3, 8\}<br />
\end{aligned}<br />
\]</p>
<p>The above subproblems continue normally until \(V[12]\) is reached.</p>
<p>\[<br />
V[12] = \{4\} \cup V[8] = \{4, 8\}<br />
\]</p>
<p>Although \(\{3\} \cup V[9]\) comes before \(\{4\} \cup V[8]\) <em>lexicographically</em>, it does not work because \(3\) is already in \(V[9] = \{3, 6\}\) and would be a duplicate.</p>
<p>\[<br />
V[13] = \{3\} \cup V[10] = \{3, 4, 6\}<br />
\]</p>
<p>Finally, we acquire one of two partitions for the solution; the other can be inferred as the complement of this. That is, the other can be calculated by</p>
<p>\[<br />
B = S - V[13]<br />
\]</p>
<p>Consequently, we have the partitions \(V[13]\) and \(B\) which are the solutions to our problem. That is, if we refer back to the context, TA \(A\) should be assigned discussion sections with three, four and six students and TA \(B\) should be assigned discussion sections with five and eight students. Because both TAs have thirteen students each, their partitions have the property of minimum difference.</p>
<h3>Analysis of Dynamic Programming Solution</h3>
<p>Recall that the number of iterations is equivalent to half of the sum, </p>
<p>\[<br />
O\left(\frac{1}{2}\sum_{x \in S} x\right)<br />
\]</p>
<p>Each iteration is a subproblem that finds an element in the set which best fits a previous subproblem solution. The time-complexity of each subproblem is thus</p>
<p>\[<br />
O(n)<br />
\]</p>
<p>which can be implemented with set operations. Recall that the solution must have distinct objects; therefore, each element must be checked for membership in the solution, i.e. <code>find()</code>, before it can be added to the solution set, i.e. <code>union()</code>.</p>
<p>Hash tables can be used for constant-time operations; however, they can also be memory-intensive. A better solution could have amortized constant-time operations with linear space.</p>
<blockquote><div class="icon idea"></div>
<p>In a basic Union-Find data structure, at least one of the operations <code>find()</code> and <code>union()</code> has a logarithmic running time, while the other can be constant-time. However, Disjoint Set Forests that use <em>path compression</em> (combined with union by rank) bring the amortized cost of <code>find()</code> down to nearly constant time.</p></blockquote>
<p>By multiplying the number of iterations with the cost of each iteration, we achieve the final running time:</p>
<p>\[<br />
O\left(\frac{1}{2}(\sum_{x \in S} x)n\right)<br />
\]</p>
<p>This completes the analysis.</p>
<h2>Conclusions</h2>
<p>Although a dynamic programming solution to this problem exists, it is not always the faster choice: its running time grows with the half-sum, which can be arbitrarily large. Consider the canonical example above. With a half-sum of 13 and \(n = 5\), the dynamic programming solution takes \(13 \times 5 = 65\) time units whereas the power-set solution requires only \(2^5 = 32\) time units.</p>
<p>Making the decision of selecting either the dynamic programming solution or the power-set solution boils down to solving</p>
<p>\[<br />
\min\left(\frac{1}{2}(\sum_{x \in S} x)n, 2^n\right)<br />
\]</p>
<p>Overall, this is a great problem to practice algorithm design and analysis because it can be easily decomposed into discrete mathematics. There may be better solutions to this problem, and if you find one, please inform me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/partioning-discussion-sections-for-lecture-hall-sized-classes.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Projects Matching Problem of ICS Clubs and Small Organizations</title>
		<link>http://www.giocc.com/projects-matching-problem-of-ics-clubs-and-small-organizations.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=projects-matching-problem-of-ics-clubs-and-small-organizations</link>
		<comments>http://www.giocc.com/projects-matching-problem-of-ics-clubs-and-small-organizations.html#comments</comments>
		<pubDate>Tue, 09 Oct 2012 09:20:39 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Innovation]]></category>
		<category><![CDATA[acm]]></category>
		<category><![CDATA[ics]]></category>
		<category><![CDATA[organization]]></category>
		<category><![CDATA[problem]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[strategy]]></category>
		<category><![CDATA[uci]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1247</guid>
		<description><![CDATA[On the domain of all people seeking to get involved with projects and all people seeking talent for projects, there exists what I prefer to describe as the Projects Problem. This problem can be seen as a variation of the Stable Marriage Problem or Assignment Problem. The Projects Problem is specific to effectively matchmaking among [...]]]></description>
			<content:encoded><![CDATA[<p>On the domain of all people seeking to get involved with projects and all people seeking talent for projects, there exists what I prefer to describe as the <strong>Projects Problem</strong>. This problem can be seen as a variation of the Stable Marriage Problem or Assignment Problem. The Projects Problem is specific to effectively matchmaking among idea people and feasible developers for resource-limited organizations such as small school organizations.</p>
<p><span id="more-1247"></span></p>
<h2>Preliminary Definitions</h2>
<p>In order to understand the projects problem, we must first describe the context in which it occurs. The people involved in this problem are of two distinct sets:</p>
<dl>
<dt>Idea People</dt>
<dd>People that propose an idea or have an explicit problem/project that they require help with</dd>
<dt>Developers</dt>
<dd>People that seek involvement in a project</dd>
</dl>
<p>A <strong>match</strong> is thus a pair: an idea person that has agreed to work with a developer. The set of matches is thus all feasible combinations of idea people and developers.</p>
<p>The problem of simply matching people may seem trivial. Consider a naive solution: if an idea person exists and a developer is free, then match them. Although this is intuitive (Eager Matching, see <em>Matching Strategies</em>), this does not guarantee <strong>satisfactory results</strong>.</p>
<p>We say that results are satisfactory if we maximize both the idea person&#8217;s and the developer&#8217;s satisfaction with the matching. Thus, we produce an optimization problem:</p>
<p>\[<br />
\max{\sum_{i=1}^{\text{matches}} \text{satisfaction}(i)}<br />
\]</p>
<p>We must then define criteria for satisfaction. In the context of ICS projects, we say that an idea person is satisfied with the matching if their developer has the skills necessary to solve their problem; additionally, we say that a developer is satisfied with the matching if they have been sufficiently compensated for their work (whether intellectually or monetarily).</p>
<h2>The Projects Problem</h2>
<p>The problem can now be easily defined: given a set of idea people and a set of developers, match people such that both parties are satisfied with the matching. As the Projects Coordinator for the Association for Computing Machinery (ACM) at UC Irvine, it is my duty to maximize the satisfactory matchmaking among idea people and developers. However, this problem is difficult to solve. Here is why:</p>
<p>First, idea people are also resource-constrained and seek to maximize developer talent while minimizing the cost. Second, developers seek to maximize their compensation while minimizing work. These two sets of people are playing a zero-sum game against one another (this can be modeled through a Minimax algorithm on a game tree). Certainly, one could argue that if we reduce this problem&#8217;s constraints, it becomes a simple maximization where both parties mutually benefit; however, in reality, there is always a cost whether it is monetary, time or other commitment.</p>
<h3>An Example with Integers</h3>
<p>To illustrate this problem, consider the following integer example:</p>
<p>Consider idea person <em>A</em> seeking a developer with a minimum skill requirement of <em>5</em>, i.e. <em>A</em> will not accept developers whose skill is below <em>5</em>. <em>A</em> is willing to offer a compensation of up to <em>2</em>.</p>
<p>Then consider a pool of candidate developers <em>(X, 2, 3)</em>, <em>(Y, 1, 1)</em>, <em>(Z, 6, 8)</em>. Each tuple represents <em>(Candidate, Skill, Compensation)</em>. Thus, <em>X</em> has a skill of <em>2</em> and will only work for a compensation of at least <em>3</em>.</p>
<p><em>X</em> and <em>Y</em> are not feasible candidates because their skills (<em>2</em> and <em>1</em>) fall below <em>A</em>&#8217;s requirement of <em>5</em>. <em>Z</em> is skilled enough, but asks for a compensation of <em>8</em> while <em>A</em> offers at most <em>2</em>. <em>A</em> may attempt to coerce <em>Z</em>; however, it is unlikely that <em>Z</em> will work for such low compensation.</p>
<p>Therefore, there are no feasible matches in this scenario. Because no matches are made, productivity is stagnant (and non-existent) while both sets continue to grow.</p>
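<p>The feasibility check in this example can be sketched in a few lines of Python. The names <code>IdeaPerson</code>, <code>Developer</code> and <code>feasible_matches</code> are hypothetical and exist only for illustration; the sketch assumes the rule described above, namely that a developer is feasible only if their skill meets the minimum requirement and the offer covers their asking compensation.</p>
<pre><code class="python">from collections import namedtuple

# Hypothetical records mirroring the tuples in the integer example.
IdeaPerson = namedtuple("IdeaPerson", ["name", "min_skill", "max_compensation"])
Developer = namedtuple("Developer", ["name", "skill", "compensation"])


def feasible_matches(idea_person, developers):
    """Return the developers who satisfy both sides of the match."""
    return [
        dev for dev in developers
        if dev.skill >= idea_person.min_skill
        and idea_person.max_compensation >= dev.compensation
    ]


a = IdeaPerson("A", min_skill=5, max_compensation=2)
pool = [Developer("X", 2, 3), Developer("Y", 1, 1), Developer("Z", 6, 8)]

print(feasible_matches(a, pool))  # prints [] because nobody satisfies both constraints
</code></pre>
<p>Under the same assumptions, raising the offer to <em>8</em> would make <em>Z</em> the only feasible match.</p>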
<h2>Feasible Solutions: Matching Strategies</h2>
<p>There are a few standard strategies that are frequently employed among small organizations. Each of these strategies may have variations; however, these are the standard approaches that I have observed as feasible solutions to this problem depending on the objective of the organization.</p>
<h3>Eager Matching (Naive)</h3>
<p>Matches are attempted among all feasible combinations of idea people and developers as soon as they are available. Satisfaction is not considered (and therefore never guaranteed).</p>
<h4>Benefit</h4>
<p> Rapidly reduces the pool of idea people and developers in wait.</p>
<h4>Cost</h4>
<p> Low satisfaction rate.</p>
<h3>Lazy Matching</h3>
<p>Matches are only made once satisfaction is guaranteed by both parties. Beware that this approach may sometimes appear elitist if only absolutely satisfactory matches are made.</p>
<h4>Benefit</h4>
<p> High satisfaction rate.</p>
<h4>Cost</h4>
<p> Large pool of idea people and developers in wait.</p>
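<p>To make the contrast between eager and lazy matching concrete, here is a minimal sketch. It assumes the skill and compensation model from the integer example; the function names and the sample data are hypothetical, and pairing people in arrival order is only one simple way to model eager matching.</p>
<pre><code class="python">def is_satisfactory(idea_person, developer):
    """Both sides are satisfied: skill is sufficient and the offer covers the asking price."""
    _, min_skill, offer = idea_person
    _, skill, asking = developer
    return skill >= min_skill and offer >= asking


def eager_match(idea_people, developers):
    """Pair people off in arrival order without checking satisfaction."""
    return list(zip(idea_people, developers))


def lazy_match(idea_people, developers):
    """Pair an idea person only with a free developer who satisfies both sides."""
    free = list(developers)
    matches = []
    for person in idea_people:
        for dev in free:
            if is_satisfactory(person, dev):
                matches.append((person, dev))
                free.remove(dev)  # safe: we stop iterating right after removing
                break
    return matches


# Hypothetical data: (name, minimum skill, offer) and (name, skill, asking compensation).
idea_people = [("A", 5, 2), ("B", 1, 3)]
developers = [("X", 2, 3), ("Y", 1, 1), ("Z", 6, 8)]

eager = eager_match(idea_people, developers)
lazy = lazy_match(idea_people, developers)
print(len(eager), len(lazy))
</code></pre>
<p>With this data, eager matching produces two pairs, one of which is unsatisfactory, while lazy matching produces a single satisfactory pair and leaves everyone else waiting. This is the trade-off described in the benefits and costs above.</p>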
<h3>Role-Based Matching</h3>
<p>Developers (or other types of team members) are distributed into subcategories (much like sorting them into pigeonholes). Idea people then request an individual of a specified subcategory. Team members are assigned accordingly by a third party (the former VGDC process).</p>
<h4>Benefit</h4>
<p> Structured matching process.</p>
<h4>Cost</h4>
<p> Requires that members understand the process as well as the requirements that each category or title entails. If the structure maintains no requirements for each category, then the strategy reduces to eager matching (with unnecessary structure).</p>
<h3>Mentorship Strategy</h3>
<p>Idea people have a specific problem with an inheritance property: new developers of arbitrary skill level may improve over time and later inherit the duties of the idea people.</p>
<h4>Benefit</h4>
<p> Long-term investment in idea people and developers.</p>
<h4>Cost</h4>
<p> Specific to projects that can afford long-term investments; additionally, there is a risk that new developers are lost after the long-term investment has been made.</p>
<h2>Discussion of ACM&#8217;s Lazy Strategy</h2>
<p>ACM&#8217;s current strategy is lazy matching. Although incompatible matches frequently disappear (due to wait), there is significant involvement as well as progress for the few projects established through ACM. </p>
<h4>Benefit</h4>
<p> Consider the <a href="http://acmuci.com">ACM website</a> which, under a five-day time constraint, was well-made. Additionally, consider the ICPC practice sessions which, with only a quarter of practice, enabled a first-year student to achieve first place in a local competition against undergraduates of all levels, including third- and fourth-years.</p>
<h4>Cost</h4>
<p> Now, we consider the cost of such achievements through lazy matches. First, ACM has a very small board as well as a small general member base. Second, ACM must shoulder its own financial weight, for we cannot appeal with the breadth of talent that other clubs may have. As a consequence, we sacrifice aggressive marketing tactics and corporate event opportunities.</p>
<h3>General Strategy Discussion</h3>
<p>Although I may continue to discuss the strategic positions of other organizations such as VGDC and ICSSC, they are sensitive topics of debate that I will not discuss without approval. </p>
<p>Instead, I will generalize that with every strategy, advantages are weighed against sacrifices (once optimal utility has been achieved). Thus, ACM sacrifices a large audience in favor of a handful of capable individuals.</p>
<p>We intend for ACM&#8217;s culture to be established by its few members, for each member has a duty to algorithms, theory and intellectual discussion.</p>
<h2>Further Discussion</h2>
<p>Certainly, there was much left out such as other criteria for satisfaction, the concept of <em>wait</em>, skill requirements, effective Pigeonholing, strategic implications and the simple concept of time.</p>
<p>If you&#8217;re interested in such a topic, find me in the Irvine wild and perhaps we may have a chat.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/projects-matching-problem-of-ics-clubs-and-small-organizations.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Historical Problems with Closures in JavaScript and Python</title>
		<link>http://www.giocc.com/problems-with-closures-in-javascript-and-python.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=problems-with-closures-in-javascript-and-python</link>
		<comments>http://www.giocc.com/problems-with-closures-in-javascript-and-python.html#comments</comments>
		<pubDate>Wed, 08 Aug 2012 15:04:21 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Inspiration]]></category>
		<category><![CDATA[closure]]></category>
		<category><![CDATA[dependency-inversion-principle]]></category>
		<category><![CDATA[functional programming]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[lexical-closure]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1194</guid>
		<description><![CDATA[Closures are necessary features for supporting the functional programming paradigm. A closure is an inner function that has access to the variables defined in the environment of its outer function. They can be found in almost all modern dynamic programming languages, including JavaScript, Python and Ruby; however, Python and JavaScript have both made their own mistakes [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Closures</strong> are necessary features for supporting the functional programming paradigm. A closure is an inner function that has access to the variables defined in the environment of its outer function. They can be found in almost all modern dynamic programming languages, including JavaScript, Python and Ruby; however, Python and JavaScript have both made their own mistakes in the initial implementation of the feature.</p>
<p><span id="more-1194"></span></p>
<h2>Scope of Closures</h2>
<p>Closures are composed of two environments: the inner and the outer. The inner environment is where new, local variables are defined. The outer environment holds the variables that existed when the closure itself was created.</p>
<div id="attachment_1228" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/ClosureScopes.png"><img src="http://www.giocc.com/wp-content/uploads/ClosureScopes.png" alt="Closure Scopes" title="Closure Scopes" width="360" height="259" class="size-full wp-image-1228" /></a><p class="wp-caption-text">Closure Scopes</p></div>
<p>Closures have access to both environments and thus have access to variables from both the inner and outer scopes. In pseudocode, it may have the following structure.</p>
<pre><code class="no-highlight">procedure outer():
    procedure inner():
        # Do stuff
    return inner</code></pre>
<p>The problems identified here are specific to the scope of closures.</p>
<h2>This JavaScript Problem</h2>
<p>I first encountered this problem while sifting through Douglas Crockford&#8217;s book, <em>JavaScript: The Good Parts</em>, in the section regarding function invocation. He explains that the <code>this</code> keyword of a function is automatically bound to the global object, the <code>window</code> object in browsers, when the function is invoked as a plain function rather than as a method.</p>
<p>This is troublesome because closures cannot have access to the outer function&#8217;s <code>this</code> keyword without some hacking.</p>
<pre><code class="javascript">var obj = {};
obj.outer = function() {
	function inner() {
		console.log(this);
	}
	return inner;
};

var fn = obj.outer();
fn();
</code></pre>
<p>With the above code, the console unsurprisingly logs the <code>Window</code> object. This strange functionality is due to the ECMAScript 5 specification on function calls.</p>
<blockquote><div class="icon quote"></div>
<p>in the HTML document object model the window property of the global object is the global object itself.</p></blockquote>
<p>So, it isn&#8217;t necessarily incorrect that JavaScript binds the <code>this</code> keyword to <code>window</code>, for calling the function <code>fn()</code> is equivalent to its dependency injection variant, <code>fn.call(window)</code>. For closures, however, we should not automatically override the scope of variables with the global scope, especially <code>this</code> of the outer function.</p>
<p>The quick and classical fix to this problem simply involves aliasing the object to a local variable of the outer environment instead.</p>
<pre><code class="javascript">var obj = {};
obj.outer = function() {
	var that = this;
	function inner() {
		console.log(that);
	}
	return inner;
};

var fn = obj.outer();
fn();
</code></pre>
<p>The console now logs <code>obj</code> itself, as expected.</p>
<h2>Python&#8217;s Lexical Scope Problem</h2>
<p>Closures in Python 2.x could not change <em>nonlocal</em> variables. It was strange that variables in the outer scope could be accessed, but they could not be changed. This was an inherent problem in the lexical scoping rules of Python where names could only be bound to the local scope or global scope.</p>
<blockquote><div class="icon idea"></div>
<p>This problem was fixed in Python 3, though it is still worth discussing. See <a href="http://www.python.org/dev/peps/pep-3104/" title="PEP 3104">PEP 3104</a>.</p></blockquote>
<p>Consider the following example which increments a counter whenever the function is called.</p>
<pre><code class="python">def counter():
    count = 0
    def inner():
        count += 1
        return count
    return inner
</code></pre>
<p>The above example will yield an error when executed:</p>
<pre><code class="no-highlight">>>> countr = counter()
>>> countr()
UnboundLocalError: local variable 'count' referenced before assignment
</code></pre>
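<p>Again, the error occurs due to Python&#8217;s lexical scoping. Because <code>count += 1</code> assigns to <code>count</code>, Python treats <code>count</code> as a local variable of <code>inner</code>, and that local variable is read before it has ever been bound. We expect the name to refer to the variable initialized in the outer scope instead; however, this is not the case and the local name remains unbound.</p>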
<p>Fortunately, Python 3 added the <code>nonlocal</code> keyword which enables closures to change variables in their outer environment.</p>
<pre><code class="python">def counter():
    count = 0
    def inner():
        nonlocal count
        count += 1
        return count
    return inner
</code></pre>
<p>Now, <code>counter()</code> works as expected:</p>
<pre><code class="no-highlight">>>> countr = counter()
>>> countr()
1
>>> countr()
2</code></pre>
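<p>For code that still had to run on Python 2, a common workaround was to keep the count inside a mutable container, so that the inner function mutates an object instead of rebinding a name. The sketch below uses a one-element list for this; the name <code>counter_py2</code> is only illustrative.</p>
<pre><code class="python">def counter_py2():
    # Store the count in a list: the closure mutates the list in place
    # and never rebinds the name, so no nonlocal declaration is needed.
    count = [0]

    def inner():
        count[0] += 1
        return count[0]

    return inner


countr = counter_py2()
print(countr())  # 1
print(countr())  # 2
</code></pre>
<p>This runs unchanged on both Python 2 and Python 3, at the cost of being less direct than <code>nonlocal</code>.</p>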
<h2>Final Thoughts</h2>
<p>These problems only serve to highlight the inherent difficulties of language design. Although it&#8217;s a craft tempered throughout the years, programming languages are still far from perfect. It is especially difficult with multiparadigm languages such as JavaScript and Python because they must cater to each programming paradigm with precision.</p>
<p>I have no complaints about Ruby because I&#8217;m not as familiar with it as with these two languages, though if anyone else has an opinion on it, please feel free to share.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/problems-with-closures-in-javascript-and-python.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing a Lexer in Java 1.7 using Regex Named Capturing Groups</title>
		<link>http://www.giocc.com/writing-a-lexer-in-java-1-7-using-regex-named-capturing-groups.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=writing-a-lexer-in-java-1-7-using-regex-named-capturing-groups</link>
		<comments>http://www.giocc.com/writing-a-lexer-in-java-1-7-using-regex-named-capturing-groups.html#comments</comments>
		<pubDate>Sun, 29 Jul 2012 10:33:38 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Ingenuity]]></category>
		<category><![CDATA[java-1.7]]></category>
		<category><![CDATA[lexer]]></category>
		<category><![CDATA[lexical analysis]]></category>
		<category><![CDATA[named capturing]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1164</guid>
		<description><![CDATA[One of my favorite features in the new Java 1.7, aside from the try-with-resources statement, is named capturing groups in the regular expression API. Although captured groups can be referenced numerically in the order in which they are declared from left to right, named capturing makes this more intuitive as I will demonstrate in the [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite features in the new Java 1.7, aside from the <code>try-with-resources</code> statement, is named capturing groups in the regular expression API. Although captured groups can be referenced numerically in the order in which they are declared from left to right, named capturing makes this more intuitive as I will demonstrate in the construction of a lexer.</p>
<p><span id="more-1164"></span></p>
<h2>Lexer, a Definition</h2>
<p>To describe lexers, we must first describe a <strong>tokenizer</strong>. Tokenizers simply break up strings into a sequence of tokens which are, of course, more strings. A lexer, in turn, is a type of tokenizer that adds context to the tokens, such as the type of token extracted, e.g. <code>(NUMBER 1234)</code>, whereas a simple token would be <code>1234</code>. Lexers are important for parsing languages; however, that is a discussion beyond the scope of this tutorial.</p>
<p>For example, given the input <code>"11 + 22 - 33"</code>, we should receive the following tokens from a lexer:</p>
<pre><code class="no-highlight">(NUMBER 11)
(BINARYOP +)
(NUMBER 22)
(BINARYOP -)
(NUMBER 33)</code></pre>
<blockquote><div class="icon idea"></div>
<p>Note that <code>BINARYOP</code> refers to <em>binary operator</em>. Binary operators include any operator that accepts two arguments. The archetypal example is addition which accepts two numbers, one to the left and the other to the right of the operator.</p></blockquote>
<h2>Setting-Up the Program</h2>
<p>The input is a sentence to be scanned. For this tutorial, we will scan a simple arithmetic grammar that includes addition, subtraction, multiplication and division. Consequently, we will scan the following input:</p>
<pre><code class="java">public class Lexer {
	public static void main(String[] args) {
		String input = "11 + 22 - 33";
	}
}
</code></pre>
<p>Next, we must define the types of the tokens that we are extracting and the regular expression that they match.</p>
<dl>
<dt>Number</dt>
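<dd><code>-?[0-9]+</code> Matches any integer, optionally negative, without decimals.</dd>
<dt>Binary Operator</dt>
<dd><code>[*|/|+|-]</code> Matches a single standard arithmetic operator. Note that inside a character class the <code>|</code> is a literal pipe rather than an alternation, so this pattern also accepts <code>|</code>; <code>[*/+-]</code> would be the tighter form.</dd>
<dt>Whitespace</dt>
<dd><code>[ \t\f\r\n]+</code> Matches a run of spaces, tabs, form feeds, carriage returns or newlines. Will be skipped.</dd>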
</dl>
<pre><code class="java">public class Lexer {
	public static enum TokenType {
		// Token types cannot have underscores
		NUMBER("-?[0-9]+"), BINARYOP("[*|/|+|-]"), WHITESPACE("[ \t\f\r\n]+");

		public final String pattern;

		private TokenType(String pattern) {
			this.pattern = pattern;
		}
	}

	public static void main(String[] args) {
		String input = "11 + 22 - 33";
	}
}
</code></pre>
<blockquote><div class="icon idea"></div>
<p>Enumerations in Java can only have <code>private</code> constructors because there is only a finite set of objects created at run-time. Consequently, their data fields are frequently declared as <code>final</code>.</p></blockquote>
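<p>As an aside on the <code>BINARYOP</code> pattern: inside a character class, <code>|</code> is an ordinary character, not an alternation. The short check below illustrates this with Python&#8217;s <code>re</code> module, which treats this particular character class the same way as <code>java.util.regex</code>. The tighter pattern <code>[*/+-]</code> is a suggested alternative, not part of the original lexer.</p>
<pre><code class="python">import re

# The pattern from the tutorial and a tighter alternative (an assumption, not the original code).
loose = re.compile(r"[*|/|+|-]")
tight = re.compile(r"[*/+-]")

# Print, for each symbol, whether the loose and the tight pattern accept it.
for symbol in ["*", "/", "+", "-", "|"]:
    print(symbol, bool(loose.fullmatch(symbol)), bool(tight.fullmatch(symbol)))
</code></pre>
<p>Both patterns accept the four arithmetic operators, but only the original one also accepts a stray <code>|</code>.</p>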
<p>Finally, we declare a data structure for holding the token data. Additionally, I will override the <code>toString()</code> method for printing out the token&#8217;s contextual data at the end of this tutorial in the format I have mentioned earlier: <code>(&lt;TYPE&gt; &lt;DATA&gt;)</code>.</p>
<pre><code class="java">public class Lexer {
	public static enum TokenType {
		// Token types cannot have underscores
		NUMBER("-?[0-9]+"), BINARYOP("[*|/|+|-]"), WHITESPACE("[ \t\f\r\n]+");

		public final String pattern;

		private TokenType(String pattern) {
			this.pattern = pattern;
		}
	}

	public static class Token {
		public TokenType type;
		public String data;

		public Token(TokenType type, String data) {
			this.type = type;
			this.data = data;
		}

		@Override
		public String toString() {
			return String.format("(%s %s)", type.name(), data);
		}
	}

	public static void main(String[] args) {
		String input = "11 + 22 - 33";
	}
}
</code></pre>
<p>Now that we have our input, token types and data structure for tokens, we may begin lexical analysis of the input string into a list of tokens, each with its corresponding token type.</p>
<h2>Lexical Analysis with Regular Expressions</h2>
<p>We begin by framing our lexical analysis method as <code>lex()</code>, a function which returns a list of <code>Token</code> objects. Additionally, we will need to import <code>ArrayList</code> in order to store the <code>Token</code> objects into the list.</p>
<pre><code class="java">import java.util.ArrayList;

public class Lexer {
	public static enum TokenType {
		// Token types cannot have underscores
		NUMBER("-?[0-9]+"), BINARYOP("[*|/|+|-]"), WHITESPACE("[ \t\f\r\n]+");

		public final String pattern;

		private TokenType(String pattern) {
			this.pattern = pattern;
		}
	}

	public static class Token {
		public TokenType type;
		public String data;

		public Token(TokenType type, String data) {
			this.type = type;
			this.data = data;
		}

		@Override
		public String toString() {
			return String.format("(%s %s)", type.name(), data);
		}
	}

	public static ArrayList&lt;Token&gt; lex(String input) {
		// The tokens to return
		ArrayList&lt;Token&gt; tokens = new ArrayList&lt;Token&gt;();

		// Lexer logic begins here

		return tokens;
	}

	public static void main(String[] args) {
		String input = "11 + 22 - 33";

		// Create tokens and print them
		ArrayList&lt;Token&gt; tokens = lex(input);
		for (Token token : tokens)
			System.out.println(token);
	}
}
</code></pre>
<p>Now, we need to combine the regular expression patterns for each of the token types into a single pattern, as in the algorithm shown below. This is where we use <strong>named capturing groups</strong> in regular expressions, written as <code>(?&lt;TYPE&gt;PATTERN)</code>, so that once a pattern is matched, we can retrieve the token by its group name, the <code>TYPE</code>.</p>
<p>Additionally, we import the <code>Pattern</code> class to compile regular expression patterns.</p>
<pre><code class="java">import java.util.regex.Pattern;

public static ArrayList&lt;Token&gt; lex(String input) {
	// The tokens to return
	ArrayList&lt;Token&gt; tokens = new ArrayList&lt;Token&gt;();

	// Lexer logic begins here
	StringBuffer tokenPatternsBuffer = new StringBuffer();
	for (TokenType tokenType : TokenType.values())
		tokenPatternsBuffer.append(String.format("|(?&lt;%s&gt;%s)", tokenType.name(), tokenType.pattern));
	Pattern tokenPatterns = Pattern.compile(new String(tokenPatternsBuffer.substring(1)));

	return tokens;
}
</code></pre>
<p>Next, we begin tokenizing by creating a <code>Matcher</code> object from the compiled pattern, <code>tokenPatterns</code>, from earlier. The matcher will return any token matched with any of the corresponding token type patterns. Note that we must also import the <code>Matcher</code> class here.</p>
<p>We will iterate through the list of token types and ask if the token type was matched. If the token returns a match, we will add it to our list of tokens with the corresponding token type and continue parsing the input.</p>
<pre><code class="java">import java.util.regex.Pattern;
import java.util.regex.Matcher;

public static ArrayList&lt;Token&gt; lex(String input) {
	// The tokens to return
	ArrayList&lt;Token&gt; tokens = new ArrayList&lt;Token&gt;();

	// Lexer logic begins here
	StringBuffer tokenPatternsBuffer = new StringBuffer();
	for (TokenType tokenType : TokenType.values())
		tokenPatternsBuffer.append(String.format("|(?&lt;%s&gt;%s)", tokenType.name(), tokenType.pattern));
	Pattern tokenPatterns = Pattern.compile(new String(tokenPatternsBuffer.substring(1)));

	// Begin matching tokens
	Matcher matcher = tokenPatterns.matcher(input);
	while (matcher.find()) {
		if (matcher.group(TokenType.NUMBER.name()) != null) {
			tokens.add(new Token(TokenType.NUMBER, matcher.group(TokenType.NUMBER.name())));
			continue;
		} else if (matcher.group(TokenType.BINARYOP.name()) != null) {
			tokens.add(new Token(TokenType.BINARYOP, matcher.group(TokenType.BINARYOP.name())));
			continue;
		} else if (matcher.group(TokenType.WHITESPACE.name()) != null)
			continue;
	}

	return tokens;
}
</code></pre>
<p>And the algorithm is complete! The magic of named capturing groups happens as we test each of the token types. Note that instead of retrieving groups by their numerical reference, such as <code>matcher.group(1)</code>, we use the actual name, which is far more intuitive and much easier to maintain.</p>
<p>Here is the complete source code:</p>
<pre><code class="java">import java.util.ArrayList;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Lexer {
	public static enum TokenType {
		// Token types cannot have underscores
		NUMBER("-?[0-9]+"), BINARYOP("[*|/|+|-]"), WHITESPACE("[ \t\f\r\n]+");

		public final String pattern;

		private TokenType(String pattern) {
			this.pattern = pattern;
		}
	}

	public static class Token {
		public TokenType type;
		public String data;

		public Token(TokenType type, String data) {
			this.type = type;
			this.data = data;
		}

		@Override
		public String toString() {
			return String.format("(%s %s)", type.name(), data);
		}
	}

	public static ArrayList&lt;Token&gt; lex(String input) {
		// The tokens to return
		ArrayList&lt;Token&gt; tokens = new ArrayList&lt;Token&gt;();

		// Lexer logic begins here
		StringBuffer tokenPatternsBuffer = new StringBuffer();
		for (TokenType tokenType : TokenType.values())
			tokenPatternsBuffer.append(String.format("|(?&lt;%s&gt;%s)", tokenType.name(), tokenType.pattern));
		Pattern tokenPatterns = Pattern.compile(new String(tokenPatternsBuffer.substring(1)));

		// Begin matching tokens
		Matcher matcher = tokenPatterns.matcher(input);
		while (matcher.find()) {
			if (matcher.group(TokenType.NUMBER.name()) != null) {
				tokens.add(new Token(TokenType.NUMBER, matcher.group(TokenType.NUMBER.name())));
				continue;
			} else if (matcher.group(TokenType.BINARYOP.name()) != null) {
				tokens.add(new Token(TokenType.BINARYOP, matcher.group(TokenType.BINARYOP.name())));
				continue;
			} else if (matcher.group(TokenType.WHITESPACE.name()) != null)
				continue;
		}

		return tokens;
	}

	public static void main(String[] args) {
		String input = "11 + 22 - 33";

		// Create tokens and print them
		ArrayList&lt;Token&gt; tokens = lex(input);
		for (Token token : tokens)
			System.out.println(token);
	}
}
</code></pre>
<h3>Running the Algorithm</h3>
<p>For completeness, when we run the program, we should receive the following output.</p>
<pre><code class="no-highlight">(NUMBER 11)
(BINARYOP +)
(NUMBER 22)
(BINARYOP -)
(NUMBER 33)</code></pre>
<h2>Conclusion</h2>
<p>Although lexical analysis is doable without them, named capturing groups in regular expressions certainly make the code more intuitive and easier to maintain. It&#8217;s also nice that Java is beginning to provide features that act as syntactic sugar.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/writing-a-lexer-in-java-1-7-using-regex-named-capturing-groups.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Underscorejs: Text Processing on the Document Object Model (DOM)</title>
		<link>http://www.giocc.com/underscorejs-text-processing-on-the-document-object-model-dom.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=underscorejs-text-processing-on-the-document-object-model-dom</link>
		<comments>http://www.giocc.com/underscorejs-text-processing-on-the-document-object-model-dom.html#comments</comments>
		<pubDate>Fri, 25 May 2012 21:47:23 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Inspiration]]></category>
		<category><![CDATA[dom]]></category>
		<category><![CDATA[functional programming]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[lipsum]]></category>
		<category><![CDATA[underscorejs]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1135</guid>
		<description><![CDATA[Last time, we dove into higher-order functions that Underscorejs provides. This time, we&#8217;ll be utilizing those higher-order functions to process text on the DOM of a page. The Problem Given three paragraphs of text in HTML, find and print all big words within the text. Big words shall be defined by anything with more than [...]]]></description>
			<content:encoded><![CDATA[<p>Last time, we dove into <a href="http://www.giocc.com/prelude-into-underscorejs-higher-order-functions.html">higher-order functions</a> that Underscorejs provides. This time, we&#8217;ll be utilizing those higher-order functions to process text on the DOM of a page.</p>
<p><span id="more-1135"></span></p>
<h2>The Problem</h2>
<p>Given three paragraphs of text in HTML, find and print all big words within the text. Big words shall be defined as any word with more than eleven characters. The initial set of paragraphs will be defined within an <code>&lt;article&gt;</code> block identified by <code>lipsum</code>. The solution should be printed to the <code>&lt;ul&gt;</code> identified by <code>long_words</code>.</p>
<h2>Algorithm Outline</h2>
<p>We want to outline the algorithm before we begin programming so that we&#8217;re aware of our situation, our objective and how to achieve that objective using our algorithm.</p>
<ol>
<li><strong>Extract</strong> the text from the paragraphs under the <code>&lt;article&gt;</code>.</li>
<li><strong>Union</strong> the paragraphs into a single block of text.</li>
<li><strong>Filter</strong> for all words longer than eleven characters.</li>
<li><strong>Print</strong> the solution of long words into the DOM under the <code>&lt;ul&gt;</code>.</li>
</ol>
<h2>Context of the Problem</h2>
<p>I have written the problem into an HTML5 document to demonstrate the structure as per the problem definition. Note here that I am using HTML5 instead of HTML4 or XHTML2.0. Additionally, we use lipsum as our boilerplate.</p>
<pre><code class="html">&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;&lt;title&gt;Prelude into Underscorejs: Text Processing on the DOM&lt;/title&gt;&lt;/head&gt;
&lt;body&gt;
&lt;article id="lipsum"&gt;
	&lt;h2&gt;Three Paragraphs of Lipsum&lt;/h2&gt;

	&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit.
	Pellentesque vitae lectus at augue adipiscing facilisis in et
	dolor. Maecenas semper scelerisque blandit. Sed ut nibh eget
	purus aliquam suscipit. Nulla in facilisis leo. Etiam augue
	ligula, blandit et volutpat eget, ultrices hendrerit libero.
	Curabitur facilisis tincidunt neque, ornare viverra dui imperdiet
	ut. Suspendisse rhoncus, diam ut congue pharetra, metus tellus
	vehicula tellus, id imperdiet leo sem quis urna.&lt;/p&gt;

	&lt;p&gt;Proin massa odio, malesuada quis aliquet suscipit, porta in
	arcu. Nunc ac egestas metus. Sed ac ligula vel neque molestie
	consectetur. Sed ac lacus nulla, sollicitudin interdum arcu.
	Quisque neque elit, hendrerit at mollis id, porta nec sem. In
	dapibus convallis ligula sed laoreet. Nulla non leo turpis. Sed
	dictum magna sit amet neque gravida malesuada. Donec pulvinar
	aliquam nisi, at malesuada libero posuere in. Donec ullamcorper
	accumsan eros nec interdum. Nam pharetra purus eget quam auctor
	nec placerat elit varius. Suspendisse imperdiet vehicula elit, at
	consequat dolor feugiat ut. Vestibulum ac suscipit augue.
	Curabitur scelerisque sollicitudin nisl nec sodales. Sed sed
	placerat ligula.&lt;/p&gt;

	&lt;p&gt;Phasellus cursus sagittis augue, sit amet rutrum felis
	adipiscing vel. Pellentesque suscipit posuere sollicitudin. Proin
	pretium enim vel diam lobortis vel ullamcorper purus auctor.
	Vestibulum quis orci sem, nec vulputate arcu. Sed pretium
	facilisis ullamcorper. Curabitur placerat libero et quam rhoncus
	varius. Mauris bibendum felis non mi tincidunt id congue urna
	bibendum.&lt;/p&gt;
&lt;/article&gt;

&lt;article&gt;
	&lt;h2&gt;Long Words&lt;/h2&gt;
	&lt;p&gt;In this case, we define &lt;em&gt;long words&lt;/em&gt; as words with
	more than eleven characters. After pulling out these long
	words, we may provide definitions for they may be new to a
	reader's vocabulary.&lt;/p&gt;
	&lt;ul id="long_words"&gt;&lt;/ul&gt;
&lt;/article&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
<h2>Walking through the Underscorejs Algorithm</h2>
<p>According to our algorithm, we must first <strong>extract</strong> the text from the paragraphs within the <code>&lt;article&gt;</code> defined by <code>lipsum</code>.</p>
<p>We can start by gaining control of the <code>&lt;article&gt;</code> tag through <code>document.getElementById</code>. Afterwards, we need to acquire a list of all paragraph nodes under our article. Underscorejs easily allows us to do this by filtering the <code>childNodes</code> of the article against their <code>nodeName</code>; specifically, we filter against the <code>nodeName</code> of the <code>&lt;p&gt;</code> tag. Beware: in an HTML document, <code>nodeName</code> reports element names in uppercase.</p>
<pre><code class="javascript">function processLipsum() {
	// Get the paragraph nodes of the article
	var lipsumArticle = document.getElementById('lipsum');
	var lipsumParagraphs = _.filter(lipsumArticle.childNodes,
		function(node) {
			return node.nodeName === 'P';
		});
};

document.addEventListener("DOMContentLoaded", processLipsum, false);</code></pre>
<blockquote><div class="icon idea"></div>
<p>Because our algorithm is dependent upon the DOM contents, we must execute our code once the entire DOM content has loaded; consequently, we bind our function to the <code>DOMContentLoaded</code> event.</p></blockquote>
<p>At this point, we have a list of all <code>&lt;p&gt;</code> tags that are children of the containing <code>&lt;article&gt;</code>.</p>
<p>Now we can simply wrap the list of paragraphs in <code>_.chain</code> and process it through a sequence of higher-order functions to acquire our new list of big words.</p>
<h3>Union the List of Paragraphs</h3>
<p>First, we need to coalesce the paragraphs&#8217; text into a single list. We can do this by first replacing every run of non-alphanumeric symbols and extraneous spaces with a single space through <code>replace</code>. Afterwards, we simply <code>split</code> the paragraph on spaces so that only words remain in our new list.</p>
<p>This list processing can be done with <code>_.map</code> by mapping the list of paragraphs to a list of the words within each paragraph.</p>
<p>Second, once the list of words has been generated, it is a <em>deep list</em>, i.e. a list of lists of words, one inner list per paragraph. This is easily remedied by <code>_.flatten</code>, which turns our list into a <em>shallow list</em>, i.e. the inner lists coalesce into the containing list.</p>
<p>Now, our paragraphs should have been successfully <strong>unioned</strong> into a list of words as shown below.</p>
<pre><code class="javascript">function processLipsum() {
	// Get the paragraph nodes of the article
	var lipsumArticle = document.getElementById('lipsum');
	var lipsumParagraphs = _.filter(lipsumArticle.childNodes,
		function(node) {
			return node.nodeName === 'P';
		});

	var lipsumWords = _.chain(lipsumParagraphs)
		// Reduce all extraneous whitespace and symbols to a single space.
		// Then split the words by space.
		.map(function(node) {
			return node.innerHTML.replace(/[\W\s]+/g, ' ').split(' ');
			})
		// Union all subarrays into a large array of words.
		.flatten()
		// Return the flattened list of words
		.value();
};

document.addEventListener("DOMContentLoaded", processLipsum, false);</code></pre>
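<p>To see the union step without a browser, here is a minimal sketch that runs the same <code>replace</code> and <code>split</code> on plain strings. The <code>paragraphs</code> array and the <code>toWords</code> and <code>flattenOnce</code> helpers are invented for illustration (they stand in for <code>innerHTML</code>, the mapping function and <code>_.flatten</code>), so treat it as an approximation of the chain above rather than the code itself.</p>
<pre><code class="javascript">// Hypothetical stand-in for the innerHTML of two paragraphs.
var paragraphs = ['Lorem ipsum, dolor.', 'Sit  amet!'];

// Same tokenization as above: collapse symbols and whitespace, then split.
function toWords(text) {
	return text.replace(/[\W\s]+/g, ' ').split(' ');
}

// Illustrative one-level flatten, standing in for _.flatten.
function flattenOnce(lists) {
	return lists.reduce(function(memo, list) {
		return memo.concat(list);
	}, []);
}

var deepList = paragraphs.map(toWords);
var shallowList = flattenOnce(deepList);

console.log(deepList);    // [['Lorem', 'ipsum', 'dolor', ''], ['Sit', 'amet', '']]
console.log(shallowList); // ['Lorem', 'ipsum', 'dolor', '', 'Sit', 'amet', '']</code></pre>
<p>Note the empty strings left behind by trailing punctuation. They are harmless for our purposes because any filter on word length discards them.</p>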
<h3>Filter the List of Words</h3>
<p>This step is trivial but necessary to our objective: <strong>filter</strong> the preceding list to return all big words, which have more than eleven characters. We simply chain the <code>_.filter</code> function further and return the final list of words by ending our <code>_.chain</code> with <code>.value()</code>.</p>
<pre><code class="javascript">function processLipsum() {
	// Get the paragraph nodes of the article
	var lipsumArticle = document.getElementById('lipsum');
	var lipsumParagraphs = _.filter(lipsumArticle.childNodes,
		function(node) {
			return node.nodeName === 'P';
		});

	var lipsumWords = _.chain(lipsumParagraphs)
		// Reduce all extraneous whitespace and symbols to a single space.
		// Then split the words by space.
		.map(function(node) {
			return node.innerHTML.replace(/[\W\s]+/g, ' ').split(' ');
			})
		// Union all subarrays into a large array of words.
		.flatten()
		// Find all words with more than eleven characters.
		.filter(function(word) {
			return word.length > 11;
		})
		// Return the filtered list of words.
		.value();
};

document.addEventListener("DOMContentLoaded", processLipsum, false);</code></pre>
<p>We have successfully acquired the list of big words. Now, the final task is to print the list into the DOM.</p>
<h3>Printing to the DOM</h3>
<p>Instead of polluting our function space, we will delegate the printing to an auxiliary function. In this function, we must create a document fragment to begin storing new <code>&lt;li&gt;</code> elements.</p>
<p>For each of the words we pass to the function, it will create a new list element, insert the long word into it as a text node and append it to the document fragment. <code>_.each</code> does this for us trivially.</p>
<p>Finally, we append the list of list elements to the unordered list identified by <code>long_words</code>.</p>
<pre><code class="javascript">function printLongWords(words) {
	var fragment = document.createDocumentFragment();
	var listElm = null;
	_.each(words, function(word) {
		listElm = document.createElement('LI');
		listElm.appendChild(document.createTextNode(word));
		fragment.appendChild(listElm);
	});
	document.getElementById('long_words').appendChild(fragment.cloneNode(true));
};</code></pre>
<p>Voila! We have successfully printed to the DOM.</p>
<h3>Putting It All Together Now</h3>
<pre><code class="html">&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;&lt;title&gt;Prelude into Underscorejs: Text Processing on the DOM&lt;/title&gt;&lt;/head&gt;
&lt;script type="text/javascript" src="underscore.js"&gt;&lt;/script&gt;
&lt;script type="text/javascript"&gt;
function processLipsum() {
	// Get the paragraph nodes of the article
	var lipsumArticle = document.getElementById('lipsum');
	var lipsumParagraphs = _.filter(lipsumArticle.childNodes,
		function(node) {
			return node.nodeName === 'P';
		});

	var lipsumWords = _.chain(lipsumParagraphs)
		// Reduce all extraneous whitespace and symbols to a single space.
		// Then split the words by space.
		.map(function(node) {
			return node.innerHTML.replace(/[\W\s]+/g, ' ').split(' ');
			})
		// Union all subarrays into a large array of words.
		.flatten()
		// Find all words with more than eleven characters.
		.filter(function(word) {
			return word.length > 11;
		})
		// Return the list of words.
		.value();
	printLongWords(lipsumWords);
};

function printLongWords(words) {
	var fragment = document.createDocumentFragment();
	var listElm = null;
	_.each(words, function(word) {
		listElm = document.createElement('LI');
		listElm.appendChild(document.createTextNode(word));
		fragment.appendChild(listElm);
	});
	document.getElementById('long_words').appendChild(fragment.cloneNode(true));
};

document.addEventListener("DOMContentLoaded", processLipsum, false);
&lt;/script&gt;
&lt;body&gt;
&lt;article id="lipsum"&gt;
	&lt;h2&gt;Three Paragraphs of Lipsum&lt;/h2&gt;

	&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit.
	Pellentesque vitae lectus at augue adipiscing facilisis in et
	dolor. Maecenas semper scelerisque blandit. Sed ut nibh eget
	purus aliquam suscipit. Nulla in facilisis leo. Etiam augue
	ligula, blandit et volutpat eget, ultrices hendrerit libero.
	Curabitur facilisis tincidunt neque, ornare viverra dui imperdiet
	ut. Suspendisse rhoncus, diam ut congue pharetra, metus tellus
	vehicula tellus, id imperdiet leo sem quis urna.&lt;/p&gt;

	&lt;p&gt;Proin massa odio, malesuada quis aliquet suscipit, porta in
	arcu. Nunc ac egestas metus. Sed ac ligula vel neque molestie
	consectetur. Sed ac lacus nulla, sollicitudin interdum arcu.
	Quisque neque elit, hendrerit at mollis id, porta nec sem. In
	dapibus convallis ligula sed laoreet. Nulla non leo turpis. Sed
	dictum magna sit amet neque gravida malesuada. Donec pulvinar
	aliquam nisi, at malesuada libero posuere in. Donec ullamcorper
	accumsan eros nec interdum. Nam pharetra purus eget quam auctor
	nec placerat elit varius. Suspendisse imperdiet vehicula elit, at
	consequat dolor feugiat ut. Vestibulum ac suscipit augue.
	Curabitur scelerisque sollicitudin nisl nec sodales. Sed sed
	placerat ligula.&lt;/p&gt;

	&lt;p&gt;Phasellus cursus sagittis augue, sit amet rutrum felis
	adipiscing vel. Pellentesque suscipit posuere sollicitudin. Proin
	pretium enim vel diam lobortis vel ullamcorper purus auctor.
	Vestibulum quis orci sem, nec vulputate arcu. Sed pretium
	facilisis ullamcorper. Curabitur placerat libero et quam rhoncus
	varius. Mauris bibendum felis non mi tincidunt id congue urna
	bibendum.&lt;/p&gt;
&lt;/article&gt;

&lt;article&gt;
	&lt;h2&gt;Long Words&lt;/h2&gt;
	&lt;p&gt;In this case, we define &lt;em&gt;long words&lt;/em&gt; as words with
	more than eleven characters. After pulling out these long
	words, we may provide definitions for they may be new to a
	reader's vocabulary.&lt;/p&gt;
	&lt;ul id="long_words"&gt;&lt;/ul&gt;
&lt;/article&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
<h2>Conclusion</h2>
<p>Higher-order functions significantly simplify the processing of stream-based or list-based data such as text. </p>
<p>Using JavaScript&#8217;s interface into the DOM, we were able to show how functional programming (though impure) may have a place on the web to simplify common computational tasks and even DOM tasks.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/underscorejs-text-processing-on-the-document-object-model-dom.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prelude into Underscorejs: Higher-Order Functions</title>
		<link>http://www.giocc.com/prelude-into-underscorejs-higher-order-functions.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=prelude-into-underscorejs-higher-order-functions</link>
		<comments>http://www.giocc.com/prelude-into-underscorejs-higher-order-functions.html#comments</comments>
		<pubDate>Sun, 20 May 2012 23:11:28 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Inspiration]]></category>
		<category><![CDATA[documentcloud]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[higher-order functions]]></category>
		<category><![CDATA[immutability]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[underscorejs]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=1096</guid>
		<description><![CDATA[DocumentCloud&#8217;s project, Underscorejs, has interested me for a while now, ever since I saw its beautiful documentation produced through Docco. Additionally, while playing with higher-order functions in Haskell, I wanted to see how they may simplify a standard set of computational tasks that I would normally write in C. Fortunately, JavaScript allows me to do this in [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.documentcloud.org/opensource">DocumentCloud&#8217;s</a> project, <a href="http://underscorejs.org/">Underscorejs</a>, has interested me for a while now, ever since I saw its beautiful documentation produced through Docco.</p>
<p>Additionally, while playing with <strong>higher-order functions</strong> in Haskell, I wanted to see how they may simplify a standard set of computational tasks that I would normally write in C. Fortunately, JavaScript allows me to do this in a multi-paradigm fashion.</p>
<p><span id="more-1096"></span></p>
<h3>Wait! What Are These Magical, Higher-Order Functions?</h3>
<p>Well, I&#8217;ll turn to the pure, functional language, Haskell, to describe it correctly:</p>
<blockquote cite="http://www.haskell.org/haskellwiki/Higher_order_function"><div class="icon quote"></div>
<p>A higher-order function is a function that takes other functions as arguments or returns a function as result.</p></blockquote>
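<p>As a quick illustration of both halves of that definition, consider the sketch below. The names <code>applyTwice</code> and <code>makeAdder</code> are hypothetical, chosen only for this example.</p>
<pre><code class="javascript">// Takes a function as an argument.
function applyTwice(fn, value) {
	return fn(fn(value));
}

// Returns a function as its result.
function makeAdder(amount) {
	return function(value) {
		return value + amount;
	};
}

var addFive = makeAdder(5);
console.log(applyTwice(addFive, 1)); // 11</code></pre>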
<h2>Standard Computational Task</h2>
<p>The situation: a professor has just finished grading the final exams out of a hundred for his eight-student classroom. He would like to know the top scores (above 80), the highest score (maximum) and the average of the scores.</p>
<p><strong>Class Scores:</strong> <code>[98, 76, 45, 84, 32, 55, 88, 76]</code></p>
<p>From the problem description, we have the following set of tasks with constraints:</p>
<ol>
<li>Retrieve all scores above eighty.</li>
<li>Retrieve the maximum score from the list.</li>
<li>Retrieve the running sum of the scores and divide it by the number of scores (mean).</li>
</ol>
<p>Now that we have described the problem, let's attempt a standard solution with C.</p>
<h2>C-Style Solution</h2>
<p>The following solution does not have an immutable array data structure; however, I ensured that there are no side-effects which may mutate the original array during the calculations. Instead, every result is written to a new variable, as indicated by the <code>Calculation results</code> section.</p>
<p>Furthermore, I fissioned the calculation loops to mirror how Underscorejs implements its higher-order functions, with one pass over the data per operation. Note that a single fused loop would do less work here.</p>
<blockquote><div class="icon idea"></div>
<p>Loop fission (or loop distribution) is a compiler optimization that splits one loop into several in order to improve locality of reference.</p></blockquote>
<pre><code class="cpp">#include &lt;stdio.h&gt;
#define NUM_SCORES 8

int main(int argc, char* argv[]) {
	// Calculation results
	int top_scores[NUM_SCORES];
	int highest_score = 0;
	int mean = 0;

	// Input data
	int scores[NUM_SCORES];
	scores[0] = 98;
	scores[1] = 76;
	scores[2] = 45;
	scores[3] = 84;
	scores[4] = 32;
	scores[5] = 55;
	scores[6] = 88;
	scores[7] = 76;

	// Iterators
	int i;
	int top_scores_counter = 0;
	int running_sum = 0; // Memo

	// Calculate the top scores above 80
	for (i = 0; i < NUM_SCORES; ++i)
		if (scores[i] > 80)
			top_scores[top_scores_counter++] = scores[i];

	// Calculate the highest score
	for (i = 0; i < NUM_SCORES; ++i)
		if (scores[i] > highest_score)
			highest_score = scores[i];

	// Calculate the running sum
	for (i = 0; i < NUM_SCORES; ++i)
		running_sum += scores[i];
	// Calculate the mean after the finding the running sum
	mean = running_sum / NUM_SCORES;

	// Print the data
	printf("Top Scores: ");
	for (i = 0; i < top_scores_counter; ++i)
		printf("%d ", top_scores[i]);
	printf("\n");

	printf("Highest Score: %d \n", highest_score);
	printf("Average Score: %d \n", mean);

	return 0;
}
</code></pre>
<p>When executed, the above code should print the following results to the standard output:</p>
<pre><code class="no-highlight">Top Scores: 98 84 88 
Highest Score: 98 
Average Score: 69</code></pre>
<p>Disregarding the scaffolding here, the C-style solution is verbose simply because most of the work is spelled out in for-loops. Sure, we can coalesce the loops through the loop fusion optimization, but why aren't these operations provided by the standard library by default?</p>
<p>This problem can naturally be solved through higher-order functions which apply functions to each iteration through a loop.</p>
<blockquote><div class="icon idea"></div>
<p>C++ STL provides a notion of higher-order functions through their concept of function objects.</p></blockquote>
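<p>For comparison, the fused variant mentioned above might look like the following sketch. It is written in JavaScript rather than C, the variable names are mine, and the <code>Math.floor</code> call is an assumption added to mirror C's integer division.</p>
<pre><code class="javascript">var scores = [98, 76, 45, 84, 32, 55, 88, 76];

var topScores = [];
var highestScore = 0;
var runningSum = 0;

// One pass over the data performs all three calculations.
for (var i = 0; i < scores.length; ++i) {
	if (scores[i] > 80) topScores.push(scores[i]);
	if (scores[i] > highestScore) highestScore = scores[i];
	runningSum += scores[i];
}

// Truncate to mirror the integer division of the C version.
var mean = Math.floor(runningSum / scores.length);

console.log(topScores);    // [98, 84, 88]
console.log(highestScore); // 98
console.log(mean);         // 69</code></pre>
<p>The loop is shorter, but the three concerns are now tangled together in one body, which is the trade-off against the fissioned style.</p>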
<h2>Underscorejs Solution</h2>
<p>The C-style solution is applicable to almost all programming languages that offer the standard control structures such as <code>for-loop</code> and <code>if-else</code>, though I avoid it where I can because it's verbose. Without higher-order functions, I would implement it in JavaScript just as I have above.</p>
<p>Fortunately, Underscorejs provides a standard set of higher-order functions for reducing the verbosity of our code. We could simply use <code>_.filter</code> to retrieve the top scores, <code>_.max</code> to find the highest score and <code>_.reduce</code> to find the running sum for the mean.</p>
<pre><code class="javascript">// Initial data
var scores = [98, 76, 45, 84, 32, 55, 88, 76];

// Calculates the top scores
var top_scores = _.filter(scores, function(score) {
	return score > 80;
});

// Calculate the highest score
var highest_score = _.max(scores);

// Calculate the running sum
var running_sum = _.reduce(scores, function(memo, score) {
	return memo + score;
}, 0);
var mean = running_sum / scores.length;

// Print the data
console.log("Top Scores: " + top_scores);
console.log("Highest Score: " + highest_score);
console.log("Average Score: " + mean);</code></pre>
<p>It's easy to see how our code verbosity has been significantly reduced through the use of the higher-order functions provided by Underscorejs. There are native implementations of the functions used above, namely, <code>Array.prototype.filter</code> and <code>Array.prototype.reduce</code>; nonetheless, I use Underscorejs for cross-browser compatibility.</p>
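<p>Where the native methods are available, the same calculations can be sketched without the library. <code>Math.max.apply</code> stands in for <code>_.max</code> here; that substitution and the variable names are mine.</p>
<pre><code class="javascript">var scores = [98, 76, 45, 84, 32, 55, 88, 76];

// Native counterpart of _.filter.
var topScores = scores.filter(function(score) {
	return score > 80;
});

// Math.max takes individual arguments, so apply spreads the array.
var highestScore = Math.max.apply(null, scores);

// Native counterpart of _.reduce.
var runningSum = scores.reduce(function(memo, score) {
	return memo + score;
}, 0);
var mean = runningSum / scores.length;

console.log("Top Scores: " + topScores);       // Top Scores: 98,84,88
console.log("Highest Score: " + highestScore); // Highest Score: 98
console.log("Average Score: " + mean);         // Average Score: 69.25</code></pre>
<p>Unlike the C version, the mean is not truncated: JavaScript division yields 69.25 rather than 69.</p>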
<h3>How Underscorejs implements Higher-Order Functions</h3>
<p>It may be obvious that these higher-order functions parallel the C-style code above by applying a function on each iteration of a loop; still, I will make their implementation explicit. Moreover, I will add my personal annotations to the source code to help guide you through it.</p>
<p>Consider the implementation of <code>_.filter</code> which returns a new array of filtered values given a list (<code>obj</code>) and a boolean function (<code>iterator</code>). </p>
<blockquote><div class="icon idea"></div>
<p>Recall that a higher-order function either takes a function as an argument or returns a function; consequently, <code>_.filter</code> is a higher-order function because it takes a function as an argument.</p></blockquote>
<pre><code class="javascript">_.filter = _.select = function(obj, iterator, context) {
	// The new results array to attempt immutability
	var results = [];
	
	// Edge case if the list is empty; simply return nothing
	if (obj == null) return results;

	// Use the native implementation if it's available.
	if (nativeFilter &#038;&#038; obj.filter === nativeFilter) return obj.filter(iterator, context);

	// Real work is delegated?
	each(obj, function(value, index, list) {
		// Returns the value if the boolean function evaluates to true
		if (iterator.call(context, value, index, list)) results[results.length] = value;
	});
	return results;
};</code></pre>
<p>First, let's consider all of the function cases:</p>
<h4>Empty List</h4>
<p>Returns a new, empty list. Trivial.</p>
<h4>Native Implementation Available</h4>
<p>It's worth noting here that <code>_.filter</code> utilizes the native JavaScript implementation as denoted by the <code>nativeFilter</code> case. If the native implementation is unavailable, we move on to support backwards-compatibility and cross-browser compatibility through this library.</p>
<h4>Default Case</h4>
<p>Delegates the action to another function, <code>_.each</code>. Well, we haven't seen a loop yet but we're about to. Let's consider the delegated action then.</p>
<pre><code class="javascript">var each = _.each = _.forEach = function(obj, iterator, context) {
	// Base case if the list is empty
	if (obj == null) return;

	// Use the native implementation if it's available
	if (nativeForEach &#038;&#038; obj.forEach === nativeForEach) {
		obj.forEach(iterator, context);

	// JS hack to test whether obj can be accessed by a numerical index
	} else if (obj.length === +obj.length) {
		for (var i = 0, l = obj.length; i < l; i++) {
			if (i in obj &#038;&#038; iterator.call(context, obj[i], i, obj) === breaker) return;
		}

	// Default case for objects with generic keys
	} else {
		for (var key in obj) {
			if (_.has(obj, key)) {
				if (iterator.call(context, obj[key], key, obj) === breaker) return;
			}
		}
	}
};</code></pre>
<p>There are quite a few more cases here, so I'll skip the base case and the native implementation, which I've already explained.</p>
<p>Since the last two cases are similar, I'll aggregate them into the same case:</p>
<h4>Index Accessible &#038; Default Case</h4>
<p>This is where the real action is. Simply, this case applies an arbitrary function to the list at every index.</p>
<p>This arbitrary function application allows us to significantly reduce the amount of code we must write. The only caveat here is that the function passed to <code>_.each</code> can mutate the original data structure, which defies the immutability concept of functional programming.</p>
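<p>The caveat is easy to reproduce. The sketch below uses the native <code>forEach</code>, which hands its callback the same <code>(value, index, list)</code> arguments as the implementation shown above; the variable names are illustrative.</p>
<pre><code class="javascript">var scores = [98, 76, 45];

// The third argument is the original list, so the callback can write to it.
scores.forEach(function(score, index, list) {
	list[index] = score + 1;
});

console.log(scores); // [99, 77, 46]

// map, by contrast, leaves its input alone and returns a new array.
var curved = scores.map(function(score) {
	return score + 1;
});

console.log(scores); // still [99, 77, 46]
console.log(curved); // [100, 78, 47]</code></pre>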
<h3>Simplifying the Data Flow through the Framework</h3>
<p>To best see the flow of operations through the framework, I've simplified the framework to remove unnecessary code for the purpose of this article.</p>
<pre><code class="javascript">var each = _.each = _.forEach = function(obj, iterator, context) {
	for (var i = 0, l = obj.length; i < l; i++) {
		if (i in obj &#038;&#038; iterator.call(context, obj[i], i, obj) === breaker) return;
	}
};

_.filter = _.select = function(obj, iterator, context) {
	// The new results array to attempt immutability
	var results = [];
	
	// Real work is delegated?
	each(obj, function(value, index, list) {
		// Returns the value if the boolean function evaluates to true
		if (iterator.call(context, value, index, list)) results[results.length] = value;
	});
	return results;
};</code></pre>
<p>Writing your own set of higher-order functions should be just as simple.</p>
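<p>As a rough sketch of that claim, here are <code>map</code> and <code>reduce</code> built on a simplified <code>each</code>. The names <code>myEach</code>, <code>myMap</code> and <code>myReduce</code> are hypothetical and this is not the Underscorejs source: it ignores the <code>context</code> argument, the <code>breaker</code> check and non-array objects.</p>
<pre><code class="javascript">// Apply iterator to every element of an array.
function myEach(list, iterator) {
	for (var i = 0; i < list.length; i++) {
		iterator(list[i], i, list);
	}
}

// Build a new array from the iterator's return values.
function myMap(list, iterator) {
	var results = [];
	myEach(list, function(value, index, list) {
		results.push(iterator(value, index, list));
	});
	return results;
}

// Fold the list into a single value, starting from memo.
function myReduce(list, iterator, memo) {
	myEach(list, function(value, index, list) {
		memo = iterator(memo, value, index, list);
	});
	return memo;
}

var scores = [98, 76, 45, 84, 32, 55, 88, 76];
var doubled = myMap(scores, function(score) { return score * 2; });
var total = myReduce(scores, function(memo, score) { return memo + score; }, 0);

console.log(doubled[0]); // 196
console.log(total);      // 554</code></pre>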
<h2>Conclusion</h2>
<p>As a prelude, there will be more to come on Underscorejs.</p>
<p>I've shown the prowess of higher-order functions, which carries over to any language that implements them, and it's always worth considering gathering these common operations under a shared library to reduce verbosity in future code. Now that higher-order functions have been explained, I will demonstrate practical applications of the library.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/prelude-into-underscorejs-higher-order-functions.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MinDispatch: Event-Driven Framework in Java Part 1</title>
		<link>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-1.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mindispatch-event-driven-framework-in-java-part-1</link>
		<comments>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-1.html#comments</comments>
		<pubDate>Fri, 18 May 2012 10:11:26 +0000</pubDate>
		<dc:creator>Gio Carlo Cielo</dc:creator>
				<category><![CDATA[Innovation]]></category>
		<category><![CDATA[chat application]]></category>
		<category><![CDATA[event dispatcher]]></category>
		<category><![CDATA[event-driven]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[MinDispatch]]></category>
		<category><![CDATA[simulation]]></category>

		<guid isPermaLink="false">http://www.giocc.com/?p=979</guid>
		<description><![CDATA[We&#8217;ve gotten our feet wet with event-driven programming by developing a framework which controls the flow of data through our system. Accordingly, I&#8217;ve made my framework available on GitHub for use by anyone: MinDispatch framework on GitHub. The Chat Application Simulation Revisited Our objective once again is to simulate the events of a chat application [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve gotten our feet wet with event-driven programming by <a href="http://www.giocc.com/writing-an-event-driven-framework-with-java.html">developing a framework</a> which controls the flow of data through our system. Accordingly, I&#8217;ve made my framework available on GitHub for use by anyone: <a href="https://github.com/Hydrotoast/MinDispatch/tree/master/edu/giocc/MinDispatch" title="MinDispatch">MinDispatch framework on GitHub</a>.</p>
<p><span id="more-979"></span></p>
<h2>The Chat Application Simulation Revisited</h2>
<p>Our objective once again is to simulate the events of a chat application by explicitly specifying a set of events.</p>
<p>However, we will no longer define the flow of data through our system for we have a framework which handles this. Instead, we can simply define the set of events and processes that our system must simulate.</p>
<h3>Events of a Chat Application</h3>
<dl>
<dt>User Arrival</dt>
<dd>Occurs when a user arrives to a room.</dd>
<dt>User Departure</dt>
<dd>Occurs when a user departs from a room.</dd>
<dt>User Message</dt>
<dd>Occurs when a user sends a message to a room.</dd>
</dl>
<p>We want to process such events by simply printing them to the standard output.</p>
<pre><code class="no-highlight">foo has entered the room.
bar has entered the room.
foo: hello, bar!
bar: hello, foo!
foo: goodbye, bar!
foo has left the room.</code></pre>
<p>Through the observations above, we should be able to quickly design our program in two steps: model the events and process the events. Finally, we can dispatch events to test our design.</p>
<h3>Prerequisites</h3>
<p>First, you must <a href="https://github.com/Hydrotoast/MinDispatch/tree/master/edu/giocc/MinDispatch" title="MinDispatch">download the MinDispatch source</a> and package it with your current project which we will name <strong>ChatEventMachine</strong>.</p>
<h2>Coding the Simulation with MinDispatch</h2>
<p>We modeled our events and now we must consider how they will be routed by the event dispatcher of our framework.</p>
<div id="attachment_1044" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/MinDispatcher.png"><img src="http://www.giocc.com/wp-content/uploads/MinDispatcher.png" alt="Event Dispatcher" title="Event Dispatcher" width="360" height="227" class="size-full wp-image-1044" /></a><p class="wp-caption-text">Event Dispatcher</p></div>
<p>Since the dispatcher in our framework is responsible for routing events to their handlers, we start by identifying the necessary events, as we have above. Afterwards, we simply need to register each event with a respective event handler which processes it accordingly.</p>
<h3>First: Model the Events</h3>
<p>Let&#8217;s begin with modeling the events specified:</p>
<pre><code class="java">public class ChatEventMachine {
	private static class User {
		public String name;

		public User(String name) {
			this.name = name;
		}
	}

	private static class UserArrival extends Event {
		public User user;

		public UserArrival(User user) {
			this.user = user;
		}
	}

	private static class UserDeparture extends Event {
		public User user;

		public UserDeparture(User user) {
			this.user = user;
		}
	}

	private static class UserMessage extends Event {
		public User user;
		public String message;

		public UserMessage(User user, String message) {
			this.user = user;
			this.message = message;
		}
	}
}</code></pre>
<p>In our code above, we use an auxiliary <code>User</code> class which helps us encapsulate data associated only with the user, such as the user&#8217;s name.</p>
<blockquote><div class="icon idea"></div>
<p>User classes frequently associate with many more properties such as a unique user id, password, email and more depending on the application.</p></blockquote>
<p>Each of our events provides sufficient data for its handler to use. The constructors of the new events, though unnecessary, make dispatching easier, as will be seen later in this tutorial.</p>
<h3>Second: Route Events to Handlers</h3>
<p>After our events have been successfully modelled, we can easily route events to handlers through the event dispatcher within our framework.</p>
<pre><code class="java">public class ChatEventMachine {
	// Event models

	public static void main(String[] args) {
		EventDispatcher dispatcher = new EventDispatcher();

		dispatcher.registerChannel(UserArrival.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserArrival arrival = (UserArrival)evt;

				System.out.println(arrival.user.name + " has entered the room.");
			}
		});

		dispatcher.registerChannel(UserDeparture.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserDeparture departure = (UserDeparture)evt;

				System.out.println(departure.user.name + " has left the room.");
			}
		});

		dispatcher.registerChannel(UserMessage.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserMessage message = (UserMessage)evt;
				String userMessage = String.format("%s: %s", message.user.name, message.message);
				System.out.println(userMessage);
			}
		});
	}
}</code></pre>
<p>In the code above, we simply register a set of channels by mapping an event subclass to a new handler through a <code>HashMap</code> according to our framework implementation. Note here that I simply create a new, unique event handler for each event since each event requires a unique process.</p>
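<p>To make the channel idea concrete, here is a minimal sketch of a dispatcher that keeps its handlers in a map keyed by event type. It is written in JavaScript for brevity and is not the MinDispatch source: <code>SketchDispatcher</code>, its method bodies and the use of <code>Map</code> are assumptions made purely for illustration.</p>
<pre><code class="javascript">// Hypothetical dispatcher: maps an event constructor to one handler.
function SketchDispatcher() {
	this.handlers = new Map();
}

SketchDispatcher.prototype.registerChannel = function(eventType, handler) {
	this.handlers.set(eventType, handler);
};

SketchDispatcher.prototype.dispatch = function(evt) {
	// Look up the handler by the event's constructor, much like a class key.
	var handler = this.handlers.get(evt.constructor);
	if (handler) handler(evt);
};

// Illustrative event type carrying only a user name.
function UserArrival(name) {
	this.name = name;
}

var log = [];
var dispatcher = new SketchDispatcher();
dispatcher.registerChannel(UserArrival, function(evt) {
	log.push(evt.name + ' has entered the room.');
});
dispatcher.dispatch(new UserArrival('foo'));

console.log(log[0]); // foo has entered the room.</code></pre>
<p>Keying the map by type keeps the routing decision out of the handlers: each one only ever sees the kind of event it registered for.</p>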
<h3>Third: Dispatch Events to Simulate</h3>
<p>Since our event dispatcher has now been set up at this point, we can start dispatching a set of events to simulate a chat application. This marks the beginning of control starting from the event dispatcher.</p>
<div id="attachment_1045" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/MinDispatcherControl.png"><img src="http://www.giocc.com/wp-content/uploads/MinDispatcherControl.png" alt="Dispatcher with Control" title="Dispatcher with Control" width="360" height="227" class="size-full wp-image-1045" /></a><p class="wp-caption-text">Dispatcher with Control</p></div>
<p>In this simple example, I will hard-code dispatched events by manually writing the constructors for the necessary objects and dispatching them through the framework&#8217;s dispatcher.</p>
<pre><code class="java">public class ChatEventMachine {
	// Event models

	public static void main(String[] args) {
		// Dispatcher and handler definitions

		User foo = new User("foo");
		User bar = new User("bar");
		dispatcher.dispatch(new UserArrival(foo));
		dispatcher.dispatch(new UserArrival(bar));
		dispatcher.dispatch(new UserMessage(foo, "hello, bar!"));
		dispatcher.dispatch(new UserMessage(bar, "hello, foo!"));
		dispatcher.dispatch(new UserMessage(foo, "goodbye, bar!"));
		dispatcher.dispatch(new UserDeparture(foo));
	}
}</code></pre>
<p>At this point, our chat application simulator is complete. When we execute the main method of this program, it prints the following:</p>
<pre><code class="no-highlight">foo has entered the room.
bar has entered the room.
foo: hello, bar!
bar: hello, foo!
foo: goodbye, bar!
foo has left the room.</code></pre>
<p>The full source at the end of our tutorial:</p>
<pre><code class="java">public class ChatEventMachine {
	private static class User {
		public String name;

		public User(String name) {
			this.name = name;
		}
	}

	private static class UserArrival extends Event {
		public User user;

		public UserArrival(User user) {
			this.user = user;
		}
	}

	private static class UserDeparture extends Event {
		public User user;

		public UserDeparture(User user) {
			this.user = user;
		}
	}

	private static class UserMessage extends Event {
		public User user;
		public String message;

		public UserMessage(User user, String message) {
			this.user = user;
			this.message = message;
		}
	}

	public static void main(String[] args) {
		EventDispatcher dispatcher = new EventDispatcher();

		dispatcher.registerChannel(UserArrival.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserArrival arrival = (UserArrival)evt;

				System.out.println(arrival.user.name + " has entered the room.");
			}
		});

		dispatcher.registerChannel(UserDeparture.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserDeparture departure = (UserDeparture)evt;

				System.out.println(departure.user.name + " has left the room.");
			}
		});

		dispatcher.registerChannel(UserMessage.class, new Handler() {
			@Override
			public void dispatch(Event evt) {
				UserMessage message = (UserMessage)evt;
				String userMessage = String.format("%s: %s", message.user.name, message.message);
				System.out.println(userMessage);
			}
		});

		User foo = new User("foo");
		User bar = new User("bar");
		dispatcher.dispatch(new UserArrival(foo));
		dispatcher.dispatch(new UserArrival(bar));
		dispatcher.dispatch(new UserMessage(foo, "hello, bar!"));
		dispatcher.dispatch(new UserMessage(bar, "hello, foo!"));
		dispatcher.dispatch(new UserMessage(foo, "goodbye, bar!"));
		dispatcher.dispatch(new UserDeparture(foo));
	}
}</code></pre>
<h2>Conclusion</h2>
<p>I was able to write this code within ten minutes; consequently, we see the power of a simple, event-driven framework, namely MinDispatch, in quickly modelling an event-based system and handling its events accordingly.</p>
<blockquote><div class="icon idea"></div>
<p>A few people may notice that this approach is capable of modeling various live systems such as multiplayer gaming where a set of players may emit events to interact with the world e.g. opening chests, communicating with others and fight sequences.</p></blockquote>
<p>I purposely omitted a parsed file or dynamic input stream; however, I plan to continue this tutorial by extending our application to include a dynamic stream of events as shown below:</p>
<div id="attachment_1066" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.giocc.com/wp-content/uploads/MinDispatcherWithQueue.png"><img src="http://www.giocc.com/wp-content/uploads/MinDispatcherWithQueue.png" alt="Dispatcher With Event Queue" title="Dispatcher With Event Queue" width="360" height="449" class="size-full wp-image-1066" /></a><p class="wp-caption-text">Dispatcher With Event Queue</p></div>
<p>With the above diagram as a treat for the next post in this series, please look forward to reading about building a simple AI to respond to chat events!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.giocc.com/mindispatch-event-driven-framework-in-java-part-1.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
