The Tactical Amulet Extraction Bot: taeb

Showing posts with label taeb. Show all posts

Saturday, August 22, 2009

Planar - a TAEB AI

TAEB::AI::Planar is a NetHack AI, written as a plugin for TAEB. (TAEB the framework does not contain AI-specific logic itself; instead, it aims to be a general platform that can be used both by AIs and by other programs that want to interact with NetHack; for instance, it would be plausible to use TAEB to implement an Interhack-like program.) The first TAEB AI, historically, is the one that is now called Behavioral (and which was originally just part of TAEB itself before it was split out); most TAEBs that you will see on nethack.alt.org, for instance, run that AI. Planar is a newer attempt to create an AI, by me, Alex Smith (ais523) and Stefan O’Rear (sorear); at present, it’s used by the bots TAEB523 and sortaeb523 (which run from much the same codebase, but are generally configured differently). Planar’s best score so far is 34179, in the large room at the end of the second level of Sokoban, which was scored locally on my computer; it died walking over a cockatrice corpse when blind and gloveless, the dangers of which it hadn’t been told about.

Planar treats NetHack as a pathfinding exercise; just as a computer would use a pathfinding algorithm to determine the shortest route from one room to another, Planar uses pathfinding algorithms to determine the best ‘route’ from its current situation to its eventual goal. The AI is based around resources, which are things that the AI can spend, can use, or has to be careful not to run out of (for instance, gold or hitpoints), threats, which are things that can be mitigated or fixed, but could cause plans to be more dangerous or which prevent them working correctly if not fixed (for instance, monsters or a trap on the current square), and plans, which are specific goals that it aims for, along with an idea of how to accomplish them.

Here is how Planar goes about deciding what to do on any given turn. (Note that although Planar remembers what it was thinking on previous turns, it recalculates most things every turn to allow for new information, rather than blindly continuing with its previous plan.) Each stage of the calculation is accompanied by a picture showing an actual situation from a game of NetHack that Planar has played (in fact, they’re all from the same game), and the text explains how Planar went about dealing with it.

Threats

The ‘start’ of Planar’s main loop (that is, when it starts dealing with the current turn, rather than the previous turn), is checking for threats that might make plans more risky. In the picture above, TAEB (the @ sign which appears inverse-video in this screenshot because it has a blinking cursor on it) has decided that it wants to eat the corpse (the % sign) on the ground, probably because it’s relatively hungry and wants to get to the corpse before it rots. Although in theory it could walk directly to the corpse and eat it, there are two potential problems; to the west is a nymph (a lowercase n) who would steal its items if given a chance, and to the east is a golem (an apostrophe) who, being a hostile monster, wants to beat the poor TAEB up. (Incidentally, Planar always asks for more information, if it’s available and doesn’t cost a turn, in order to analyse the situation as well as possible; for instance, if the material the golem was made of (relevant because it determines how dangerous it is) were undeducible from its colour on-screen, it would ask the framework to request NetHack to give more details, in this case by sending a farlook command to examine it remotely in detail.) The two monsters here are threats; the nymph is a threat to the TAEB’s items (which will be a source of resources such as nutrition, damage-potential, and armour), whereas the golem will be a threat to the TAEB’s hitpoints (a resource in themselves). Note that all threats, as well as everything else in Planar, are quantified so that it knows their exact value. (There are two ways to represent the value of things inside Planar; one is as a list of amounts of resources, the other is as a number that attempts to calculate the total value of all the resources in question by using exchange rates. These alter according to the amounts of the resources in question; for instance, normally one hitpoint is considerably more valuable than one unit of nutrition, but that rate would be rather different if the TAEB were healthy but fainting from hunger.) In this case, it would estimate the amount of theft the nymph would likely carry out (and as any NetHack player knows, that’s generally more than you can easily replace); and it would calculate the amount of damage the golem could deal per turn in mêlée combat (taking the worst-case scenario; Planar is just as scared of the Random Number Generator as any human would be).

The threat-check will quickly try to work out how quickly, on average, the monsters could get to each square on the map, and assign a threat function to those squares that indicates how damaging the threats could be if we tried to route to those squares at a particular moment. This is shown in the tactical map above; colder colours represent safer squares, with warmer colours being squares that are more threatening. So, for instance, going to the west past the nymph without dealing with it first is likely to be suicidal, thus the whole area there is filled with magenta (the worst possible colour); likewise, going to the east past the golem without dealing with it first is also a bad idea. The amount of threat that actually matters depends on when we finish business on the square in question; the map above shows the amount of threat (together with the amount of tactical cost, which is insignificant on that diagram) on each square at the first moment we can reach it, but carrying through time-consuming plans on those squares would give the monsters more time to catch up. As a result, the nearest unexplored corridor is a safe blue, as the TAEB could outrun the monsters along it (golems are rather slow, and the TAEB has a head-start over the equally speedy nymph); but the second-nearest unexplored corridor ends in a red square, because although the TAEB could get away from the golem eventually, the golem would get a few hits in as it walked past. So, if it was going exploring, it would use the safer first path.

However, just now the TAEB isn’t exploring; it’s hungry. The corpse shows up as brown on the above map, but that map (which was generated from the tactical planning stage) is just taking into account the length of time the corpse takes to walk to; eating the corpse will also take some time, and the threat function will give a higher threat value for a plan involving eating the corpse once the time taken is known (which will be calculated in the strategic planning stage).

There’s one other important task carried out by threat checking; for each square on the map (to be precise, the current level; as routing for other levels is cached, any threat calculation for them would be ignored anyway), it tags the square with information on how to mitigate the threat or threats that can get there. So for instance, there will be a plan to Mitigate the golem (i.e. kill it, eliminate it, or drive it away somehow) in order to make it safer to eat the corpse. (After it’s killed the golem, it’ll likely see that the nymph is also problematic, and it will try to get rid of her too, probably by throwing sharp pointy objects at her. This is another outcome of the threat check; it’s rather dangerous to attack a nymph in mêlée if you can help it, and the squares further away from her would show a much lower risk as a result, meaning that projectiles would be favoured over melee weapons.)

The bottom of the screen shows the decisions that the other stages of the loop have made; strategic planning has decided that it’s still worth eating the corpse, that getting rid of the golem first is a good idea, and that mêlée is the best way to do it; and tactical planning has decided that the best way to route near the golem is to walk east of it until the square immediately to its west. The chain of plans involved can be seen on the penultimate line of the screen; the final (l) on the screen is added by the framework, which knows that pressing l is the correct way to walk to the east (whereas the AI would just have returned a “walk east” action to the framework).

Tactical planning

Tactics

Once all the threats have been placed on the map, the next step is to work out tactics for the turn. Again, this is done by working out how expensive (in terms of resources) it is to move to any given square on the map; but this time, it’s our own routing we worry about, rather than that of the monsters.

The diagram above shows our TAEB just after it’s killed another monster, again wanting to eat it. This time, the monsters around are less threatening; there’s just an iguana (the nearby colon) to worry about. Although this particular turn came to the same conclusion as the last one analysed (kill the monster so you can eat the corpse in peace), tactical planning does not care about what the TAEB eventually ends up doing, but rather, how best to get to each square on the map.

This is done in terms of tactical plans; a single tactical plan contains a suggestion of how to get from one square to another square, together with information on how risky it would be (taken from the threat map and from information on the cost of the action required). Plans in Planar are very specific; as opposed to Behavioral’s behaviours, which are few in number and general (such as FixStatus and Defend), a plan in Planar contains information about exactly what is to be done (such as ‘move from (12,10) to (13,10) by opening the door on (13,10) then walking there’, which would have a cost in time based on how much time it would likely take to force the door open). It would be entirely common for multiple plans to be considered; for instance, in that example with the door, Planar would also be considering kicking that door down, which could be faster if, say, it were approaching the door from the diagonal (as a broken doorway can be entered diagonally, but one with a door in can’t be).

Tactical planning is a form of routing, in physical NetHack space; however, unlike most other AIs, a large amount of AI logic is done during tactical planning. For instance, Behavioral has a behaviour to open doors so that they don’t block routing later; in Planar, opening doors is only done if they’re in the way. Likewise, Planar will push boulders (trying not to block corridors in the process) and even tunnel through walls if that’s necessary to go where it wants to go, or if that’s the fastest way; the tactical planning stage will have a plan all ready for any strategic plan that needs to move to the square, for whatever reason. The actual routing is done via Dijkstra’s algorithm, in order to gain routing information for the whole map at once.

The diagram above shows routing costs, as determined by tactical planning, to the whole level; as opposed to the previous example, TAEB523 now has a pickaxe, and thus can route into the walls if it wants to. Note that many of the nearby walls are considerably more expensive to route to than the corridors and rooms near them; but the distant walls have comparable costs to the nearby rooms. The reason for this is that digging nearby takes a sufficiently long time that the iguana will catch up to the TAEB and attack; but digging distantly would happen in the future, after the iguana had already been outrun.

At the end of the tactical planning stage, therefore, it’s possible to quickly and efficiently answer any query from the strategic planning stage about how to get somewhere, and how risky doing so would be. (Incidentally, in Planar, ‘risky’ is a technical term meaning ‘difficult, expensive, or both’.) As a result, huge numbers of strategic scenarios can be tried, to determine which one works best.

Strategic planning

Strategy

Strategic planning is the heart of Planar; threat checks and tactical planning are simply calculating information so that strategic planning can refer to it quickly, whereas strategic planning decides what, overall, the bot is actually going to do.

To start with, there are several standard plans that work on improving Planar’s resource situation; for instance, eating when hungry, collecting ammo, and picking up useful items. (An item is useful enough to be picked up if its value to us, say as a source of food or as a backup armour in case the one we’re already carrying around turns out to be cursed, is higher than the drawbacks, such as its weight.) This gradual improvement of the character is known as resource conversion; if the TAEB can manage a net gain in resources, it does so, ignoring the main plan for a while (and going for the largest gains first). Therefore, there’s no need for every major plan to take the details of things like feeding and shopping into account; it’ll happen automatically without a specific request to suppress it. If none of the resource-conversion plans can produce an improvement, though, the main plan takes over; this is a user-configurable overall goal that the bot is trying to accomplish. In the example above, for instance, the main plan is SlowDescent, which explores the dungeon from the top down, thoroughly exploring each level before going onto the next one.

In order to determine how to accomplish the goal in question, strategic planning basically does routing in plan-space, again using Dijkstra’s algorithm. In this case, the steps on the route are plans that make other plans possible. So, for instance, the chain of plans shown in the picture above are SlowDescent | by exploring [this level] | by exploring [a specific square] – by walking towards it; the final plan there is a tactical plan, the rest are strategic. In order to come to this decision, Planar will start by asking SlowDescent if it can be accomplished directly; the answer in this case is always ‘no’, because there is no associated action (it’s a “metaplan”, which is defined in terms of plans rather than in terms of actions). As a result, it’ll look for plans that make up parts of the slow descent, or make it easier to accomplish; in this case, the plans in question will be exploring level 1, exploring level 2, exploring level 3, and exploring level 4, in that preference order. (There’s no problem with more than one plan having the same preference; in that case, Planar would simply take the cheapest, leading to a “nearest first” exploration of the dungeon. However, SlowDescent is specifically trying to explore from the top downwards.)

Each of the level exploration plans are likewise metaplans, which suggest exploring particular locations on the level in question. Although Planar would like to explore on the first or second floors, if possible, those floors were apparently explored out at the time; so as the next-best option, it would consider the exploration spots on the third floor. (The picture above is showing the cache of exploration spots on the level, cached for speed; warm colours show squares in need of exploration, cool colours showing squares that have already been explored.) This time, all the exploration spots have the same preference level; so the least risky exploration is taken, which in this case, with no visible threats, is the nearest. (The tactical planning stage will already have the risk of reaching each of the exploration spots, as well as everywhere else in the dungeon, ready-calculated so that it can be given very quickly when requested; so despite the large number of exploration spots shown, strategic planning will not take much time.)

Before enacting the plan in question, various checks are done. Plans are skipped if they require more of any resource than is available to spend; so, for instance, exploring one square is normally fine, but if that square is next to a demon prince and we’re low on hitpoints, it will be considered far too risky in terms of hitpoints to attempt. (A different plan would be favoured, therefore; probably running away to carry out business in another part of the dungeon.) Likewise, something can be too risky in terms of anything else. The alternative plans suggested by threat check will also be considered, to see if eliminating the threat first and then carrying out the plan will be a better idea than just carrying out the plan directly. A plan can be suppressed by the previous turn’s plan, to avoid losing sight of a slightly-longer-term goal in favour of a short-term goal (so, for instance, it will unequip a shield in order to wield a two-handed digging tool, even though normally improving its armour would be more urgent than digging through a wall). Finally, a plan might not be enacted due to success measurement deciding that it is doomed to fail, or create an oscillation, based on observations rather than on known information.

Success measurement

Success

Once Planar has decided on a plan, it will find out what action is associated with it, and tell the framework that that’s the desired action for the turn. After the framework has processed the results of the action and asked for the next one, though, Planar does after-the-fact processing on the current turn before moving on to consider the next turn’s plan. In particular, it’s trying to determine whether the plan worked as expected or not, and whether it’s stuck in an oscillation.

Each plan comes with a test to determine whether or not it accomplished what it set out to do. For instance, a plan to open a door will check to see if the door is now open; if it isn’t, it will check to see if it was for a known reason (doors in NetHack sometimes randomly fail to open, which is not a problem), or if it was for an unknown reason (for instance, Planar doesn’t yet know it can’t open a door if it’s been forced into jackal form by a werejackal, but it can determine that it tried to open a door, it didn’t work, and it doesn’t know why). If a plan fails unexpectedly, it won’t be retried for some time, which increases sharply the more times the plan in question has failed in a row. (There’s an exception to this; if something happens to make Planar think that the situation has changed, for instance if a tile relevant to the plan in question changes glyph or one of the plans marked as part of or an alternative to the plan succeeds, it will try again instantly.) This happens for both strategic and tactical plans; as always, failure will simply cause it to seek alternatives. In this way, Planar can get around deficiencies in either its own, or the framework’s, understanding of NetHack; this ability to determine what works by experiment will helpfully aid it in handling unexpected situations (whereas Behavioral normally responds to such situations with an infinite loop as it tries a doomed action over and over again).

More subtly, a plan can seem to be working correctly, but leave TAEB stuck in an infinite loop; this is the dreaded “oscillation” that’s so common in NetHack-playing AIs. The picture above shows TAEB near a throne room (full of letters representing monsters); it looks like a good place to explore from afar, but upon approaching more closely, it’s possible to see that it’s full of monsters. Planar tends to be cautious around monsters, especially when there are alternative plans available; so it would walk away from the throne room, forget the location of the monsters (neither the TAEB framework nor Planar currently has any way to track the locations of monsters that are out-of-sight), walk towards it again, and repeat. The repeated indecision in which plan to accomplish is detected in success measurement; and as a result, the success measurement deliberately makes up its mind to try one plan or the other, by temporarily interfering with both the strategic planning and tactical planning stages to help force a particular decision. In the map above, therefore, most of the routing map is unexpectedly grey, rather than the usual blue; this is because success measurement has detected an oscillation and decided to allow only plans that move towards the throne room for a few turns (by causing routing to fail when moving away from it). As a result, the TAEB has decided to look at the items on the floor in the room instead; and has concluded that the safest way to do that is to slaughter the inhabitants of the throne room, picking them off at range using projectiles. (The messages at the top show that the previous turn, it threw something at a monster and killed it; it wants to do that again, but needs to reposition itself first.)

Sunday, June 7, 2009

Anatomy of a Step

Several people have asked to see more articles about TAEB's architecture. Even after working with the codebase for a year and a half, I am still very happy with the design. My previous attempts at writing a NetHack bot lasted only a month or two before collapsing under their limitations. However, I think we finally got it right with TAEB, so I enjoy sharing what we have created.

To introduce the important components of TAEB, I am going to walk through TAEB's "main loop". We call each iteration of the main loop a "step". Each step vaguely resembles iterations of NetHack's own main loop. Note that this is completely unrelated to NetHack's turn counter. Due to the speed system, command repetition ("search for twenty turns"), and even paralysis, steps do not correspond with turns. The only similarity is that both counters increase monotonically. Today, we care only about steps, not turns.

Input

The first thing TAEB does is read input from NetHack. This can involve reading from a socket (as in TAEB::Interface::Telnet) or reading from a pseudo-terminal (as in TAEB::Interface::Local). I will eventually write a post about the mechanics of simply deciding when NetHack is done sending data. It is not at all trivial. (update: here's that post)

NetHack prints a stream of characters. Since it provides a full text user interface with two-dimensional maps and colors, NetHack prints escape sequences. These escape sequences encode commands such as "change the pen color to red" (\e[31m) or "go to cell (5, 12)" (\e[12;5H). We run all input through Term::VT102 to parse and handle these escape sequences. Term::VT102 then lets us ask high-level questions like "what is the text of row 23?" or "what color is cell (59, 6)"? We have a subclass TAEB::VT that lets us ask additional questions, such as "is the text 'Hit return to continue:' present on the terminal?"

We now have something resembling what the player would see (a two-dimensional, colored block of text). The next task is to understand what is on the screen. Consider the following NetHack terminal:

As a human, you can figure out a lot of what is going on, even if you have never played NetHack. You can see that the player is in combat with a goblin. From the bottom lines, you can guess that the player is stuffed with food, and probably the character's name. If you have played NetHack, you can recognize several more things. The character is a wizard (evidenced by the Evoker title). There is a general store in the top left room. The character is currently on the bottom of the screen, with a goblin just north of him. You can identify the seven rooms of the level, and the features of the dungeon (like stairs, a fountain, and many doors).

Such analysis is the job of two components, TAEB::ScreenScraper and TAEB::World::Cartographer. The Cartographer analyzes the map to populate internal level and tile information, while the ScreenScraper handles other input (mostly English text). The ScreenScraper runs first, parsing messages to publish announcements. The rest of the system does not have to worry about the content of messages, each component just listens for particular high-level events.

Thankfully, English text appears in fixed places on the screen; TAEB never has to guess which text is prose and which represents the map. Most of the time, text only appears on the top and bottom lines. However, menus can appear in more places on the screen.

Here we have an "identify" menu. The ScreenScraper can detect that there is a menu by looking at the text preceding the cursor. If it is "(end)" or "(# of #)" then TAEB knows a menu is on the screen.

Menus are interesting because they demand input from the rest of the system. The ScreenScraper does not know what items should be identified — it has to ask the AI component. Since there can be many menus in a particular step (such as identifying several items, one at a time), the ScreenScraper has its own loop for parsing different kinds of input. The various kinds of input include menus, prompts ("Eat what?"), location requests ("To what position do you want to be teleported?"), and ordinary top-line messages.

The ScreenScraper and NetHack may communicate several times before the next step is started. What is vital is that the ScreenScraper leaves NetHack in "action mode". There are no unresolved prompts, menus, --More-- messages, etc. This means that the Cartographer can assume that the cursor is on TAEB. This invariant makes TAEB cope perfectly with having a different character glyph due to polymorph or invisibility. Other bots (such as "moomaster") have struggled with this, since they guessed that the white @ on the screen was the character.

Since the ScreenScraper leaves NetHack in action mode, the Cartographer knows that every character between lines 2 and 22 inclusive are the map. If there were a menu on screen, TAEB would get massively confused, thinking there are suddenly hundreds of monsters floating in a void on the right side of the screen.

The Cartographer looks at each cell on the virtual terminal to update each tile in TAEB's internal map. It uses cell glyph and color to determine what is on each tile. For example, a gray { is a sink.¹ The sink is mapped to a TAEB::World::Tile::Sink object. This sink object of course knows things every tile knows, such as how many times it has been stepped on by the character, what items are on the tile, and what engraving is on the tile. In addition, sinks know whether they have produced a black pudding, ring, or foocubus through kicking.

One curiosity is that the map is updated after we publish messages from the ScreenScraper. If TAEB moves and receives the message "You see here an opal ring", the ScreenScraper sees that and announces that there is an opal ring on the current tile. However, since the Cartographer has not run, the current tile has not been updated yet, so the item would be incorrectly added to the previous tile. The crude solution we currently use is to freeze the publisher until after the map has been updated. We are not yet sure of a better solution to this.

Output

At this point, the map has been updated, all input from NetHack has been parsed, and the appropriate announcements have been published. We then redraw the screen for the users watching the bot play. We also check if the user typed a key. TAEB has many debug commands, such as ~ to open up an interactive REPL to inspect and change TAEB's internal state.

Finally, it is time to involve the AI. TAEB simply asks the AI "what now?" This can involve arbitrarily complex calculations. One of my favorite features of TAEB's design is that its AI is pluggable. TAEB requires only a few methods be implemented by the AI, and we are working to trim that down to just "what now?"

There are currently two AIs being developed: Behavioral (by most of the core team) and Planar (by ais523). Both are worthy of many posts. We hope to entice more developers to work on TAEB AIs! One of our future projects will be Interhack on top of TAEB. The intelligence of a TAEB::AI::Interhack would not actually be "artificial", of course.

Since TAEB is a framework, the AI will inevitably be tightly-bound to TAEB (though, as explained, the converse is not true). The AI is expected to call all sorts of methods on TAEB objects, such as "am I currently blind?" (TAEB->is_blind), "do I know the identify spell?" (TAEB->find_spell("identify")), "is there a fountain on this level?" (TAEB->current_level->has_type("fountain")), and so on. We think this is a huge boon to developers seeking to write NetHack bots — they can focus solely on the interesting bits of artificial intelligence. We handle the programmatic NetHack.

The AI tells TAEB what to do each step through TAEB::Action objects. These abstract away the details of interacting with prompts and menus. The AI can say:

    sub next_action {
        # ...
        if (my $lizard_corpse = TAEB->find_item("lizard corpse")) {
            return TAEB::Action::Eat->new(
                food => $lizard_corpse,
            );
        }
        # ...
    }

The action will take care of responding to the "Eat what?" prompt with $lizard_corpse's inventory letter. The action will also respond to unexpected prompts, such as "There is a fox corpse here; eat it?". This declarative nature pervades TAEB's design to great effect.

Finally, we send the keystrokes for the current action (such as "e" for eat) to NetHack.

The action will participate in the next iteration of the main loop to respond to prompts and listen for events caused by the action. For example, if TAEB receives the message "You stop eating the lizard corpse", then the action will know not to remove the lizard corpse from the inventory.

Conclusion

The design of TAEB's main loop has had a significant impact on its (relative) longevity. The design of handling all of the input from NetHack as soon as possible drove much of the system's design in very positive ways. It encouraged the reification of Actions, which has been immeasurably useful. It even encouraged the addition of the pubsub system and its later enhancement to announcements. Though there are still some lingering quirks caused by this main loop design (such as teleportation traps), none seem insurmountable.

Footnotes

TAEB remaps some dungeon features to avoid conflicts; in core NetHack, sinks are gray #, but so are corridors. Eventually we may be able to remove some remappings due to increasing Cartographer sophistication, but there's no rush. TAEB does not change too many characters as to render its NetHack screens unfamiliar. (back)

Saturday, March 14, 2009

TAEB 0.03

registering upload with PAUSE web server
POSTing upload for TAEB-0.03.tar.gz
PAUSE add message sent ok [200]

This is the first version that is actually going to be on the CPAN. I decided to put up with Module::Install for a little while longer. :)

TAEB, pubsub, and announcements

TAEB is a decently-sized, componentized program. Components need to be able to communicate with each other. For example, the Senses component (which tracks the state of TAEB's character) needs to know when we are indebted to a shopkeeper so that it can respond to the AI when it asks if TAEB is in debt, and for how much. The Cartographer component (which tracks the state of the dungeon map) needs to know when TAEB becomes indebted so it can mark the current room as a shop. In general, we want any component to be able to listen for any update. This will let us remain flexible and extensible; the completely-separate AI can listen for any update that the framework can.

We use the publish-subscribe pattern to control this complexity. Pubsub decouples those who generate updates (publishers) from those who listen for updates (subscribers). TAEB has been using pubsub for a long time now (since 2008-01-06, according to darcs trackdown). To publish a message, anything in the program can call TAEB->enqueue_message("foo" => @arguments). This will call the msg_foo method on all subscribers with arguments @arguments. We have a component, Publisher, that acts as message broker.

Pubsub has been a great tool for letting TAEB understand messages from NetHack. We have a component that is devoted entirely to figuring out what the current state of the screen is telling us: the ScreenScraper. The ScreenScraper deals mostly with transforming characters on the screen into published messages. For example, when the NetHack prints the message "You owe (somebody) (some amount of) zorkmids.", the ScreenScraper publishes a debt message with one argument: the number of gold pieces TAEB owes. The Senses has a msg_debt method that stores this amount of debt in an attribute. The Cartographer has a msg_debt method that floodfills the current room's tiles with the shop bit. The ScreenScraper does not need to know who cares about debt. The ScreenScraper publishes a lot of messages that no component subscribes to yet, and that's okay!

I'm happy with this, but there is room for improvement. Simple methods are not great message handlers. There are painful bugs lurking when subscriptions interact with inheritance. Suppose the AI defines a generic "go to a specific tile" behavior. This behavior would need to handle walked so it can track TAEB's progress. This GotoTile behavior is then subclassed to produce behaviors such as GoUpstairs and GotoCorpse. GotoCorpse might need to handle walked so it can age the corpse (so that TAEB doesn't eat rotten corpses). When the publisher sends the walked message to the GotoCorpse behavior, it does not invoke GotoTile's msg_walked method, because GotoCorpse didn't invoke it. While you may argue that GotoCorpse should have known to invoke GotoTile's msg_walked as that is part of its public interface, it certainly sucks for usability. It's especially painful when GotoTile begins subscribing to walked months after GotoCorpse is written!

The potential fixes for this are easy. The one I like best is providing some "subscribe to a message" sugar. Where we previously wrote sub msg_walked { ... }, we would now write message walked => sub { ... }. This sugar would handle publishing to parent classes if there are any. It would still use method calls behind the scene; message would install a msg_walked method. This gives us maximum flexibility. A subscriber could define an unusual message that would only conditionally be published to its parents.

The arguments we pass to each method could be improved as well. It's not immediately apparent what arguments the msg_debt method would receive. Currently we pass the amount of debt. However, we should also be providing the name of the shopkeeper TAEB is indebted to. This would help the Cartographer resolve ambiguities when there are multiple rooms and shops in sight. Should we rewrite every msg_debt method (including those potentially written by third-party, unknown AI hackers) to take named parameters? Should we pass in the shopkeeper's name as the second parameter? Should we pass in the name as the first parameter? After all, the name does come first in the message NetHack prints. We could provide one positional parameter (amount) and one named parameter (shopkeeper). Hey wait, it would be useful to also pass which items we're buying as well...

The best answer is to make each message an object. We would have a TAEB::Message::Status::Debt class with attributes amount and shopkeeper. This class could have a method to ask the inventory what items are currently in TAEB's shopping cart. If we're feeling cute, we could even overload this message to stringify to the amount of debt.

Pubsub with objects as messages is generally called Announcements. The concept of announcements is more general than pubsub. Announcements lets subscribers of a message communicate with the other subscribers, and even with the publisher.

Currently, when NetHack presents TAEB with a menu to select which items to pick up, TAEB will ask the AI whether it wants to pick up each item. The AI is asked about the item without the context of the other items that it can pick up. This really sucks! If TAEB is toeing the burden line, it needs to be pretty strict about what items it will pick up. This means it may refuse to pick up a useful item because there just might be an even more useful item further down the list.

Instead, TAEB should publish a "pick up items" announcement. This announcement, TAEB::Message::Query::PickupItems, would have the list of items. Subscribers would select which items they want by invoking methods on the announcement. When all subscribers have had a chance at it, the ScreenScraper would select the items in NetHack's menu accordingly. The selection doesn't have to be binary either; each subscriber could assign a numeric "desire" to each item. The ScreenScraper would then select items that have a sum desire greater than some cutoff (which would be another decision that some subscriber could set). Using announcements would better decouple the AI from the framework.

This is not the first time that turning plain strings into classes has been a major improvement for TAEB. Previously, the AI would return a string, a NetHack command, as the action to perform next. We then reified actions into classes. Letting the current action subscribe to messages, and respond to NetHack prompts, vastly improved TAEB's interactions with NetHack. Problems with unexpected prompts (such as "Eat this corpse on the ground?") quickly and completely vanished. Actions subscribing to messages lets us handle ambiguous messages better; if we just applied a unicorn horn, then we can figure out that "You feel sick." means the unicorn horn was cursed and TAEB can mark it so.

TAEB does not have announcements yet, but that's my next big project. I'm excited by the many possibilities here.

Sunday, March 1, 2009

Where's 0.01?

Due to an issue with Module::Install, the 0.01 tarball I uploaded to CPAN last night was empty. I've long disliked Module::Install, so I'm going to get rid of it in favor of a plain ExtUtils::MakeMaker or something.

Update June 3rd, 2009: The issue with Module::Install was that it generates an empty tarball when there's no MANIFEST. More recent versions of MI are thankfully very reluctant to do that.

Saturday, February 28, 2009

0.01

registering upload with PAUSE web server
POSTing upload for TAEB-0.01.tar.gz
PAUSE add message sent ok [200]

Friday, February 27, 2009

Code stats

TAEB's AI is pluggable. Most development happens on the Behavioral AI (though there exists another actively-developed AI called Planar by ais523). Since Behavioral is also considered TAEB, I've included its stats here too. A core stat is labeled with [c] and a behavioral AI stat is labeled with [b].

Logical lines of code: 12,299 = 10,077 [c] + 2222 [b]
Commits: 4699 = 4536 [c] + 163 [b]

Sartak commits: 2918 = 2817 [c] + 101 [b]
doy commits: 1029 = 1005 [c] + 24 [b]
sorear commits: 252 = 220 [c] + 32 [b]

Committers: 14 (Sartak, doy, sorear, Sebbe, arcanehl, sawtooth, Jerub, ais523, dho, futilius, bd, Zaba, toft, HanClinto)

.pm files: 164 = 126 [c] + 38 [b]
tar size: 1020K = 844K [c] + 176K [b]

Pretty pictures of TAEB development

I ran Don Stewart's darcs-graph data visualization tool over the main TAEB repositories today. Provides a nice picture of the "burstiness" of TAEB development:

Note: TAEB-AI-Behavioral was part of TAEB before 2009-01-24.

Modules written and improved for TAEB

While working on TAEB, the authors have worked on several other projects to add features and fixes, as well as start new projects. Here's a list of new modules written by TAEB hackers to make TAEB better:

MooseX::Singleton
Sartak rewrote MooseX::Singleton from its unreleased "around new" state; others have built upon it. TAEB has since moved to MooseX::ClassAttribute due it be being faster and better.
MooseX::Role::Matcher
doy wrote MooseX::Role::Matcher as a generalization of old item match code.
Tie::Handle::TtyRec
Sartak wrote this to give TAEB ttyrec output without having to worry about the ttyrec format.
Games::Mastermind::Cracker
Sartak wrote this so that TAEB wouldn't have to use his original C program to open the castle drawbridge.
Log::Dispatch::Twitter
Sartak wrote this to automatically tweet TAEB errors and TAEB deaths.
Log::Dispatch::Channels
doy wrote this to give TAEB an arbitrary number of logfiles with no fuss. Simply writing to a new log name creates a new logfile.
NetHack::Item
NetHack::Menu
NetHack::FOV
NetHack::Monster::Spoiler

All written for expressly TAEB. They've been factored out so that they could have sane test suites and be useful to other projects. NetHack::Item in particular is a gargantuan redesign of TAEB's old item code to fix a number of flaws.

And now a list of modules improved by TAEB hackers for TAEB:

Moose
Sartak and doy have contributed a lot to Moose just for TAEB. (TAEB was Sartak's first Moose application, even)
Devel::REPL
Sartak added quite a few plugins to make TAEB's debug REPL better.
IO::Socket::Telnet
Sartak wrote the original version for Interhack. Added callbacks and made some improvements on day one of TAEB development.
IO::Pty::Easy
doy wrote the original version for Interhack. Made many fixes found through constant use by TAEB.
Continuity
Sartak made the debug logging pluggable to put debug messages from Continuity into TAEB's logfiles.
PadWalker
doy submitted a patch to PadWalker (rt.cpan.org #41710) to fix a crashbug tickled a lot in TAEB's REPL.

0.01 release

Since the 0.01 release of TAEB is forthcoming, I figured I'd put together a place for us to goof around.

The Tactical Amulet Extraction Bot