Welcome to CommAI-env’s documentation!


CommAI-env models a simulation in which a Learner communicates with an Environment. The Environment plays the role of a teacher: it asks the Learner to perform tasks and rewards it when it does so. Tasks are performed through natural language communication, the same way the instructions are given. The communication itself travels over a low-level signal in which characters are encoded as bit sequences; the InputChannel and OutputChannel objects in the Environment are responsible for encoding and decoding these sequences. The Learner and the Environment exchange exactly one bit at a time. Since one of them should have nothing to say while the other is speaking, a particular bit sequence is reserved to represent “being silent”. The whole competition is centered on assigning meaning to the bit sequences that encode natural language commands.
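As a rough illustration of characters travelling as bit sequences, here is a minimal sketch. It assumes 8 bits per character and a space as the silence symbol; both are assumptions made for this example, and the actual InputChannel and OutputChannel may use a different encoding.

```python
# Assumptions for illustration only: 8 bits per character and a space as
# the "silence" symbol. The real InputChannel/OutputChannel may differ.
BITS_PER_CHAR = 8
SILENCE = ' '


def encode(text):
    """Turn a string into a flat list of bits, most significant bit first."""
    bits = []
    for ch in text:
        code = ord(ch)
        for shift in range(BITS_PER_CHAR - 1, -1, -1):
            bits.append((code >> shift) & 1)
    return bits


def decode(bits):
    """Turn a flat list of bits back into a string, ignoring a partial tail."""
    chars = []
    for start in range(0, len(bits) - BITS_PER_CHAR + 1, BITS_PER_CHAR):
        code = 0
        for bit in bits[start:start + BITS_PER_CHAR]:
            code = (code << 1) | bit
        chars.append(chr(code))
    return ''.join(chars)


message = 'say hi'
signal = encode(message)        # one bit of this list is sent per time step
roundtrip = decode(signal)
silence_bits = encode(SILENCE)  # pattern a silent participant would repeat
```

Under these assumptions a participant with nothing to say would keep repeating silence_bits, so the other side decodes a stream of spaces.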

The communication between the Learner and the Environment is handled by the Session, which forwards the bits produced by each of the participants and the rewards given to the Learner. For each bit sent, the Session blocks waiting for the next bit to be sent back. The cumulative rewards and time steps are recorded for performance evaluation.
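The exchange described above can be pictured as a simple loop. The sketch below is not the actual Session implementation: ToyEnvironment, AlwaysOne, run_session and the (bit, reward) return convention are hypothetical names and choices made only to show the one-bit-per-step rhythm and the reward bookkeeping.

```python
class ToyEnvironment(object):
    """Hypothetical stand-in: rewards the learner whenever it sends a 1."""

    def next(self, learner_bit):
        reward = 1 if learner_bit == 1 else None
        return 0, reward  # (bit sent to the learner, reward or None)


class AlwaysOne(object):
    """Hypothetical learner that ignores its input and always answers 1."""

    def next(self, bit):
        return 1

    def reward(self, value):
        pass


def run_session(env, learner, max_steps):
    """Exchange one bit per step and accumulate the rewards handed out."""
    total_reward = 0
    env_bit = 0
    for _ in range(max_steps):
        learner_bit = learner.next(env_bit)      # wait for the learner's bit
        env_bit, reward = env.next(learner_bit)  # wait for the environment's bit
        if reward is not None:
            learner.reward(reward)               # forward the reward
            total_reward += reward
    return total_reward, max_steps


total, steps = run_session(ToyEnvironment(), AlwaysOne(), max_steps=5)
```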

A Learner is any object that implements two methods: next, which receives a bit and returns the Learner’s next bit, and reward, which informs the Learner that it has received a reward.
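A minimal sketch of such an object follows. The class name EchoLearner and its echoing strategy are invented for illustration; only the next and reward method names come from the description above.

```python
class EchoLearner(object):
    """Hypothetical learner that sends back whatever bit it last received."""

    def __init__(self):
        self.total_reward = 0

    def next(self, bit):
        # Called once per time step with the environment's bit;
        # returns the learner's next bit (0 or 1).
        return bit

    def reward(self, value):
        # Called whenever the environment hands out a reward.
        self.total_reward += value


learner = EchoLearner()
replies = [learner.next(b) for b in (0, 1, 1, 0)]
learner.reward(1)
```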

The Environment, on the other hand, executes one Task at a time, chosen by a Scheduler. A Task is defined by a set of messages and rewards that the Environment delivers to the Learner in reaction to different kinds of Events. The possible events are described thoroughly in Event Handlers. To capture these events, the Environment registers Triggers with an EventManager. A Trigger consists of three things: a type of event, a condition that filters events by their specific details, and a callback function that is invoked when a matching Event arises.

A function that handles an event can either update internal variables to keep track of some information, or interact with the Learner by setting the message sent through the Environment’s OutputChannel, setting the reward that ends the current task, or both. If a reward is given together with a message, the Environment first sends the whole message and then rewards the Learner just before switching to the next task. During this “extra time” no other events are processed. Events can also be handled concurrently, which means that different, conflicting messages (and rewards) could be sent to the Learner. Such conflicts are resolved by assigning a priority to each message.
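To make the Trigger idea concrete, here is a small sketch of the pattern. It does not reproduce the real EventManager API: Trigger as a namedtuple, MessageReceived, ToyEventManager, resolve, and the rule that the highest priority number wins are all assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical shape: a Trigger bundles an event type, a filter and a callback.
Trigger = namedtuple('Trigger', ['event_type', 'condition', 'callback'])


class MessageReceived(object):
    """Hypothetical event: the learner finished sending a message."""

    def __init__(self, message):
        self.message = message


class ToyEventManager(object):
    def __init__(self):
        self.triggers = []

    def register(self, trigger):
        self.triggers.append(trigger)

    def raise_event(self, event):
        # Invoke every trigger whose type and condition match the event.
        results = []
        for t in self.triggers:
            if isinstance(event, t.event_type) and t.condition(event):
                results.append(t.callback(event))
        return results


def resolve(candidates):
    """Pick the (priority, message, reward) tuple with the highest priority."""
    return max(candidates, key=lambda c: c[0]) if candidates else None


manager = ToyEventManager()
manager.register(Trigger(MessageReceived,
                         lambda e: e.message == 'hello',
                         lambda e: (1, 'well done', 1)))
manager.register(Trigger(MessageReceived,
                         lambda e: True,
                         lambda e: (0, 'keep trying', None)))

# Both triggers fire on 'hello'; the higher-priority response is kept.
winner = resolve(manager.raise_event(MessageReceived('hello')))
fallback = resolve(manager.raise_event(MessageReceived('bye')))
```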

Finally, a Task can run within a World. A World is ultimately composed of the same elements as a Task: state variables and Triggers, and it can also interact with the Learner. Its purpose is to keep states and behaviors consistent across tasks. A Task can access the state variables of the World and listen for changes to them. The World, on the other hand, cannot access the Task that is running in it.
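The one-way relationship between a Task and its World can be sketched with a plain observer pattern. ToyWorld, ToyTask and their methods are hypothetical and only illustrate the direction of access: the task holds a reference to the world, while the world only knows opaque listener callbacks.

```python
class ToyWorld(object):
    """Hypothetical world: holds shared state and notifies listeners."""

    def __init__(self):
        self._state = {}
        self._listeners = []

    def get(self, name):
        return self._state.get(name)

    def set(self, name, value):
        self._state[name] = value
        for listener in self._listeners:
            listener(name, value)

    def listen(self, callback):
        self._listeners.append(callback)


class ToyTask(object):
    """Hypothetical task: reads the world and reacts to its changes."""

    def __init__(self, world):
        self.world = world
        self.seen = []
        world.listen(self.on_world_change)

    def on_world_change(self, name, value):
        self.seen.append((name, value))


world = ToyWorld()
world.set('position', 0)  # state that exists before any task starts
task = ToyTask(world)
world.set('position', 1)  # the running task is notified of this change
```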

Running the competition

To run the competition, create a configuration file for the task scheduling, for example, by copying the sample file:

cp tasks_config.sample.json tasks_config.json

Then, run it with:

python run.py tasks_config.json

Testing the competition

If you are running Python 2.7 or later, you can run all the unit tests by going to the src directory and running:

python -m unittest discover


To regenerate a local copy of this documentation, run one of the following, depending on whether you want HTML or PDF output:

cd src/docs
make html


cd src/docs
make latexpdf

This documentation can be made publicly available at the URL https://facebookresearch.github.io/CommAI-env/ simply by checking out the GitHub repo and running the following command at the root of the project:

make gh-pages


Logging what has been going on is a useful debugging tool. You can log anything from your tasks as follows:

  1. Add import logging to the top of the file

  2. Create your logger (possibly inside the __init__ method):

    self.logger = logging.getLogger(__name__)

    Strictly speaking, you could pass any string to getLogger as the name of your logger. Passing __name__ uses the name of the current module, which is clean and saves you from having to think of a name.

  3. Whenever you need to log something within your task you can do:

    • self.logger.debug("spam, spam, spam, eggs, bacon and spam") to log at the DEBUG level
    • self.logger.info("spam, spam, spam, eggs, bacon and spam") to log at the INFO level
    • self.logger.warning("spam, spam, spam, eggs, bacon and spam") to log at the WARNING level
    • self.logger.error("spam, spam, spam, eggs, bacon and spam") to log at the ERROR level

    The outputs of the loggers are saved to the files errors.log (only messages at the ERROR level), info.log (messages with INFO, WARNING and ERROR level) and debug.log (all messages).
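One way to obtain this kind of per-level split with the standard logging module is sketched below. This is not necessarily how CommAI-env configures its loggers: the helper configure_logging and the use of a temporary directory are assumptions for this example, and only the three file names come from the text above.

```python
import logging
import os
import tempfile


def configure_logging(log_dir):
    """Attach three file handlers that split messages by level."""
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)  # let every message reach the handlers
    handlers = []
    for filename, level in (('errors.log', logging.ERROR),
                            ('info.log', logging.INFO),
                            ('debug.log', logging.DEBUG)):
        handler = logging.FileHandler(os.path.join(log_dir, filename))
        handler.setLevel(level)  # handler keeps this level and above
        root.addHandler(handler)
        handlers.append(handler)
    return handlers


log_dir = tempfile.mkdtemp()  # throwaway directory for the demonstration
handlers = configure_logging(log_dir)

logger = logging.getLogger(__name__)
logger.debug('only in debug.log')
logger.info('in info.log and debug.log')
logger.error('in all three files')

for handler in handlers:
    handler.flush()
```

Each handler filters by its own level, so a single call such as logger.error lands in every file whose threshold it meets.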
