TraDaG  1.0
Training Data Generator
Getting Started

This page provides a quick entry point to start coding with TraDaG.

The Wrapper Way

The CVLDWrapper class provides a comfortable interface for the most common use cases of the CVLD members. Simply construct an object of the class, use the various setter functions to define your parameters, and start generating scenes. If the functions provided by this wrapper are not enough for you, please consider the generic way.

To construct the wrapper object, you need to provide the absolute or relative path to your data set. This directory is expected to contain the subdirectories depth, rgb and label, which contain the depth, color and label images, respectively. For TraDaG to know which depth, color and label images belong together, they need to have the same file name, e.g. the files depth/myscene.png, rgb/myscene.png and label/myscene.png will be interpreted to represent the same scene and will be used together. Files without an equivalent in both of the other directories will not be considered as a valid scene and will be ignored.
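The matching rule described above can be sketched as a set intersection over file names. This is only an illustration of the rule, not TraDaG's implementation; the helper completeScenes is hypothetical and assumes that the file names of the three subdirectories have already been collected.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of TraDaG): given the file names found in
// depth/, rgb/ and label/, return the names that occur in all three.
// Names missing from any one directory are dropped, mirroring the rule
// that incomplete scenes are ignored.
std::vector<std::string> completeScenes(const std::set<std::string>& depth,
                                        const std::set<std::string>& rgb,
                                        const std::set<std::string>& label) {
    // std::set iterates in sorted order, as std::set_intersection requires.
    std::vector<std::string> depthAndRgb;
    std::set_intersection(depth.begin(), depth.end(),
                          rgb.begin(), rgb.end(),
                          std::back_inserter(depthAndRgb));

    // The intermediate result is sorted too, so it can be intersected again.
    std::vector<std::string> all;
    std::set_intersection(depthAndRgb.begin(), depthAndRgb.end(),
                          label.begin(), label.end(),
                          std::back_inserter(all));
    return all;
}
```

With depth = {a.png, b.png, c.png}, rgb = {a.png, c.png} and label = {a.png, c.png, x.png}, only a.png and c.png would count as scenes.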

If you want to make use of precomputing planes and storing them to disk or to read existing planes from disk, the data set directory also needs to contain a plane subdirectory. This is not required, though; if it does not exist, you just will not be able to make use of plane precomputation through this wrapper.

In addition, the wrapper requires a CameraManager instance. This is a very simple class which holds all the camera parameters that were used to record the data set. TraDaG needs to know these parameters to accurately reconstruct the 3D scene from the point of view of the recording cameras. To construct a CameraManager instance, you need to pass the camera parameters to its constructor. For more information, refer to its API documentation.

The wrapper also requires a label map. This is a mapping from strings to grey values with 16 bit depth, i.e. in the [0, 65535] interval. These grey values are the values stored in the label images to represent the different labels. For the NYU Depth V1 and NYU Depth V2 data sets, sensible label maps are already predefined in util.h. If you include this header, you can use the NYUDepthV1 and NYUDepthV2 constants, if you wish.
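As a rough sketch, a label map for a custom data set could look like the following. The type alias, the helper functions and the label names and grey values are all assumptions made for illustration; check util.h for the actual type that TraDaG expects.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Assumed representation: label name -> 16-bit grey value stored in the
// label images. The real TraDaG type may be declared differently.
using LabelMap = std::map<std::string, std::uint16_t>;

// Hypothetical labels and values, not taken from any real data set.
LabelMap makeExampleLabelMap() {
    return {{"floor", 1}, {"table", 2}, {"bed", 3}};
}

// Look up the grey value for a label. Returns false (and leaves `out`
// untouched) if the label is not contained in the map.
bool greyValueFor(const LabelMap& map, const std::string& label,
                  std::uint16_t& out) {
    const auto it = map.find(label);
    if (it == map.end()) {
        return false;
    }
    out = it->second;
    return true;
}
```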

The last parameter limits the number of scenes that will be processed. It mainly serves as an upper bound on the computation time when searching through all the scenes, and is most useful when working with huge data sets. In most cases, it should be okay to leave this at unlimited. For more information, see the CVLDWrapper::CVLDWrapper API documentation.

When you have successfully constructed a wrapper object, you have access to various setters to define the parameters of your simulations. Most values are set to a reasonable default, but you may change them as you wish (again, refer to the CVLDWrapper API documentation for more details). Before you can start generating training images, however, you must define which labels you want to use; you can do this using the labelsToUse and setLabelsToUse functions. You must define at least one label.

You probably also want to set the active object, i.e. which object will be dropped into the scene. You can do this using the setActiveObject function, which takes a numeric object ID as a parameter (it is also possible to use multiple objects at the same time, but not through this wrapper). The object IDs are 1-based and assigned contiguously in the [1, getNumObjects] interval. By default, 13 different objects are available. If you want to add new objects or remove some, you can use the availableObjects vector. Keep in mind that making this vector smaller also invalidates some object IDs.
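The 1-based ID scheme, and why shrinking the vector invalidates IDs, can be illustrated with a small stand-alone sketch. ObjectList and both helpers are hypothetical stand-ins; the element type of the real availableObjects vector is not assumed here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the wrapper's list of available objects.
using ObjectList = std::vector<std::string>;

// IDs are 1-based: the valid IDs are 1 .. objects.size().
bool isValidObjectID(const ObjectList& objects, std::size_t id) {
    return id >= 1 && id <= objects.size();
}

// Map a 1-based ID to its list entry. Throws std::out_of_range for an
// invalid ID (including 0, whose index wraps around to a huge value).
const std::string& objectForID(const ObjectList& objects, std::size_t id) {
    return objects.at(id - 1);
}
```

If the list holds three entries, ID 3 is valid; after removing the last entry, ID 3 no longer refers to anything.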

Now you are ready to generate some training images! There are three methods of doing so through the wrapper, which are defined through the overloads of the getTrainingImage function, providing different levels of control over the object pose and the selected scenes. They all have in common that they will select a random scene (which matches your specifications), fit a plane into the scene using one of the allowed labels (or use a precomputed plane, if available), and drop the object onto the plane (according to the restrictions passed to the function). The simulation is performed up to the maximum number of attempts per scene. If no good result is produced, it will move on to the next (randomly selected) scene. If all scenes fail to produce a good result, the overall best attempt will be returned, and a flag will indicate that the result is not optimal.
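The retry strategy described above can be sketched roughly as follows. This is not the wrapper's code: the simulation is replaced by a caller-supplied function that returns a quality score, the scenes are assumed to arrive in an already randomized order, and all names (Attempt, Outcome, generate) are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Result of a single simulation attempt (hypothetical types).
struct Attempt {
    double score;       // quality of the attempt, higher is better
    std::size_t scene;  // scene the attempt was run on
};

struct Outcome {
    Attempt best;  // best attempt seen so far
    bool optimal;  // false if no attempt reached the "good" threshold
};

// Try each scene up to maxAttemptsPerScene times. Stop at the first good
// result; otherwise fall back to the overall best attempt and flag it as
// not optimal. If no attempt is run at all, best.score stays at -1.
Outcome generate(const std::vector<std::size_t>& scenes,
                 unsigned maxAttemptsPerScene,
                 double goodEnough,
                 const std::function<double(std::size_t)>& simulate) {
    Outcome out{{-1.0, 0}, false};
    for (const std::size_t scene : scenes) {
        for (unsigned attempt = 0; attempt < maxAttemptsPerScene; ++attempt) {
            const double score = simulate(scene);
            if (score > out.best.score) {
                out.best = {score, scene};
            }
            if (score >= goodEnough) {
                out.optimal = true;
                return out;
            }
        }
    }
    return out;
}
```

A caller would check the optimal flag before using the result, in the same way that the status returned by the wrapper should be checked.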

The simplest overload only requires you to specify what occlusion you will accept as a "good" result. No restrictions on the scene (aside from the allowed labels) or on the object pose are enforced.

The next overload does not restrict the scene, either, but allows you to define the initial rotation of the object and an initial velocity that the object will have when it "appears" in the scene. The final object pose is not affected by this.

The final overload is the most restrictive of the three. It allows you to define the final pose of the object by specifying its final rotation (with a tolerance) and its distance to the camera, in addition to the usual occlusion. The scenes are also restricted by the final object rotation, because only scenes containing planes with a normal that matches the object's up-vector (within the given tolerance) and a distance within the specified interval will be selected.
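To make the scene restriction more concrete, the following sketch checks whether a plane would qualify for a given up-vector, angular tolerance and distance interval. It assumes that the tolerance is an angle in degrees and that the plane distance and the interval use the same unit; the function names are hypothetical and do not come from the TraDaG API.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Angle between two non-zero vectors, in degrees.
double angleDeg(const Vec3& a, const Vec3& b) {
    const double lengths = std::sqrt(dot(a, a)) * std::sqrt(dot(b, b));
    assert(lengths > 0.0 && "vectors must be non-zero");
    // Clamp to [-1, 1] to guard acos against rounding errors.
    const double c = std::max(-1.0, std::min(1.0, dot(a, b) / lengths));
    return std::acos(c) * 180.0 / std::acos(-1.0);
}

// A plane qualifies if its normal lies within toleranceDeg of the object's
// up-vector and its distance lies inside [minDistance, maxDistance].
bool planeMatches(const Vec3& planeNormal, double planeDistance,
                  const Vec3& objectUp, double toleranceDeg,
                  double minDistance, double maxDistance) {
    return angleDeg(planeNormal, objectUp) <= toleranceDeg &&
           planeDistance >= minDistance && planeDistance <= maxDistance;
}
```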

All three overloads return a TrainingImage containing the rendered scene with the object and additional information, such as object coordinates and occlusion. In addition, a status is returned to indicate success or failure, among other things.

Precomputing planes

As mentioned a few times above, you can precompute planes for your scenes and store them to files on the hard drive. The wrapper provides this functionality in the precomputePlaneInfo function. You must pass one or more labels (and optionally a plane normal with a tolerance) to this function; it will then search the data set for matching scenes and try to fit a plane for each scene and each label (that is contained in the current scene). The computed planes will be stored in .planeinfo files in the plane subdirectory of the data set path that was specified in the constructor. If this subdirectory does not exist, the computation will fail.

When precomputed planes are available at the time of requesting a training image, they will always be parsed first; a new plane is only fitted if the saved ones do not match the current requirements. If you want your simulation to rely only on the precomputed planes and discard a scene if those planes are not adequate (e.g. to save time if you know that you precomputed all the relevant planes), you can set the corresponding parameter to false.

The Generic Way

If you decide not to use the wrapper, you have many more, and more powerful, tools at your disposal; however, you will probably need to consult the API documentation a lot more, too. ☺

The most important classes here are:

SceneAnalyzer

The SceneAnalyzer class manages your data set and allows you to search for scenes by labels or by planes (see e.g. beginByLabel and beginByPlane). It also allows you to iterate over the search results. It has functions for precomputing planes similar to the wrapper and will also parse saved plane files before attempting to fit a new plane into a scene (which is obvious if you consider that the wrapper uses this class for its search operations).

SceneAnalyzer assigns an ID to all the scenes it manages, in alphabetical order of the file name of the depth image belonging to the scene. When searching for planes, the result will always contain this scene ID, which can then be used with the other functions. For example, when you have found a scene that you want to use for a simulation, you can call createSimulator with the scene ID to comfortably construct a Simulator for this scene. If you have the plane available at this time, you can even pass it directly to createSimulator and it will be registered automatically (alternatively, you can use setGroundPlane on the Simulator instance to specify the plane after construction).
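The ID assignment can be pictured as sorting the depth file names and using each name's position as its ID. The sketch below is hypothetical and makes two assumptions that the text above does not settle: IDs start at 0, and "alphabetical" means the byte-wise ordering of std::string.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Sort the depth file names; under the assumptions above, the index of a
// name in the returned vector plays the role of its scene ID.
std::vector<std::string> assignSceneIDs(std::vector<std::string> depthFiles) {
    std::sort(depthFiles.begin(), depthFiles.end());
    return depthFiles;
}

// Return the ID of a file name in the sorted list, or -1 if it is absent.
long sceneIDOf(const std::vector<std::string>& sorted,
               const std::string& name) {
    const auto it = std::lower_bound(sorted.begin(), sorted.end(), name);
    if (it == sorted.end() || *it != name) {
        return -1;
    }
    return static_cast<long>(it - sorted.begin());
}
```

One consequence of such a scheme is that adding or removing files in the data set can shift the IDs of other scenes, so IDs are best treated as valid only for the data set state they were obtained from.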

Simulator

This class manages the simulation of dropping objects for a single scene. It facilitates the creation of one or more DroppableObjects to use for the simulation and allows setting parameters like the gravity vector or the maximum number of attempts. Object-specific parameters like the desired occlusion can be set directly on the objects returned by createObject. When you are ready to start the simulation, call execute. This function will run the simulation and repeat it up to the maximum number of attempts, or until an optimal result (i.e. a result where all restrictions are met) is found. If no optimal result can be computed within the maximum number of attempts, the best one found will be returned. The result also contains a status to indicate success or failure, among other things. The parameters of the dropped objects after the simulation can be retrieved from the DroppableObjects themselves (see below).

DroppableObject

A DroppableObject represents a single object that can be dropped into a scene. It should always be created through the Simulator instance you are currently using by calling its createObject function - otherwise it won't know that you have created a new object. You can create as many objects as you want by calling the createObject function multiple times.

When creating an object through the simulator, it will return a pointer to the newly created object. Use this pointer for all simulation settings that need to be made on a per-object basis, like the desired occlusion or that the object must not tilt over. After performing a simulation, you can read out the simulation result for the object using getFinalOcclusion, getFinalRotation etc. (see the API documentation for a complete list).

GroundPlane

A GroundPlane represents the plane that the objects will be dropped on during simulations. It consists of a simple plane definition, i.e. a normal and a distance along this normal, and a set of vertices that are considered inliers of the plane. Normally, you won't need to construct these planes yourself; they are usually created by TraDaG by fitting them into the scenes using a RANSAC-based approach.
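The plane definition and the notion of inliers can be sketched as follows. This is only the distance test that RANSAC-style fitting typically relies on, not TraDaG's fitting code; the Plane struct, the sign convention (a point x lies on the plane if n · x = d, with n of unit length) and the threshold are assumptions.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Assumed convention: unit normal n and offset d along that normal.
// A point x lies on the plane if n . x = d.
struct Plane {
    Vec3 normal;
    double distance;
};

// Signed distance of a point from the plane (positive on the normal side).
double signedDistance(const Plane& p, const Vec3& x) {
    return p.normal[0] * x[0] + p.normal[1] * x[1] + p.normal[2] * x[2] -
           p.distance;
}

// Indices of all points that lie within `threshold` of the plane.
std::vector<std::size_t> findInliers(const Plane& plane,
                                     const std::vector<Vec3>& points,
                                     double threshold) {
    std::vector<std::size_t> inliers;
    for (std::size_t i = 0; i < points.size(); ++i) {
        if (std::fabs(signedDistance(plane, points[i])) <= threshold) {
            inliers.push_back(i);
        }
    }
    return inliers;
}
```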

When you use a SceneAnalyzer, it will give you GroundPlanes for the search results if you search for scenes by plane. If you don't use this and only work on a single scene, you have the option to use the ImageLabeling class to manually fit a plane.

You can only use a single GroundPlane at a time for a simulation. To register it with the Simulator, call the setGroundPlane function. When using a SceneAnalyzer, you can also directly create the Simulator with a pre-registered GroundPlane by calling the createSimulator overload that accepts a plane.

Some simulation parameters can be set on the GroundPlane object, like the restitution and friction.

Other Classes

The CameraManager class holds all the camera parameters of the data set (or the scene) you are working with. You usually only need to construct an object of this class once and then pass it to the various classes that require it.

ImageLabeling is a class that can be used to fit planes into scenes. If you use a SceneAnalyzer, it will already do this for you and you shouldn't need to interact with this class. If you only work with a single scene, you can use this class to get your GroundPlane (see findPlaneForLabel).

There are some more classes, but normally you don't need to concern yourself with them. If you still want to learn more about them, refer to the API documentation.