Activity recognition results on UCF Sports and Hollywood2

The table above shows the results obtained on the UCF Sports dataset (http://crcv.ucf.edu/data/UCF_Sports_Action.php). We report recognition rate with respect to the number...



Computational efficiency and parallel implementation

The developed algorithms are computationally efficient, and the compositional processing pipeline is well suited to implementation on massively parallel architectures. Many...



Motion hierarchy structure

Our model comprises three processing stages, as shown in the figure. The task of the lowest stage (layers...




L1: motion features

Layer L1 provides the input to the compositional hierarchy. Motion obtained in L0 is encoded using a small dictionary.



Task 1.2: Hierarchical motion models for efficient learning and inference

The performance of hierarchical compositional representations is expected to depend heavily on the methodology, or rules, used to construct each layer in the motion hierarchy. This task is dedicated to determining these construction rules and to developing a methodology for efficient learning and inference on the hierarchy. We will define the construction rules in line with those proposed in [Fidler2010], which result in hierarchical networks with tree topologies. We believe that it is precisely the tree structure that is responsible for the benefits of hierarchical approaches, e.g., in contrast to more complex networks based on general graphs [Hinton2006].
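To make the tree constraint concrete, the following is a minimal sketch of a hierarchy in which every part has at most one parent and inference is a single bottom-up pass. It is an illustration only, not the construction rules of [Fidler2010]: the `Part` class, the `infer` function, the part names, and the rule of averaging child scores are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Part:
    """Hypothetical node of a compositional hierarchy (illustration only)."""
    name: str
    layer: int
    children: List["Part"] = field(default_factory=list)
    # repr=False avoids infinite recursion through parent <-> children links.
    parent: Optional["Part"] = field(default=None, repr=False)

    def add_child(self, child: "Part") -> None:
        # Tree constraint: every part has at most one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already belongs to {child.parent.name}")
        # Assumption: compositions are built from the layer directly below.
        if child.layer != self.layer - 1:
            raise ValueError("children must come from the layer directly below")
        child.parent = self
        self.children.append(child)


def infer(part: Part, evidence: Dict[str, float]) -> float:
    """Single bottom-up pass; averaging child scores is an assumed rule."""
    if not part.children:
        # Leaves read their score from the evidence; missing entries count as 0.
        return evidence.get(part.name, 0.0)
    scores = [infer(child, evidence) for child in part.children]
    return sum(scores) / len(scores)


# Hypothetical two-layer example.
wave = Part("wave", layer=2)
wave.add_child(Part("arm_up", layer=1))
wave.add_child(Part("arm_down", layer=1))
print(infer(wave, {"arm_up": 1.0, "arm_down": 0.5}))  # 0.75
```

Because each part is visited exactly once, the cost of this pass grows linearly with the number of parts, which is one plausible reading of why a tree topology makes inference efficient.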

Although the network will be constrained to a tree structure, the hierarchical organization of the compositional models will result in increasingly complex and abstract representations as information flows to higher levels of the hierarchy. The increasingly abstract nature of the information goes hand in hand with increasing invariance to unwanted variations in the input data. At each level of the hierarchy, invariance to particular variations in the data can be achieved. In object recognition tasks, the robustness of the representation is built up gradually by introducing invariance to position and rotation. Similar principles will be applied to activity recognition using compositional models. Due to the complexity of perceived motion, the model will have to be invariant not only to the scale, rotation, or position of the observed person, but also to viewpoint changes and nonlinear transformations of the temporal axis. Robust recognition in the presence of variation in the input data will therefore be the driving force of our hierarchy-construction algorithms.
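As an illustration of what invariance to nonlinear transformations of the temporal axis means, the sketch below uses dynamic time warping (DTW), a standard technique for comparing sequences performed at different or varying speeds. The text above does not state that DTW is part of the proposed model; it is swapped in here purely as an example, and the function name and the toy sequences are assumptions.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance, O(len(a) * len(b))."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b reuses its current sample
                cost[i][j - 1],      # b advances, a reuses its current sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy 1-D "motion" signals (hypothetical values).
gesture = [0, 1, 2, 1, 0]
slow_gesture = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]  # same gesture at half speed
other = [2, 1, 0, 1, 2]                        # a different gesture

print(dtw_distance(gesture, slow_gesture))  # 0.0
print(dtw_distance(gesture, other) > 0)     # True
```

The slowed-down gesture matches the original at zero cost, while a genuinely different gesture does not, which is the behaviour a time-warp-invariant representation should exhibit.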

Although the human visual system has two separate visual pathways, which process motion-related information and shape- or object-related information, respectively, there is evidence of a certain amount of interaction between the dorsal (motion) and ventral (shape) visual streams [Beck2010, Mysore2008, Orban2006]. Although motion is mainly processed in the dorsal stream, there is evidence that shape originating from motion (e.g., the shape revealed by the motion of a well-camouflaged animal against a cluttered background) is also processed in the ventral stream, which is specialized for shapes. Such a structure offers certain advantages: for example, shapes can be learned in one way (e.g., observed directly), and the knowledge obtained can be applied to a different modality, that is, to the recognition of shapes obtained from motion alone. Similarly, Ogale and Aloimonos [Ogale2007] argued that the performance of early vision is largely based on the interplay among visual modules and that a compositional approach is therefore necessary.

Based on this, we expect that it will be possible to fuse the algorithms for shape recognition and motion (activity) recognition, resulting in algorithms with much better performance. Shape-based hierarchical compositional models have already been developed [Fidler2007], and we plan to fuse such a model with the newly developed hierarchical compositional model geared towards activity recognition. The result of such a fusion would be an algorithm able to understand the surrounding environment much better than current state-of-the-art approaches.

In principle, the combined model will be composed of the shape-based hierarchical compositional model and the motion-based hierarchical compositional model. However, the exact mechanisms for exchanging information between the two models are yet to be determined. There are many open questions regarding such a fusion: should information be exchanged on multiple levels, and if so, on which levels? What is the amount and nature of the information that should be exchanged between the models? In this task, we will answer these questions and design a preliminary structure of the combined model.
