Machine Perception Laboratory


MPLAB on The New York Times, Oct 15, 2013

CERT on CNN. Full story here:

Here is a YouTube video of the Reach For Tomorrow activity organized by the MPLab:

Here is an article disclosing that Microsoft Research Cambridge was the main force behind the development of the Kinect software

New paper shows that brain predicts consequences of future eye movements

Interesting article that talks about Rosalind Picard’s company, among other things.

Interesting talk about combining foveal views and training a controller to learn where to look. Hugo Larochelle, Geoffrey Hinton

Nice talk on: Learning to combine foveal glimpses with a third-order Boltzmann machine. Has images of our own Josh Susskind in the paper.

Martin Banks on an issue for 3D broadcasting of NFL: 3D at a distance requires wide-baseline which makes players look tiny. No easy fix.

Poster session Wed. Very cool work: On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient (see NIPS proc.)

Twitter at MPLab is working again

Learning To Count Objects in Images (Victor Lempitsky, Andrew Zisserman)

2D for training, 3D for test. The size of objects is determined from image metadata.

Francis Bach: Structured sparsity-inducing norms through submodular functions

Modeling paper on the role of fixations in decision making

Wired Gadget Magazine interviews Javier Movellan

Social Robot at UW

For the past 2 weeks Max, an African grey parrot, has been visiting the lab.

A group at Brown has announced a ROS version of the RL-Glue reinforcement learning framework. From the article, “rosglue is designed to enable RL researchers and roboticists work together rather than having to reimplement existing methods in both fields.”

See it here:

Nice article about Matarić’s robot for interacting with autistic kids:

February 4th, 2009 MPLab at TED

Researchers at the University of California San Diego have created a life-like robot that can be taught to make facial expressions:

Terry Sejnowski has been elected a member of the National Academy of Sciences

Vision Processor:

Paper on towel folding robot

Willow Garage Robot Folding Towels

Smart Phones for Teaching Algebra

Information Seeking Neurons

Emotion-Aware Tutoring System

From Harper’s Magazine “Yearly Review”, “Scientists [Tingfan Wu and Javier Movellan] in San Diego made a robot head study itself in a mirror until it learned to smile”

ICDL: August 18-21, 2010. University of Michigan, Ann Arbor.

Head-mounted cameras for police officers:

Hal Pashler debunks evidence for learning styles.

NYT: Article on studying young minds to learn how to teach them.

Robovie used as shopping companion

[Xiang et al.] Dynamic spatially smoothed regularization leads boosting to select clustered rather than scattered features.

[Seeger] Infomax fMRI pulse-sequence design; extends previous single 2D slice work to 3D volume optimization, which shows some improvement.

[Berkes et al] No evidence of active sparsification in V1. They paralyzed some inputs to V1; the result does not match the sparse model’s prediction.

[Zinkevich] An asynchronous parallel SGD implementation showed that delayed propagation of new gradients leads to faster convergence.

[Cavagnaro] (standard D-opt) experiment design for human memory experiments. Compared 3 different short-term memory models.

[Bengio et al.] Class-dependent feature selection, similar to a “personal spam filter”. Mixed-norm regularization generates sparse features.

[Schmidt] Blind source separation with arbitrary linear equality constraints on the sources; solved using MCMC.

Vul gave a talk on an ideal observer model for multiple object tracking using particle filters. Accounts for human experiments. Poster tonight.

Graph-based consensus maximization: incorporate grouping constraints with outputs of classification algorithms

Hinton gave a talk on extending RBMs to higher-order interactions and applying them to a range of problems with impressive results.

Hsu and Griffiths use generative versus discriminative models of language learning to help give insight into the debate on nativism of language.

[Zoran and Weiss] reported that edge filters can be obtained from natural images using maximum dependency tree algorithms (non-sparse code!).

[Ouyang and Davis] A Bayesian framework for real-time hand-drawn sketch recognition with “component recognition/context/continuity” likelihoods.

[Blaschko et al] Jointly learning and projecting labeled data onto a manifold built from unlabeled data improves regression performance.

[Fujiwara et al.] Bayesian CCA to reconstruct visual stimuli from fMRI. The learned basis varies by eccentricity from 1px to 4px patterns.

ROC or Accuracy? Corinna Cortes and Mehryar Mohri [NIPS04], Davis & Goadrich [ICML06], Ulf Brefeld and Tobias Scheffer [ICML05]

Sahand Negahban analyzes error bounds of sparse or low-rank parameters in Lasso regression/covariance problems.

Sergio Verdu gives a nice (invited) talk on “relative entropy”. More of an ‘EE’ perspective, such as compression/coding, than CS though.

Antoni Torralba has an extensive and exhaustive tutorial on object/scene recognition. One paper per slide! Good summary, but not very systematic.

Uncanny Valley in Monkeys

Article on Audio-Visual Sound Separation with an active camera

Javi: Diego-San the humanoid robot for Project One is arriving today in the afternoon!

Javi: The “Diego” humanoid robot will be arriving at UCSD by October 23, 2009.

Javi: Lab Meeting Wed 6 at 10 am. Nicholas Butko will present work on automatic analysis of tutoring.

Javi: Airport Screening Using Expression Recognition

Javi: BibDesk is a great free program to manage bib files. It allows drag and drop of PDFs and URLs to keep track of where your papers are.

Nick: Enthought is a python dist. that seeks to replace Matlab. It seems nice but it’s really slow! Loading a few images ate all my RAM.

Javi: Target Article on the ethics of Robot Nannies

Javi: Here is a cute looking exercise robot from Japan

Nick: Here is an example of how to parallelize a data-parallel C for-loop in OSX 10.6 with Grand Central Dispatch:

Nick: For Einstein Demo startup we have found AppleScript & Automator useful. has good tutorials for both.

Nick: OSX 10.6 is released Friday with a big push for OpenCL, which could make our programs much faster:

Nick: Interesting study from Glasgow showing cultural differences in how faces are “read” for facial expressions:

Nick: The Mac command-line program “say” can be used for text-to-speech. We may want to use this for Einstein/tutoring.

Nick: Pascal Poupart’s minimally sufficient explanations give reasons why a policy’s actions are better than all other actions in a state.

Nick: Sudoku Grab is an App that uses OpenCV to read a picture of a Sudoku puzzle and lets you play it on your iPhone:

Nick: Toyota has a robot biped with very impressive balance and running:

Prof. Peyman Milanfar (UC Santa Cruz, EE Dept.)

Nick: New Scientist has an article about using real/fake smiles for marketing purposes:

Nick: Christoph Lampert’s student had the ECCV2008 Best Student Paper on efficient object localization:

Andrew Ng has a new paper on Learning Sound Location from a Single Microphone

Andrew Ng has a new paper on Near-Bayesian Exploration in Polynomial Time

Nick@CVPR: Jitendra Malik’s keynote argued vehemently against sliding windows. His method is slow but not bad. Maybe foveation can help.

Nick@CVPR: HRL (formerly Hughes Research Lab) in Malibu does bio-inspired robots as a defense contractor & is interested in MPLab’s work.

Nick@CVPR: Shankar Shivappa is Mohan’s student. He’s at Microsoft for summer but he’d be happy to talk to MPLab about AV fusion in the fall.

Nick@CVPR: Mohan Trivedi has students working on Audio/Visual Integration. We may want to talk to them about ideas for Einstein.

Nick@CVPR: Jeff Cohn from CMU/Pitt finds some pain AUs are easier to find with shape (AAM) features, others with appearance (DCT) features.

Nick@CVPR: Performance & Evaluation of Tracking & Surveillance (PETS) Workshop began at Face & Gesture. Each year they publish a new dataset.

Nick@CVPR: Ta et al. used SURF features similar to SIFT to track feature points & do recognition on Nokia cell phones.

Nick@CVPR: Piotr Dollar compared 12 methods for pedestrian tracking. The fastest is about 0.1FPS so he recommends we try MIPOMDP there.

A reading robot from Waseda University. Note the design reminiscent of RUBI3.

Nick@CVPR: Denzler’s students check if two cameras are aligned by probabilities of point-wise correspondence. Low entropy means aligned.

Nick@CVPR: Bruce at Inria uses the distribution of Harris detectors in natural inputs to choose which corners to accept. A lot can still be done here.

Nick@CVPR: Bolme et al. from Colorado have a way to learn good filters for eye detection by learning many okay filters and averaging them.

Nick@CVPR: Both Fei-Fei Li and Jitendra Malik get a lot out of over-segmenting images & then inferring which segments go with which object.

Nick@CVPR: Activity Recognition now is like Object Recognition 4 years ago: datasets are too easy and they’re turning to YouTube for data.

Nick@CVPR: 2nd best student paper: tensor-based graph matching. Make graphs of feature points. Does it match an object in your training set?

Nick@CVPR: Best student paper: Torralba’s student. Match pixels in two scenes using optical flow on SIFT features at every pixel & warping.

Nick@CVPR: 2nd best paper: comparison of blind deconvolution algorithms: image blurred w/ unknown kernel, want to recover image and kernel.

Nick@CVPR: Best paper: Haze Removal With a Dark Channel Prior. DCh is min(min(RGB)) in a local patch. Natural images have DCh ≈ 0, hazy images ≫ 0.
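The dark channel described in that note can be sketched in a few lines. This is a rough illustration, not the paper’s code; the patch radius and data layout are my assumptions:

```python
# Rough sketch of a dark channel: per-pixel min over RGB, then min over a
# local patch. image is an H x W grid of (r, g, b) tuples.
def dark_channel(image, radius=1):
    H, W = len(image), len(image[0])
    min_rgb = [[min(image[y][x]) for x in range(W)] for y in range(H)]
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            ys = range(max(0, y - radius), min(H, y + radius + 1))
            xs = range(max(0, x - radius), min(W, x + radius + 1))
            row.append(min(min_rgb[yy][xx] for yy in ys for xx in xs))
        out.append(row)
    return out

# A bright, washed-out (hazy-looking) patch has a large dark channel value:
hazy = [[(200, 210, 220)] * 3 for _ in range(3)]
print(dark_channel(hazy, 1)[1][1])  # 200
```

On a dark, high-contrast natural patch the same computation returns a value near 0, which is the prior the paper exploits.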

Nick@CVPR: FaceL is a SourceForge project for naming people in front of a kiosk using opencv+libsvm. We may want to modify it for Einstein.

Nick@CVPR: Oxford’s Visual Geometry Group has a good dataset for face tracking based on Buffy episodes. They give out labels but no video.

Here are directions for compiling OpenCV libraries that are suitable for the iPhone:

Slate has a pretty scathing review of Wolfram Alpha

This should be a big lesson for us. Remember: under-promise and over-deliver. It’s better to do a few simple things really well than to have huge breadth without depth.

Talk announcement:

The UCSD Department of Psychology is honored to present a talk by
David Matsumoto
San Francisco State University

“Human Facial Expressions of Emotion: New Empirical Findings and Theoretical Advances”

On Thursday, May 14, at 4:00 pm

Location: The Crick Conference Room
Mandler Hall, room 3545


Since the groundbreaking work establishing the universality of facial expressions of emotion, many new findings have inspired new theoretical advances in our understanding of facial expressions of emotion. In this presentation, I will describe some of these new findings and understandings, especially in the areas of microexpressions, deception, and dangerous intent. I will also describe recent work from my laboratory isolating potential sources of facial expressions of emotion, as well as new work on new expressions, and on the interpersonal functions of emotional expression.



According to scientists, it is possible to predict whose marriages will fail by looking at photographs taken decades earlier.

According to the paper, they estimate smile intensity by summing the FACS intensity codes (neutral: 0, A-E: 1-5) of AU6 and AU12.
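As an illustrative sketch of that coding scheme (the function and dictionary names here are mine, not the paper’s):

```python
# FACS intensity codes as described in the note: neutral = 0, A-E = 1-5.
FACS_CODE = {"neutral": 0, "A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def smile_intensity(au6, au12):
    """Smile score = AU6 intensity + AU12 intensity."""
    return FACS_CODE[au6] + FACS_CODE[au12]

print(smile_intensity("C", "E"))            # 3 + 5 = 8
print(smile_intensity("neutral", "neutral"))  # 0
```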

Study 1: Photos of 300 married subjects were collected from college yearbooks. Among them, 55 were divorced.

Study 2: Childhood photos (age 5-22) provided by 55 subjects.

Logistic regression was used to learn the predictive model (Divorced?) ~ (Smile Intensity).

There is no performance measurement except for some significance tests.
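A minimal sketch of that kind of model fit, on invented toy data (the numbers below are made up for illustration, not the study’s data):

```python
import math

# Toy (smile intensity, divorced?) pairs -- invented for illustration only.
data = [(1, 1), (2, 1), (3, 1), (4, 0), (7, 0), (8, 0), (9, 0), (10, 0)]

# Logistic regression P(divorced) = sigmoid(w * smile + b),
# fit by gradient ascent on the log-likelihood.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    gw = gb = 0.0
    for s, y in data:
        p = 1 / (1 + math.exp(-(w * s + b)))
        gw += (y - p) * s
        gb += (y - p)
    w += lr * gw / len(data)
    b += lr * gb / len(data)

print(w < 0)  # True: higher smile intensity predicts lower divorce probability
```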

In the two years since then, he said, CB2 has taught itself how to walk with the aid of a human and can now move its body through a room quite smoothly, using 51 “muscles” driven by air pressure.

Address = {London, UK},
Author = {Nicholas J. Butko and Javier R. Movellan},
Booktitle = {Proceedings of the International Conference on Development and Learning (ICDL)},
Title = {Learning to Learn},
Year = {2007}}

Author = {Nicholas J. Butko and Javier R. Movellan},
Booktitle = {Proceedings of the International Conference on Development and Learning (ICDL)},
Month = {August},
Title = {{I-POMDP: A}n Infomax Model of Eye Movement},
Year = {2008}}

Author = {Nicholas J. Butko},
Howpublished = {  \url{}},
Title = {{N}ick’s {M}achine {P}erception {T}oolbox},

Author = {Nicholas J. Butko and Javier R. Movellan},
Booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
Title = {Optimal Scanning for Faster Object Detection},
Year = {2009}


Author = {Nicholas J. Butko and Ian R. Fasel and Javier R. Movellan},
Booktitle = {Proceedings of the IEEE International Conference on Development and Learning (ICDL)},
Title = {Learning About Humans During the First 6 Minutes of Life},
Year = {2006}

The Cognitive Science Distinguished Speaker Series presents
Michael Tomasello, Ph.D.,
Max Planck Institute for Evolutionary Anthropology

Collaboration and Communication in Children and Chimpanzees
Monday, 6 April 2009, 4 – 6p
Center Hall 216

Human beings share many cognitive skills with their nearest primate relatives, especially those for dealing with the physical world of objects (and categories and quantities of objects) in space and their causal interrelations. But humans are in addition biologically adapted for cultural life in ways that other primates are not. Specifically, humans have evolved unique motivations and cognitive skills for understanding other persons as cooperative agents with whom one can share emotions, experience, and collaborative actions (shared intentionality). These motivations and skills first emerge in human ontogeny at around one year of age, as infants begin to participate with other persons in various kinds of collaborative and joint attentional activities. Participation in such activities leads humans to construct during ontogeny perspectival and dialogical cognitive representations

Chimpanzee Social Cognition
Tuesday, 7 April 2009, 11a – 12:15p
SSB 107

After years of debate about whether chimpanzees do or do not have a “theory of mind”, recent research suggests that the question must be asked in a more differentiated way. Thus, there is currently very good evidence that chimpanzees understand that others have goals, and even intentions in the sense that actors choose a behavioral means to their goal in light of the constraints of the situation. Similarly, there is currently very good evidence that chimpanzees understand that others see things, and even know things (in the sense of having seen them previously). Nevertheless, despite several seemingly valid attempts, there is currently no evidence that chimpanzees understand false beliefs. Our conclusion for the moment is thus that chimpanzees understand others in terms of a perception–goal psychology, as opposed to a full-fledged, human-like belief–desire psychology.

INC 2009 Rockwood Lecture

Josh Bongard

Department of Computer Science
Vermont Advanced Computing Center
University of Vermont

Friday, March 20, 2009 11 am
Cognitive Science Building, Room 003


Intelligent robots must be able to not only adapt an existing behavior on the fly in the
face of environmental perturbation, but must also be able to generate new, compensating
behavior after severe, unanticipated change such as body damage. In this talk I will
describe a physical robot with this latter capability, a capability we refer to as resiliency.
The robot achieves this by (1) creating an approximate simulation of itself;
(2) optimizing a controller using this simulator; (3) using the controller in reality;
(4) experiencing body damage; (5) indirectly inferring the damage and updating the simulator;
(6) re-optimizing a new controller in the altered simulator; and (7) executing this compensatory
controller in reality. I will also describe recent work generalizing this approach to robot teams.

Host: Terry Sejnowski

I was recently doing something like the following in Matlab:

im1 = imread('im1.jpg');

im2 = imread('im2.jpg');

if sum((im1(:)-im2(:)).^2) < 1e-5, disp('Image 1 and 2 are duplicates!!!'), end

So the point of these three lines of code is to compare image 1 and image 2, and if they are sufficiently close (exactly or nearly exactly), call them duplicates.

The preceding code has an interesting, non-obvious bug: Matlab loads integer-valued JPEGs as type uint8. The above test will pass not only if all pixels are the same, but also if every pixel in im2 is at least as bright as the corresponding pixel in im1. This is because, with unsigned integers, 1 - 2 = 0.

So I had a very washed out image in my image set that was testing as a duplicate of many images, just because nearly all pixels in the saturated image were 255, and x-255=0 for all unsigned x.
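The failure mode is easy to reproduce outside Matlab. Here is a Python sketch that simulates Matlab’s saturating uint8 subtraction on invented toy “images”:

```python
# Simulate Matlab's saturating uint8 subtraction to reproduce the bug.
def uint8_sat_sub(a, b):
    return max(0, min(255, a - b))

im1 = [10, 20, 30]      # toy "image" 1
im2 = [250, 251, 252]   # toy "image" 2: brighter at every pixel

# Buggy duplicate test: every saturated difference is 0, so the images
# wrongly test as duplicates.
ssd = sum(uint8_sat_sub(p1, p2) ** 2 for p1, p2 in zip(im1, im2))
print(ssd < 1e-5)  # True -- false positive

# Fix: subtract in a signed type (double() in Matlab) before squaring.
ssd_ok = sum((p1 - p2) ** 2 for p1, p2 in zip(im1, im2))
print(ssd_ok < 1e-5)  # False -- correctly detected as different
```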

Georgios at Intel gave me a heads up about this teaching robot in Japan:

A buzzword I got from the complete intelligence workshop was “Markov Logic”. Here is an easy to read intro paper:

Here is a new journal that may be a good venue for our work

DATE: Wed, March 4
TIME: 4:30-6
SPEAKER: Chris Johnson, Dept of Cognitive Science, UCSD

Negotiating Carries: Gesture Development in Mother-Infant Bonobos
C Johnson, S-L Zastrow and M. Halina


The emergence of gesture in captive bonobos (Pan paniscus) was investigated in longitudinal case studies of three mother-infant dyads at the San Diego Zoo and Wild Animal Park. Videotape, shot on site over the period in which the infants were 10-18 months of age, was reviewed and about 700 examples of attempted or completed carries were collected. Criteria for determining when an interaction was “carryesque”, even when it did not end in a carry, involved identifying a normative, dynamic configuration of co-action and co-attention. The relative moves in these trajectories were further analyzed as being compatible or incompatible with a carry, and as involving configuring one’s own body, manipulating the body of another, or gesturing. Defining gesture as any other-directed, non-forceful, non-carry-enacting move in context, we were able to identify two major classes of gestures during carries in these dyads: attention-getting and iconic. The performance of these gestures showed a marked sensitivity to the attentional state of the other animal. The social ecology of each dyad – situating them in the larger political setting of their respective social groups, and including each dyad’s propensities for compatibility, manipulation, etc. – is argued as exerting selective pressure for certain types of mother-infant negotiations. Of particular interest was one dyad in which a period of both high incompatibility by the mother, and high levels of compatible initiation and manipulation by the infant, resulted in greatly extended negotiations in which carry-specific gestures emerged. Additional preliminary micro-analysis of these gesture-mediated interactions and their predecessors suggests that this development may have involved the salient “freezing” of a normally-continuous, role-specific enactment of the carry, and ultimately the generation of a dissociable gesture routine.

Christine M. Johnson, Ph.D.
Department of Cognitive Science
U.C. San Diego
La Jolla, CA 92093-0515
Phn: 858-534-9854
Fax: 858-534-1128

The UCSD Department of Cognitive Science is pleased to announce a talk by

Bilge Mutlu

Carnegie Mellon University

Monday, February 23, 2009 at 12pm
Cognitive Science Building, room 003

“Designing Socially Interactive Systems”

Recent advances in artificial intelligence and speech recognition have enabled a new genre of computer interfaces that promise social and cognitive assistance in our day-to-day lives. Humanlike robots, one family of such interfaces, might someday provide social and informational services such as storytelling, educational assistance, and companionship using complex, adaptive real-world interactions. In my research, I harness existing knowledge of human cognitive and communicative mechanisms and generate new knowledge in order to design these systems such that they more effectively yield social and cognitive benefits. In this talk, I will present a theoretically and empirically grounded framework for designing social behavior for interactive systems. This process draws on theories of social cognition and communication and on formal qualitative and quantitative observations of human behavior, and produces computational models of social behavior that can be enacted by interactive systems. I will present a series of empirical studies that demonstrate how this framework might be used to design social gaze behaviors for humanlike robots, and how participants show social and cognitive improvements (particularly better recall of information, more conversational participation, and stronger rapport and attribution of intentionality) led by theoretically based manipulations in the designed gaze behavior. I will also present a vision for future work in this area that provides a framework for interdisciplinary research, drawing on knowledge from and contributing to research in social cognition, human-computer interaction, machine learning, and computational linguistics.

Reminder: proposals are due March 1st, 2009.

Boston MA, USA
2-6 November 2009
Special sessions in main conference: 2-4 November 2009

********* Special Session Proposal Deadline: 1 March 2009 *******

Acceptance Notification: 22 March 2009

The ICMI and MLMI conferences will jointly take place in the Boston
area during November 2-6, 2009. The main aim of ICMI-MLMI 2009 is to
further scientific research within the broad field of multimodal
interaction, methods and systems. The joint conference will focus on
major trends and challenges in this area, and work to identify a
roadmap for future research and commercial success. The main
conference will include a number of sessions. Each special session
should provide an overview of the state-of-the-art, present novel
methodologies, and highlight important research directions in a field
of special interest to ICMI participants. Topics of special sessions
should be focused rather than defined broadly.

Each special session should comprise 4-5 invited papers. It is
encouraged that the session begins with an overview paper on the topic
being addressed and that the remaining papers follow up with technical
contributions on the topic.

The following information should be included in the proposal:

* Title of the proposed special session
* Names/affiliation of the organizers (including brief bio and
contact info)
* Session abstract (state significance of the topic and the
rationale for the proposed session)
* List of invited presenters (including a tentative title and a
300-word abstract for each paper)

Proposals will be evaluated based on the timeliness of the topic and
relevance to ICMI, the potential impact of the sessions, the quality of
the proposed content, and the standing of the organizers.

Please note that all papers in the proposed session should be reviewed
to ensure that the contributions are of the highest quality. The
organizer(s) of accepted special sessions will arrange the review
process, except that the review of papers submitted by the organizers
themselves will be handled by the special session and program co-chairs.
Once all the papers belonging to a special session are reviewed, the
final acceptance of the session will be based on submitting the whole
package to the program co-chairs.

Important Dates for Special Session Submission:

1 March 2009: Proposal for special sessions due.
22 March 2009: Decision for special session proposal due.
29 May 2009: Submission of special session papers to organizers.
15 July 2009: Acceptance notification.
1 August 2009: Submission of the whole package and final versions of papers

To submit special session proposals (as pdf) or for additional information
regarding the special sessions, please email



Applications Deadline: March 1st, 2009

Sunday June 28th – Saturday July 18th, 2009, Telluride, Colorado

Ralph Etienne-Cummings, Johns Hopkins University
Timothy Horiuchi, University of Maryland, College Park
Tobi Delbruck, Institute for Neuroinformatics, Zurich

2009 Topic Leaders:
Cognitive Systems: Gregor SCHOENER (Ruhr-Universität-Bochum) and
Josh BONGARD (Univ. Vermont)
Robotics/Locomotion/Motor: Javier MOVELLAN (UC San Diego) and
Tony LEWIS (Univ. Arizona)
Vision: Bert SHI (HKUST) and Shih-chii LIU (INI-Zurich)
Audition: Mounya EL HILALI (JHU) and Hynek HERMANSKY (JHU)
Technology/Techniques/Tutorials: Paul HASLER (GA Tech) and Jon
TAPSON (Univ. Capetown)
Neuromorphic VLSI: John HARRIS (Univ. Florida) and John ARTHUR
(Stanford Univ.)
Computational Neuroscience: Terry SEJNOWSKI (Salk Institute)

Workshop Advisory Board:
Andreas Andreou (Johns Hopkins University)
Andre van SCHAIK (University of Sydney)
Avis COHEN (University of Maryland)
Barbara SHINN-CUNNINGHAM (Boston University)
Giacomo INDIVERI (Institute of Neuroinformatics, UNI/ETH Zurich)
Rodney DOUGLAS (Institute of Neuroinformatics, UNI/ETH Zurich)
Shihab SHAMMA (University of Maryland)

We invite applications for a three-week summer workshop that will be
held in Telluride, Colorado from Sunday June 28th – Saturday July 18th,
2009. The application deadline is *Sunday, March 1st* and application
instructions are described at the bottom of this document.

The 2009 Workshop and Summer School on Neuromorphic Engineering is
sponsored by the National Science Foundation, Institute of Neuromorphic
Engineering, Air Force Office of Scientific Research, Institute for
Neuroinformatics – University and ETH Zurich, Georgia Institute of
Technology, University of Maryland – College Park, Johns Hopkins
University, Boston University, University of Sydney, and the Salk Institute.

The previous year’s workshop can be found at: and last year’s wiki is .


Neuromorphic engineers design and fabricate artificial neural systems
whose organizing principles are based on those of biological nervous
systems. Over the past 12 years, this research community has focused on
the understanding of low-level sensory processing and systems
infrastructure; efforts are now expanding to apply this knowledge and
infrastructure to addressing higher-level problems in perception,
cognition, and learning. In this 3-week intensive workshop and through
the Institute for Neuromorphic Engineering (INE), the mission is to
promote interaction between senior and junior researchers; to educate new
members of the community; to introduce new enabling fields and
applications to the community; to promote on-going collaborative
activities emerging from the Workshop, and to promote a self-sustaining
research field.


The three week summer workshop will include background lectures on
systems and cognitive neuroscience (in particular sensory processing,
learning and memory, motor systems and attention), practical tutorials
on analog VLSI design, mobile robots, hands-on projects, and special
interest groups. Participants are required to take part and possibly
complete at least one of the projects proposed. They are furthermore
encouraged to become involved in as many of the other activities
proposed as interest and time allow. There will be two lectures in the
morning that cover issues that are important to the community in
general. Because of the diverse range of backgrounds among the
participants, some of these lectures will be tutorials, rather than
detailed reports of current research. These lectures will be given by
invited speakers. Projects and interest groups meet in the late
afternoons, and after dinner. In the early afternoon there will be
tutorials on a wide spectrum of topics, including analog VLSI, mobile
robotics, auditory systems, central-pattern-generators, selective
attention mechanisms, cognitive systems, etc.


The summer school will take place in the small town of Telluride, 9000
feet high in Southwest Colorado, about 6 hours drive away from Denver
(350 miles). Great Lakes Aviation and America West Express airlines
provide daily flights directly into Telluride. All facilities within the
beautifully renovated public school building are fully accessible to
participants with disabilities. Participants will be housed in ski
condominiums, within walking distance of the school. Participants are
expected to share condominiums.

The workshop is intended to be very informal and hands-on. Participants
are not required to have had previous experience in analog VLSI circuit
design, computational or machine vision, systems level neurophysiology
or modeling the brain at the systems level. However, we strongly
encourage active researchers with relevant backgrounds from academia,
industry and national laboratories to apply, in particular if they are
prepared to work on specific projects, talk about their own work or
bring demonstrations to Telluride (e.g. robots, chips, software).
Wireless internet access will be provided. Technical staff present
throughout the workshops will assist with software and hardware issues.
We will have a network of PCs running LINUX and Microsoft Windows for
the workshop projects. We encourage participants to bring along their
personal laptop.

No cars are required. Given the small size of the town, we recommend
that you do not rent a car. Bring hiking boots, warm clothes, rain gear,
and a backpack, since Telluride is surrounded by beautiful mountains.

Unless otherwise arranged with one of the organizers, we expect
participants to stay for the entire duration of this three week workshop.


Notification of acceptances will be mailed out around mid March 2009.
The Workshop covers all your accommodations and facilities costs. You
are responsible for your own travel to the Workshop. For expenses not
covered by federal funds, a Workshop registration fee is required. The
fee is $550 per participant, however, due to the difference in travel
cost, we offer a discount to non-US participants. European registration
fees will be reduced to $300; non-US/non-European registration fees will
be reduced to $150. The cost of a shared condominium will be covered for
all academic participants, but upgrades to a private room will cost extra.
Participants from National Laboratories and Industry are expected to
pay for these condominiums.

—— HOW TO APPLY: ——-

Applicants should be at the level of graduate students or above (i.e.
postdoctoral fellows, faculty, research and engineering staff and the
equivalent positions in industry and national laboratories). We actively
encourage women and minority candidates to apply.

Anyone interested in proposing specific projects should contact the
appropriate topic leaders directly.

The application website is (after January 1st, 2009):

Application will include:

* First name, Last name, Affiliation, valid e-mail address.
* Curriculum Vitae.
* One page summary of background and interests relevant to the workshop,
including possible ideas for workshop projects.
* Two letters of recommendation (uploaded directly by references).

The application deadline is March 1, 2009.
Applicants will be notified by e-mail.

1 January, 2009 – Applications accepted on website
1 March, 2009 – Applications Due
mid-March – Notification of Acceptance


Ralph Etienne-Cummings
Associate Professor
Department of Electrical and Computer Engineering

105 Barton Hall/3400 N. Charles St.
Johns Hopkins University
Baltimore, MD 21218
Email: E

Tel: 410 – 516 – 3494
Fax: 410 – 516 – 5566

The UCSD Department of Cognitive Science is pleased to announce a talk by

Björn Hartmann

Stanford University

Thursday, February 19, 2009 at 12pm
Cognitive Science Building, room 003

“Enlightened Trial and Error – Gaining Design Insight Through New Prototyping Tools”

“The progress of any creative discipline changes significantly with the quality of the tools available. As the diversity of user interfaces multiplies in the shift away from personal desktop computing, yesterday’s tools and concepts are insufficient to serve the designers of tomorrow’s interfaces. My research in human-computer interaction focuses on the earliest stages in UI creation – activities that take a novel idea and transform it into a concrete, interactive artifact that can be experienced, tested, and compared against other ideas. In this talk I will give an overview of different prototyping tools I have built with collaborators to address two research questions: How can tools enable a wider range of designers to create functional prototypes of ubiquitous computing interfaces? And how can design tools support the larger process of learning from these prototypes?”

The Third International Conference on Affective Computing and
Intelligent Interaction (ACII 2009)
September 10-12, 2009
Amsterdam, the Netherlands

Sponsored by HUMAINE Association and University of Twente
Technically Co-Sponsored by IEEE

*** Submission Deadline — March 23, 2009 ****

The conference series on Affective Computing and Intelligent Interaction is
the premier international forum for state-of-the-art research on
affective and multimodal human-machine interaction and systems. Every other
year the ACII conference plays an important role in shaping related
scientific, academic, and higher-education programs. This year, we are
especially soliciting papers discussing Enabling Behavioral and
Socially-Aware Human-Machine Interfaces in areas including psychology and
cognition of affective and social behaviour in HCI, affective and social
behaviour analysis and synthesis, affective and social robotics. General
conference topics will include:

* Recognition & Synthesis of Human Affect
(face/ body/ speech/ physiology/ text analysis & synthesis)
* Affective & Behavioural Interfaces
(adaptive/ human-centered/ collaborative/ proactive interfaces)
* Affective & Social Robotics
(robot’s cognition & action, embodied emotion, bio-inspired architectures)
* Affective Agents
(emotion, personality, memory, reasoning, and architectures of ECA)
* Psychology & Cognition of Affect in Affective Computing Systems
(including cultural and ethical issues)
* Affective Databases, Evaluation & Annotation Tools
* Applications
(virtual reality, entertainment, education, smart environments and biometric applications)


Accepted papers will be published in IEEE Xplore.


Paper submission will be handled electronically. Authors should prepare an
Adobe Acrobat PDF version of their full paper. All submitted papers will be
judged by at least three referees. Papers must be formatted using IEEE
Authors’ Kit ( .


March 23, 2009: Deadline for submission of regular papers.
April 27, 2009: Deadline for submission of extended abstracts for demos.
June 1, 2009: Acceptance notification
July 1, 2009: Final camera-ready papers due in electronic form.

The 2009 conference will be held in Amsterdam, The Netherlands, in De Rode
Hoed, a former Remonstrant church built in 1616 and located in the heart of
Amsterdam’s historic district. One of the leading cultural centers of
Europe, with one of the continent’s largest historical inner cities,
Amsterdam has breathtaking architecture, an extensive web of canals and side
streets, and many world-renowned museums and cultural attractions. The city
offers a wide range of accommodation, from luxury hotels to modest hostels,
and is easily accessible from the Amsterdam International Airport.


We are looking forward to receiving your valuable contributions!

Best wishes,

Jeff Cohn, Anton Nijholt, and Maja Pantic
ACII’09 General Chairs


Jianhua Tao and Kostas Karpouzis

Akshathkumar Shetty, the developer of Quickserver, upon which RUBIOS is built, is going to help us out to accelerate the message passing performance of RUBIOS.

Here is an interesting new journal to publish research on autonomous robots

The MPLab and Hanson Robotics collaborated to demonstrate an interactive robot head. The system was presented at the TED conference on Feb 4, 2009.

Paul and Nick hanging out with Bob and Albert



Francisco Lacerda, a professor of phonetics at Stockholm University, is one of two scientists threatened with legal action after the publication of a scientific article condemning the use of lie detectors. The Israeli company Nemesysco, which manufactures detectors, has written in a letter to the researchers’ publishers that the researchers may be sued for libel if they continue to write on this subject in the future.


Toyota has patented a driver drowsiness detector. It looks like it is based on analysis of the driving itself, not computer-vision based analysis of the driver.
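The interesting part is that the detection works from the driving signal itself. As a purely illustrative sketch of that general idea (our own toy heuristic, not Toyota's method; the window size and threshold are arbitrary assumptions), one could flag possible drowsiness when the variance of recent steering-wheel angles grows large, a common proxy for the weaving that drowsy driving produces:

```python
# Toy illustration (NOT Toyota's algorithm): flag possible drowsiness when
# the variance of steering angles in a sliding window exceeds a threshold.
# Window size and threshold are arbitrary assumptions for this sketch.

from collections import deque

def make_drowsiness_monitor(window=10, threshold=25.0):
    angles = deque(maxlen=window)  # keeps only the most recent `window` angles

    def update(angle_deg):
        angles.append(angle_deg)
        if len(angles) < window:
            return False  # not enough history yet
        mean = sum(angles) / window
        variance = sum((a - mean) ** 2 for a in angles) / window
        return variance > threshold

    return update

monitor = make_drowsiness_monitor()
# Steady driving: small corrections, low variance, no alarm.
steady = [monitor(a) for a in [0.5, -0.3, 0.2, 0.1, -0.4, 0.3, 0.0, -0.2, 0.4, -0.1]]
# Weaving: large alternating corrections, high variance, alarm fires.
weaving = [monitor(a) for a in [12, -11, 13, -14, 10, -12, 15, -13, 11, -12]]
```

A real system would of course fuse many more signals (lane position, speed, time of day), but the sliding-window idea is the core of driving-based detection.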

I have been reading the overview of the Willow Garage Robot Operating System. If it lives up to its promise, it sounds almost perfect for our needs, and we may want to consider trying it for our latest projects.

It has the following advantages:

  • Cross platform
  • Cross language (currently C++/Python, Java coming “soon”)
  • Publish/Subscribe architecture
  • Nodes talk directly to each other
  • Multiple data passing methods (tcp/udp/shared memory) as appropriate
  • Automatic bookkeeping of all publishers/subscribers

An overview is available:
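The publish/subscribe pattern ROS is built around can be sketched in a few lines of plain Python (our own illustration, not the ROS API; `MessageBus` and its methods are invented names for this sketch):

```python
# Minimal publish/subscribe sketch (plain Python, no ROS dependency).
# Nodes never call each other directly: publishers push to named topics,
# and the bus does the bookkeeping of who listens to what.

from collections import defaultdict

class MessageBus:
    """Tracks subscribers per topic and fans out published messages."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver to every callback registered on this topic.
        for callback in self.subscribers[topic]:
            callback(message)

bus = MessageBus()
received = []
bus.subscribe("camera/frames", received.append)
bus.publish("camera/frames", {"frame_id": 1, "data": b"\x00" * 4})
```

In ROS the same pattern additionally crosses process and machine boundaries over tcp/udp/shared memory, which is what makes the cross-platform, cross-language claims possible.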

Sixth Canadian Conference on Computer and Robot Vision (CRV'09)
Kelowna, British Columbia, CANADA
May 25-27, 2009
Held jointly with the Graphics Interface 2009 (GI) and the Artificial
Intelligence 2009 (AI) conferences, a single registration will permit
attendees to attend any talk in the three conferences (CRV, GI, AI),
which will be scheduled in parallel tracks.

CRV seeks contributions of complete, original research papers on any
aspect of computer vision, robot vision, robotics, medical imaging,
image processing or pattern recognition. CRV provides an excellent
environment for interdisciplinary interaction as well as for
networking of students and scientists in computer vision, robotic
vision, robotics, image understanding and pattern recognition. In
addition to the regular sessions, there will be three invited
speakers. Four paper awards will be presented: one for the best
overall paper, one for the best paper with a student as first author,
and area awards for the best paper in vision and robotics.

For more detailed information, please consult the CFP at the following

Paper Submission Deadline 30 January 2009
Acceptance/Rejection notification 20 February 2009
Revised camera-ready papers due 6 March 2009

Program Co-Chairs
Frank Ferrie, McGill University
Mark Fiala, Ryerson University

More information about the conference can be found at the main site

Here (mplab blog) is a video of the technology used to superimpose the yellow line in American football games.

Ecamm Network has started producing wireless web cams that communicate with your computer via Bluetooth. Resolution is 640×480, battery life is 4 hours, size is 2″x2.5″x0.625″, price is $150.

In the future, I expect these technologies will improve across all dimensions. We may want to keep our eye on them and decide when they become useful for us.

A new household robot created in Japan is capable of rinsing the dishes in the sink before neatly lining them up in the dishwasher and pressing the start button for the washing cycle.

The multi-jointed robot arm, created by scientists at the University of Tokyo with the electronics company Panasonic, is one of a series of prototype devices designed to perform household chores.

Fitted with 18 delicate sensors, the kitchen assistant robot (KAR) is able to grasp delicate china and cutlery in a palm-like device without dropping or breaking them.

Using its internal camera along with the sensors, the robot is able to determine the shapes and sizes of dirty dishes and utensils placed in the sink before picking them up and loading them in the dishwasher.


Dear Colleagues,

NSF has issued a ‘Dear Colleague Letter’ that invites applications for
participation in a joint NSF/EPSRC “sandpit” (interactive workshop). The
sandpit is meant to be an intensive, interactive and free-thinking
environment, where participants from a range of disciplines immerse
themselves in collaborative thinking processes in order to construct
innovative approaches to synthetic biology. Substantial funding is
allocated for selected collaborative research projects arising from the
sandpit.

Synthetic Biology uses biological systems as the primary source of data,
dynamics, and phenomena to fabricate devices that are based on natural
living systems. For example, new tools for designing and controlling
neural circuits can lead to engineering of a virtual brain with the goal
of better understanding brain/behavior interactions and to new computer
technology based on our understanding of brain processes. Cognitive
science and neuroscience are essential to this goal.

Anyone eligible to apply for funding from either the NSF or EPSRC is
eligible to apply to attend the sandpit. Please read the Dear Colleague
Letter at and a fuller
description of the sandpit, its aim and desired outcomes at .
If you have questions, please contact Rita Teutonico
Senior Advisor for Integrative Activities

I talked to Jeremy Lewi last night. He published a paper in NIPS 06 on infomax policies of generalized linear models, with application to learning the receptive fields of sensory neurons. He said that he was an admirer of the work that we do in our lab. His paper this year builds on the framework of the NIPS 06 paper by introducing prior information about the structure of the receptive fields. One type of structure might be that a spatiotemporal receptive field has rank-1 structure. We can use this prior knowledge to adjust the posterior mode that we find in the update step of the infomax algorithm, balancing likely receptive fields with ones that respect our prior beliefs about the structure of the receptive field. Results for this method are given on birdsong data and show that imposing a rank constraint on the receptive field gives a boost over the method from their 2006 paper.
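The rank-1 idea can be illustrated with a small sketch (our own, not code from the paper): enforcing the prior amounts to projecting an estimated spatiotemporal receptive field onto the nearest rank-1 matrix, i.e. an outer product of a spatial profile and a temporal profile. Plain-Python power iteration is enough to show the effect:

```python
# Illustrative sketch: best rank-1 approximation of a matrix via power
# iteration, i.e. projecting a receptive-field estimate onto the set of
# outer products sigma * u v^T. Not Lewi's code; just the underlying idea.

def rank1_approx(M, iters=200):
    """Return the best rank-1 approximation of M (a list of row lists)."""
    rows, cols = len(M), len(M[0])
    v = [1.0] * cols
    for _ in range(iters):
        # Power iteration: u <- M v, v <- M^T u, normalizing each step.
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        nu = sum(x * x for x in u) ** 0.5 or 1.0
        u = [x / nu for x in u]
        v = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        nv = sum(x * x for x in v) ** 0.5 or 1.0
        v = [x / nv for x in v]
    # Leading singular value, then reconstruct sigma * u v^T.
    sigma = sum(u[i] * M[i][j] * v[j] for i in range(rows) for j in range(cols))
    return [[sigma * u[i] * v[j] for j in range(cols)] for i in range(rows)]

# A field that is exactly rank 1 (outer product of [1, 2] and [3, 4, 5])
# is recovered unchanged, up to floating point:
rf = [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]
approx = rank1_approx(rf)
```

In the paper the constraint enters through the prior on the posterior mode rather than as a hard post-hoc projection, but the projection shows what "respecting rank-1 structure" buys you: a noisy estimate gets replaced by its dominant space-time separable component.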

8th IEEE International Conference on Development and Learning (ICDL2009)
Shanghai, June 5-7, 2009

ICDL is a multidisciplinary conference pertaining to all subjects
related to the development and learning processes of natural and
artificial systems, including perceptual, cognitive, behavioral,
emotional and all other mental capabilities that are exhibited by
humans, higher animals, and robots. Its visionary goal is to
understand autonomous development in humans and higher animals in
biological, functional, and computational terms, and to enable such
development in artificial systems. ICDL strives to bring together
researchers in neuroscience, psychology, artificial intelligence,
robotics and other related areas to encourage understanding and
cross-fertilization of latest ideas. Topics of interest include, but
are not restricted to:

(1) Biological and biologically inspired architectures and general
principles of development
(2) Neuronal, cortical and pathway plasticity
(3) Autonomous generation of internal representation, including feature
(4) Neural networks for development and learning
(5) Dynamics in neural systems and neurodynamical modeling
(6) Attention mechanisms and the development of attention skills
(7) Visual, auditory, touch systems and their development
(8) Motor systems and their development
(9) Language acquisition & understanding through development
(10) Multimodal integration through development
(11) Conceptual learning through development
(12) Motivation, value, reinforcement, and novelty
(13) Emotions and their development
(14) Learning and training techniques for assisting development
(15) Biological and biologically inspired thinking models and development
of reasoning skills
(16) Models of developmental disorders
(17) Development of social skills
(18) Philosophical and social issues of development
(19) Robots with development and learning skills
(20) Using robots to study development and learning

ICDL2009 will feature invited plenary talks by world-renowned
speakers, a variety of special sessions aligned with the conference
theme, pre-conference tutorials, as well as regular technical
sessions, and poster sessions. In addition to full-paper submissions,
ICDL 2009 accepts one-page abstract submissions to encourage
late-breaking results or for work that is not sufficiently mature for
a full paper.

Organizing Committee

General Chair:
Juyang Weng, Michigan State University, USA

General Co-Chairs:
Tiande Shou, Fudan University, China
Xiangyang Xue, Fudan University, China

Program Chairs:
Jochen Triesch, Frankfurt Institute for Advanced Studies, Germany
Zhengyou Zhang, Microsoft Research, USA

Publication Chair:
Yilu Zhang, GM Research, USA

Publicity Chair:
Alexander Stoytchev, Iowa State University, USA

Publicity Co-Chairs:
Hiroaki Wagatsuma, RIKEN, Japan (Asia)
Pierre-Yves Oudeyer, INRIA Bordeaux – Sud-Ouest, France (Europe)
Gedeon Deák, University of California at San Diego, USA (North America)

Local Organization Sub-Committee:
Hong Lu (Chair), Rui Feng (Hotel), Cheng Jin (Web), Yuefei Guo (Finance),
Wenqiang Zhang (Publication)

Important Dates

January 25, 2009: Special session and tutorial proposals
February 8, 2009: Full papers
April 19, 2009: Accept/Reject notification for full papers
April 26, 2009: One-page poster abstracts
May 3, 2009: Accept/Reject notification for poster abstracts
May 10, 2009: Final camera-ready papers


IEEE Computational Intelligence Society
Cognitive Science Society
Microsoft Research

— CALL FOR PAPERS – Deadline for submissions (2nd call): 19 December 2008 —

Barcelona, Spain, 26 to 28 February 2009

* Keynote Speakers (confirmed):
Professor Angela McFarlane, Director of Content and Learning at the Royal Botanic Gardens, Kew, UK
Professor Hiroaki Ogata, Dept. of Information Science and Intelligent Systems, University of Tokushima, Japan

* Conference background and goals

User Created Content & Mobile Technologies: From Consumers to Creators bypassing the Learning opportunity?

Over the past three years, mobile and social technologies have featured strongly in the Horizon Report series, which examines emerging technologies likely to have an impact on teaching and learning. Mobile devices have progressed from an adoption
projection of two to three years in 2006 to a much more imminent adoption prediction trajectory of a year or less in 2008.
Whereas earlier the educational value of mobile technologies was thought to be the delivery of content to people's devices, the emphasis has now clearly changed to focus on the capabilities that enable users to create and share content.

The ‘former audience’ combines traditional activities such as searching, reading, watching and listening with producing, commenting, sharing, and classifying its own content. New genres of filmmaking and photography, where the message gains ground over the form, are developing. The proliferation of user-created content is fuelled by the wide availability of at-hand mundane technology such as mobile telephones, and by wider broadcasting outlets. These are mainly web-based; however, user-created content such as videos of breaking news stories increasingly features in traditional broadcasting channels such as television.
The increasing range of web 2.0 and mundane technology choices, facilitating the development of user-created content and providing opportunities to meet and collaborate, offers immense potential for teaching and learning. However, the danger remains that the transition from consumer to creator might miss the learning opportunity.

The IADIS Mobile Learning 2009 International Conference seeks to provide a forum for the discussion and presentation of mobile learning research. In particular, but not exclusively, we aim to explore the transition from content consumer to
content creator in experiences that take advantage of the learning opportunities this provides.

* Format of the Conference
The conference will comprise invited talks and oral presentations. The proceedings of the conference will be published in the form of a book and CD-ROM with ISBN, and will be available also in the IADIS Digital Library (accessible on-line).
The best paper authors will be invited to publish extended versions of their papers in the IADIS Journal on Computer Science and Information Systems (ISSN: 1646-3692) and also in other selected Journals.

* Types of submissions
Full and Short Papers, Reflection Papers, Posters/Demonstrations, Tutorials, Panels and Doctoral Consortium. All submissions
are subject to a blind refereeing process.

* Topics

We invite researchers, practitioners, developers and all those working in
the mobile learning arena to submit work under the following topics:
– Pedagogical approaches and theories for mLearning
– Collaborative, cooperative, and Contextual mLearning
– Creativity and mLearning
– Gaming and simulations in mLearning
– mLearning in educational institutions: primary, secondary and third level
– Informal and Lifelong mLearning
– New tools, technologies, and platforms for mLearning
– User Studies in mLearning
– The social phenomenon of mobile devices and mLearning
– mLearning in developing countries
– Speculative ideas in mLearning: where next?

* Important Dates:
– Submission deadline (2nd call): 19 December 2008
– Notification to Authors (2nd call): 19 January 2009
– Final Camera-Ready Submission and Early Registration (2nd call): Until 2 February 2009
– Late Registration (2nd call): After 2 February 2009
– Conference: Barcelona, Spain, 26 to 28 February 2009

* Conference Location
The conference will be held in Barcelona, Spain.

* Secretariat
Rua Sao Sebastiao da Pedreira, 100, 3
1050-209 Lisbon, Portugal
Web site:

* Program Committee

Mobile Learning 2009 Program Chair
Inmaculada Arnedillo Sánchez, Trinity College Dublin, Ireland

Mobile Learning 2009 Conference Chair
Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal

Steering Committee
Agnes Kukulska-Hulme, Open University, UK
David Parsons, Massey University, New Zealand
John Traxler, University of Wolverhampton, UK
Mike Sharples, University of Nottingham, UK

Committee Members: *
* for committee list please refer to

* Co-located events
Please also check the co-located events:
e-Society 2009 ( – 25-28 February 2009
Information Systems 2009 ( – 25-27 February 2009

* Registered participants in the Mobile Learning conference may attend e-Society and Information Systems
conferences’ sessions free of charge.


We apologize if you receive multiple copies of this Call for Papers.


Elsevier Computer Vision and Image Understanding (CVIU)
Call for Papers
Special Issue on ‘Multi-camera and Multi-modal Sensor Fusion’


– deadline: January 10, 2009
– submission: – Select ‘Special issue: Sensor Fusion’


Theme of the special issue

Advances in sensing technologies as well as the increasing availability of computational power and efficient bandwidth usage methods are favouring the emergence of applications based on distributed systems combining multiple cameras and other sensing modalities. These applications include audiovisual scene analysis, immersive human-computer interfaces, occupancy sensing and event detection for smart environment applications, automated collection, summarization and distribution of multi-sensor data, and enriched personal communication, just to mention a few. This special issue proposes to address the principal technical challenges in vision processing when the video modality is also supported by other inputs such as audio, speech, context, depth sensors, and/or other cameras. Topics of interest to the special issue include:

– Multi-camera system algorithms and applications
– Multi-modal systems and data fusion methods
– Distributed sensing and processing methods for human-centric applications
– Distributed multi-modal scene analysis and event interpretation
– Automated annotation and summarization of multi-view video
– Automated creation of audiovisual reports (from meetings, lectures, sport events, etc.)
– Multi-modal gesture recognition
– Multi-modal human-computer interfaces
– Data processing and fusion in multi-modal embedded systems
– Context awareness and behaviour modelling
– Performance evaluation metrics
– Applications in distributed surveillance, smart rooms, virtual reality, and e-health


* Deadline for manuscript submission: January 10, 2009 (extended)

– First notification: April 10, 2009
– Revised manuscripts due: May 30, 2009
– Notification of final decision: July 15, 2009
– Camera-ready manuscript: July 30, 2009
– Publication of the special issue (tentative): 4th Quarter 2009

Submission Guidelines

– Papers must be submitted at by selecting the special issue option: ‘Sensor Fusion’

– The submission guide for authors:

– Only papers meeting the scope of the special issue will be considered for review. If in doubt concerning the relevance of a proposed paper, the authors are encouraged to contact the guest editors prior to paper submission.

Guest Editors

Andrea Cavallaro
Queen Mary, University of London, UK
[andrea.cavallaro –AT–]

Hamid Aghajan
Stanford University, USA
[hamid –AT–]

A Hierarchy of Temporal Receptive Windows in Human Cortex. Uri Hasson, Eunice Yang, Ignacio Vallines, David J. Heeger, and Nava Rubin

Michael A. Long & Michale S. Fee (2008) Using temperature to analyse temporal
dynamics in the songbird motor pathway. Vol 456.

See link below to recent Nature Article on associative learning of social value.

Timothy E. J. Behrens, Laurence T. Hunt, Mark W. Woolrich & Matthew F. S. Rushworth (2008) Associative Learning of Social Value, Nature, Vol 456.


Google has introduced an in-browser video chat feature for its internet-based mail client, GMail. It seems to work quite simply and effectively across platforms (I’ve used it between Mac and PC). It requires installation of browser plug-ins.

We may want to consider this as a platform for having meetings with collaborators.

More information here:

We are beginning a section of the MPLab blog to maintain a knowledge base for the expertise we gain as we adapt our existing Machine Perception solutions to be more compatible with the OpenCV platform.

Call For Papers: Autonomous Robots – Special Issue on Robot Learning
Quick Facts
Editors: Jan Peters, Max Planck Institute for Biological Cybernetics,
Andrew Y. Ng, Stanford University
Journal: Autonomous Robots
Submission Deadline: November 8, 2008
Author Notification: March 1, 2009
Revised Manuscripts: June 1, 2009
Approximate Publication Date: 4th Quarter, 2009

Creating autonomous robots that can learn to act in unpredictable
environments has been a long standing goal of robotics, artificial
intelligence, and the cognitive sciences. In contrast, current
commercially available industrial and service robots mostly execute
fixed tasks and exhibit little adaptability. To bridge this gap,
machine learning offers a myriad set of methods some of which have
already been applied with great success to robotics problems. Machine
learning is also likely to play an increasingly important role in
robotics as we take robots out of research labs and factory floors,
into the unstructured environments inhabited by humans and into other
natural environments.

To carry out increasingly difficult and diverse sets of tasks, future
robots will need to make proper use of perceptual stimuli such as
vision, lidar, proprioceptive sensing and tactile feedback, and
translate these into appropriate motor commands. In order to close
this complex loop from perception to action, machine learning will be
needed in various stages such as scene understanding, sensory-based
action generation, high-level plan generation, and torque level motor
control. Among the important problems hidden in these steps are
robotic perception, perceptuo-action coupling, imitation learning,
movement decomposition, probabilistic planning, motor primitive
learning, reinforcement learning, model learning, motor control, and
many others.

Driven by high-profile competitions such as RoboCup and the DARPA
Challenges, as well as the growing number of robot learning research
programs funded by governments around the world (e.g., FP7-ICT, the
euCognition initiative, DARPA Legged Locomotion and LAGR programs),
interest in robot learning has reached an unprecedented high point.
The interest in machine learning and statistics within robotics has
increased substantially; and, robot applications have also become
important for motivating new algorithms and formalisms in the machine
learning community.

In this Autonomous Robots Special Issue on Robot Learning, we intend
to outline recent successes in the application of domain-driven
machine learning methods to robotics. Examples of topics of interest
include, but are not limited to:
• learning models of robots, tasks or environments
• learning deep hierarchies or levels of representations from sensor
& motor representations to task abstractions
• learning plans and control policies by imitation, apprenticeship
and reinforcement learning
• finding low-dimensional embeddings of movement as implicit
generative models
• integrating learning with control architectures
• methods for probabilistic inference from multi-modal sensory
information (e.g., proprioceptive, tactile, vision)
• structured spatio-temporal representations designed for robot
• probabilistic inference in non-linear, non-Gaussian stochastic
systems (e.g., for planning as well as for optimal or adaptive control)

From several recent workshops, it has become apparent that there is a
significant body of novel work on these topics. The special issue will
focus only on high-quality articles based on sound theoretical
development as well as evaluations on real robot systems.

Time Line
Submission Deadline: November 8, 2008
Author Notification: March 1, 2009
Revised Manuscripts: June 1, 2009
Approximate Publication Date: 4th Quarter, 2009

Inquiries on this special issue should be sent to one of the editors
listed below.

Jan Peters (
Senior Research Scientist, Head of the Robot Learning Laboratory
Department for Machine Learning and Empirical Inference, Max Planck
Institute for Biological Cybernetics, Tuebingen, Germany

Andrew Y. Ng (
Assistant Professor
Department of Computer Science, Stanford University, Palo Alto, USA

I just read a news item that Picasa is giving users the ability to use creative-commons licenses for their images. The original news story is here (note that Nick in the story’s comments is not me):

Perhaps we should explore this as potential future source for object recognition datasets?

Sharon Leal and Aldert Vrij , University of Portsmouth
Published on Journal of Nonverbal Behavior.

Though I did not read this paper (UCSD does not purchase the online version), it received pretty good media coverage (Telegraph).

They reported that people telling a lie (and immediately afterwards) have a substantially higher (8x) blink rate than in normal situations. They used electronic devices attached near the eye to measure the blink rate.

BBC created a website for visitors to test their ability in discerning true versus fake smiles in 20 videos. The test is based on Paul Ekman’s research.

I got 18 out of 20 correct, but I was cheating since I know the key AU that makes the difference.

Appeared in SIGGRAPH 2008. This can be used for de-identification of faces in Google Street View.

They use OMRON face/feature detectors. I think we have to release (at least) our feature detectors to stimulate more creative applications before ours lose their advantage in the market. Is there any way we could get a copy of these commercial detectors and do some benchmarks?


Tutorial on Conditional Random Fields

Albert Tarantola (2004) Inverse Problem Theory for Model Parameter Estimation

New Developments on Autism and Early Learning

Another version of the same story

To browse through the CVPR 2008 Conference DVD click here.

To browse through the CVPR 2008 Workshops CD-ROM click here.

Chronicle of Higher Education discusses research on facial expression recognition for ITS.

BBC Leading Edge report on the TDLC/LIFE collaboration to use RUBI as a second-language teacher. Broadcast June 19, 2008.

Feature on Jake Whitehill’s project on Learning to Teach


The GENKI-4K dataset, containing faces with expression and pose labels, was released. Check out the GENKI-4K webpage for information on downloading it.

Josh Susskind, an MPLab alumnus, just published a Nature Neuroscience paper showing how the expression of emotion enhances sensory acquisition. (A link to the article is available at the MPLab’s web site.)

Adam White, from the University of Alberta reported problems compiling the Phidgets. Turned out the rubios.jar archive was outdated. Problem was solved when the more recent rubios.jar file was included in the distribution.

RUBIOS 2.0, the open source social robotics API developed as part of the RUBI project has been released for public use.

Click here to download.

Tuesday May 20
9-11:30 am
Marjo, Micah, Javier

– 8 children interacted with Rubi
– Malfunctions: 1) problems with connection, 2) long delays in playing sounds, 3) full-screen pictures, 4) wrong picture (bottle)
– BBC radio interview
– Children repeated many Finnish words
– One child repeated also a two-word sentence (“Give me…”)

Tuesday May 20
3:15-4:30 pm

– 7 children interacted with Rubi
– Children played a new game (names game) approx. 10 min. in the beginning. After that, only a few interactions with Rubi.
– Dancing while Rubi played songs, giving items (soft blocks)
– One child repeated Finnish word pallo during the item game
– Children read books next to Rubi without paying attention to her. Rubi caught their attention by saying “no-no” and shaking her head.
– Malfunction: Rubi played “no-no” and shook her head during nap time

Wednesday May 21
9-10 am
Micah, Marjo

– 3 children interacted with Rubi
– Wrong object changed and problems with full-screen pictures fixed
– Major problems with connection –> training cancelled

Using different instructional material based on

1) children’s age: more songs and one word per training for younger children, songs and educational games for older children
2) the state of training: at first songs to get familiar with new words, then educational games

Friday May 16, 2008

– no malfunctions in the autonomous mode
– 5 children interacted with Rubi
– less interaction with Rubi than earlier

Monday May 19, 2008
9-10 am
Marjo, Micah

– 8 children interacted with Rubi
– malfunctions: 1) connection lost several times, 2) full-screen picture
– some of the children repeated Finnish words
– one child chose right object without looking at the screen


Thursday May 15, 2008
3:15-4:30 pm

– 9 children trained with Rubi (no teacher)
– No malfunctions in the autonomous mode
– The children gave soft blocks to Rubi, not the test items
– Children played with Rubi several times, alone or in the group
– At least one child repeated Finnish words, “kenkä” (shoe)

Friday May 16, 2008
9:00-10:45 am
Marjo, Micah

– 6 children trained with Rubi
– Teacher guided 2 of the children by saying, e.g., “Rubi wants something.” This kind of instruction seemed to work.
– Some minor malfunctions in the control mode: losing the connection and showing an item on full screen a couple of times
– Children hadn’t played with Rubi in the morning –> this didn’t change their interest in Rubi.
– Some of the children repeated Finnish words, “ei” (no), “juna” (train)
– Teacher suggested different variations on interaction, e.g. 1 item/day, 1 item/child (if two children are playing –> 2 trains, etc.)

Wednesday May 14, 2008
3:20-4:30 pm

– 8 children (no teacher)
– autonomous mode ok
– no malfunctions in games or robot
– giving items to RUBI is still difficult for the children
– cover removed from the cup

Thursday May 15, 2008
9:00-10:15 am
Marjo, Micah

– children played outside in the morning instead of playing with Rubi (8-9am) –> no influence on the controlled training session, i.e. the children didn’t show more interest in Rubi
– 8 children (no teacher)
– Some of the children played with Rubi several times
– 3 or more children with Rubi at a time was too much (this happened at the end of the session)
– minor malfunctions: 1) connection between Rubi and control panel was lost a couple of times, 2) one item was shown once on full screen

Tuesday May 13, 2008
3:45-5:00 pm

– 6 children (no teacher)
– item game and songs were not running
– autonomous mode ok after restarting RUBI
– parents in the classroom at the end of the session

Wednesday May 14, 2008
9:30-10:30 am
Marjo, Micah

– 6 children (no teacher)
– control mode (names, song, item game, song)
– no malfunctions; the hands were also working well
– 2 children played with Rubi at the same time
– approx. 10 min./session
– children had already played with Rubi in the morning from 8 to 9 –> some of the children didn’t want to play again

Monday May 12, 2008
3:30-5:00 pm

– 5 children (no teacher)
– autonomous mode (songs, object game, hands)
– no malfunctions in RUBI
– suggestions from the teacher:
1) repeating a word e.g. for 5 minutes?
2) playing music autonomously (without touching the screen)?

Tuesday May 13, 2008
9:20-10:00 am
Micah, Marjo

– 6 children (no teacher)
– 2 children played with RUBI at the same time
– approx. 8-10 min. / 2 children
– control mode
– automatic hands
– structure: names, song, items, song
– object game in the control panel didn’t work
– automatic hands stopped working
– connection between control panel and RUBI was lost several times, but it reconnected each time

Friday May 9, 2008
10-11 am
Marjo, Micah

– 4 children (+1 teacher)
– control mode
– no malfunctions while playing with RUBI
– videotaping ok (3 children)
– position of RUBI is problematic (difficult to see RUBI’s hands, especially, from the observation room –> difficulties in timing, i.e. closing the hands at the right moment)
– teacher used English and Finnish while training
– add Finnish game to the control panel
– more cameras needed
– 2 arms to work at the same time
– do not reject an item, but play a new sound
– do structured playing: name, song, Finnish game, bye-bye

Friday May 9, 2008
3:30-> pm
– installing 4 cameras

Monday May 12, 2008
9:30-10:30 am
Marjo, Micah, Javier

– 4 children (+ 1 teacher)
– control mode
– videotaping ok (4 children)
– connection between RUBI and control panel lost several times
– Finnish game in the control panel didn’t work
– dance is too long
– need to close hand faster
– finish a RUBI session by playing a song
– more child-centered than teacher-directed activities with RUBI
– give new instructions to teachers

ECEC Room 1
May 5, 2008
9:30-10:45 am
Marjo and Micah

– Testing 3 children, who were not tested on May 2
– Testing 5 children, who played less than 3 rounds on May 2
– Camera added for recording facial expressions
– Logger ok, camera ok, no malfunctions in the pre-test game
– A child may have selected a new player while he was playing
– Multiple skips mean the child lost interest and the teacher tried to end the game

ECEC, Room1
May 2, 2008
9-10 am

Instructions to teacher:
1) Tell children to use fingertip
2) Use the skip code (tapping right corner 3 times), if needed
3) Children should play ideally 3 rounds or more.
4) Avoid looking at the screen
5) Encourage the child to play
6) Teacher can repeat questions

Practice trials with the teacher

Pre-test notes:
– 6 children were tested
– 3 children, who were not at ECEC today, are tested on Monday May 5
– Children did from 1 to several (more than 3) trials
– Game and logger worked well. No malfunctions.
– Videotaping and observations from the observation room. Only the teacher and a child in the test room.
– Videotaping children’s facial expressions would be valuable to catch different expressions between English and Finnish words
– Some children used a fingernail, which does not work, or touched the screen too lightly
– Game design: Bee is distracting to some of the children
– Some of the children are bilingual –> it might be good to collect data about children’s native languages.
– Last trial by Micah


Deadline:  June 15


MIT Media Laboratory: The Human Speechome Project

Stepping into a Child’s Shoes 

Yes, it’s an article by Apple meant to highlight the use of Macs in Science, but it’s interesting all the same.

Monday April 21, 2008

Morning (10am)
– Preparing and turning RUBI on for testing
– No malfunctions in RUBI

Afternoon (2pm – 4:30 pm)
– Introducing RUBI to children
– Children play and interact with RUBI
– No malfunctions in RUBI
– Video taping ok
– RUBI Log given to teachers

Tuesday April 22, 2008
Morning (10 am)
– Teacher-directed activities with RUBI
– Children play with RUBI approx. 30 min.
– No malfunctions in RUBI
– Video taping ok

Afternoon (3 pm-5 pm)
– Child-centered playing with RUBI
– No malfunctions in RUBI
– Video taping ok
– Video clips available

Wednesday April 23, 2008
Morning (9:30 am – 10:15am)
– Teacher-directed activities with RUBI
– Children play with RUBI approx. 45 min.
– No malfunctions in RUBI
– Video taping ok

N/A: RUBI at UCSD for NSF Site visit


Youtube video on our research on using automatic facial expression recognition for intelligent tutoring systems: 

An article from the New York Times:

Please refer to the article if my explanation is unclear.

The basic idea of the article is that “cognitive dissonance” methodology is flawed in that it ignores the Monty Hall effect, and calls people “irrational” when really people are exhibiting Monty-Hall congruent preferences.

Let’s say you have three choices that you should value equally. Presented with choices red and blue, you choose red arbitrarily. Now I give you a choice between blue and green, and you choose green about 2/3 of the time. Assuming that people are following a probability matching strategy (and there are realistic conditions under which it is an ideal strategy), this means you think green has 2/3 chance of being better than blue, whereas before you thought each was equally likely to be better than the other. Psychologists call this “rationalizing your initial (arbitrary) rejection.”

In contrast, the new argument goes “If blue was worse than red, there’s a 2/3 chance that it’s worse than green, because green could be better than red and blue, or worse than red but better than blue. In 1/3 of outcomes blue is worse than red but better than green”. This can be seen as a variant of the Monty Hall problem, where the odds change conditioned on the first choice.

I am not sure whether to be convinced by this new argument, much as I enjoy seeing savvy statistical methods being evangelized. It seems to me that if you carried around full Bayesian posteriors, the effect would vanish.

Dr. Chen’s analysis, as it’s presented in the New York Times article, seems flawed in a way that too many analyses are: it takes the mode of a probability distribution to be representative of the distribution itself. That is, it assumes that the probability at the mode is 1 and 0 elsewhere. In the literature this is called a “maximum a posteriori” (MAP) strategy.

If people were proper statisticians, they would carry around with them a notion of uncertainty. They would say to themselves on picking red over blue, “I think red is better, but I’m completely unsure about that.” Then when faced with the choice of blue and green they would say, “I thought red was better than blue, but I was completely unsure about it, so maybe blue really is better than red.” If red was better than blue, then 2/3 of the time green will be better than blue; if blue was better than red, then only 1/3 of the time will green be better than blue. Weighing these outcomes by their uncertainty (50/50) gives that green is just as likely as not to be better than blue, not 2/3 as Dr. Chen suggests.

[The math is the same for preferences, so Dr. Chen’s argument for “slight preferences” is not sufficient]

Dr. Chen’s argument depends on people *forgetting* information, specifically about the certainty of their previous choices. While this is a good candidate explanation for what’s going on, it’s not evidence for people doing “the right thing.”

You can verify this yourself in simulations. Randomly assign red, blue, and green the numbers 1/2/3. Have the simulation ask you whether red is better than blue, and pick what you like. Then have the simulation give you a choice between blue and green. 50% of the time, your second pick will be wrong. This is in stark contrast to the real Monty Hall problem, where in simulations you will find yourself happily winning 67% of the time.
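
That simulation takes only a few lines. Here is a Python sketch (the rank assignment and trial counts are mine, not from the article) which also shows how explicit feedback restores the 2/3:

```python
import random

random.seed(1)

def hidden_values():
    # Randomly assign ranks 1 (worst) to 3 (best) to the three colors.
    return dict(zip(["red", "blue", "green"], random.sample([1, 2, 3], 3)))

N = 100_000
draws = [hidden_values() for _ in range(N)]

# No feedback: after arbitrarily picking red over blue, how often is green
# actually better than blue?  ~50%, not 2/3.
no_feedback = sum(d["green"] > d["blue"] for d in draws) / N

# With feedback ("you were right, red beat blue"): condition on red > blue.
confirmed = [d for d in draws if d["red"] > d["blue"]]
with_feedback = sum(d["green"] > d["blue"] for d in confirmed) / len(confirmed)

print(no_feedback)    # ~0.50
print(with_feedback)  # ~0.67
```

The conditional case is exactly the “You were right” scenario: only when the first comparison's outcome is revealed does the 2/3 appear.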

Critically, the Monty Hall problem relies on revelation of new information (“The car is not behind door number 2”) in the context of the previous choice. The choice itself is not sufficient, and gives no new information. If Monty Hall didn’t show you the goat, you’d be at 50/50. Ask yourself, in this case of cognitive dissonance, what new information is being revealed.

Note that if after you chose red over blue I told you “You were right, red was better than blue”, then the probability of green being better than blue goes up to 2/3. This “You were right” information that is conditioned on your previous choice is the key to making this a Monty Hall style problem. Since people get no such feedback, they don’t know if they were right to choose red or wrong, and so the analysis shouldn’t apply.

Company doing smile detection and intensity rating. Link to article:

Slashdot links to an article here: that is relevant to us, and probably good general advice. The idea is, roughly, that having more potentially helpful features beats having a better algorithm. The article is about how augmenting the Netflix Dataset with simple information collected from IMDB leads to top-tier performance with very simple algorithms.

Workshop illustrating the use of robots in education


Apple just released their iPhone SDK. Under the terms, we can distribute software freely through Apple. We might want to consider making an application similar to “cheese” for the iPhone that a) rates your smiles, and b) takes your picture when you are smiling.  Should we try to have something ready by the release of iPhone apps (June)? You can download the SDK at: 

iShowU (link: ) is a really amazing piece of Mac software. It lets you record video from a region of your screen that you choose (or it can follow your mouse). As it’s recording, it compresses the video on the fly and makes reasonably sized files. I have been using this program to make videos of other software I have made, for the purpose of demoing them in my keynote presentations without having to exit the presentation and open another piece of software. The software is very reasonably priced ($20) and is really well put together. I was skeptical before I tried it, thinking I would have to manually crop or compress the video afterward. My expectations were hugely exceeded.

The NSF announced 14 grand engineering challenges, and the public gets to vote on how they should be ranked in terms of importance. Check out the article and vote for your favorite. Article.

Small article about handling many camera feeds, but only displaying the important video. I thought it was interesting when considering how to mine large amounts of video for segments of interest. article

This site was recommended by a Cognitive Science student. It is a utility for converting images to vector graphics. It is useful for turning things like images of logos, figures, etc. that don’t scale well into vector graphics that do.

I haven’t yet tried it out, but you may want to at some point.

I recently discovered Matlab’s “sample” function, which I have implemented on my own several times before. There are always slightly annoying implementation details to work out, and it’s very nice to have a function that is standard to do it for me.

The idea is to sample from a multinomial distribution, which is something you need to do from time to time for various reasons. Here is an example of usage:

>> sample([.25, .5, .1, .15],1),  ans =  3
>> sample([.25, .5, .1, .15],1),  ans =  4
>> sample([.25, .5, .1, .15],1),  ans =  1
>> sample([.25, .5, .1, .15],1),  ans =  4
>> sample([.25, .5, .1, .15],1),  ans =  2
>> sample([.25, .5, .1, .15],1),  ans =  1
>> sample([.25, .5, .1, .15],1),  ans =  2
>> sample([.25, .5, .1, .15],1),  ans =  2

a = sample([.25, .5, .1, .15],10000);
>> nnz(a == 1),  ans =  2483
>> nnz(a == 2),  ans =  5027
>> nnz(a == 3),  ans =  988
>> nnz(a == 4),  ans =  1502
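
For anyone without Matlab, the same inverse-CDF sampling is a few lines of Python. This is a sketch of the idea, not Matlab’s actual implementation:

```python
import random
from bisect import bisect
from itertools import accumulate

def sample(probs, n):
    """Draw n 1-based indices (to match Matlab) from a multinomial
    with the given (possibly unnormalized) probabilities."""
    cdf = list(accumulate(probs))
    # A uniform draw in [0, total) lands in bin i with probability probs[i].
    return [bisect(cdf, random.random() * cdf[-1]) + 1 for _ in range(n)]

a = sample([.25, .5, .1, .15], 10000)
counts = [a.count(k) for k in (1, 2, 3, 4)]
print(counts)  # roughly [2500, 5000, 1000, 1500]
```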

from the article: “Accurate face recognition is critical for many security applications. Current automatic face-recognition systems are defeated by natural changes in lighting and pose, which often affect face images more profoundly than changes in identity. The only system that can reliably cope with such variability is a human observer who is familiar with the faces concerned. We modeled human familiarity by using image averaging to derive stable face representations from naturally varying photographs. This simple procedure increased the accuracy of an industry standard face-recognition algorithm from 54% to 100%, bringing the robust performance of a familiar human to an automated system.”  Face Recognition Article
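
The averaging idea itself is almost a one-liner. Below is a toy Python sketch with made-up identities and synthetic “faces,” and a nearest-neighbor matcher standing in for the commercial recognizer the paper used; it assumes the images are already aligned, equal-size arrays:

```python
import numpy as np

def enroll(gallery):
    """gallery maps identity -> list of aligned face images (equal-size arrays).
    Averaging each person's images yields one stable template per identity."""
    return {name: np.mean(np.stack(imgs), axis=0) for name, imgs in gallery.items()}

def identify(probe, templates):
    # Nearest template by Euclidean distance stands in for the recognizer.
    return min(templates, key=lambda name: np.linalg.norm(probe - templates[name]))

# Synthetic demo: each identity is a base pattern plus per-photo noise.
rng = np.random.default_rng(0)
base = {"alice": rng.normal(size=(8, 8)), "bob": rng.normal(size=(8, 8))}
gallery = {n: [base[n] + 0.3 * rng.normal(size=(8, 8)) for _ in range(5)]
           for n in base}
templates = enroll(gallery)
who = identify(base["alice"] + 0.3 * rng.normal(size=(8, 8)), templates)
print(who)
```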

This year it is in Monterey, California,
August 9th-12th, 2008.

Important dates:
Feb. 15 Special session proposals due
March 14 Full 6-page paper submissions due
March 21 Tutorial proposals due
April 14 Notification of accept/reject
April 18 1-page poster abstracts due
May 9 Camera-Ready Copy due


From Slashdot: Open Source Speech Recognition article

AAAI may also be a decent venue for some of the projects done in mplab. Here is a link to past conferences to get an idea. The deadline for the 2008 conference is Jan 30, though some tracks have abstracts due on the 25th. Here’s the link: 

ICML and UAI have about the same audiences as NIPS. I’m not sure what should make you choose ICML versus UAI. Their schedule is the opposite of NIPS: submit in winter, conference in summer, so a lot of people basically make two conference trips per year. This year UAI, ICML, and COLT are at the same location.
Feb 8: ICML deadline
Feb 27: UAI abstract deadline; Feb 29: full submission

A group of university professors has written a letter of concern about the recent audit of the Preuss School.


The Association of California School Administrators Recently Sent Another Letter of Concern

Here is an interesting study showing that a $90 wine may taste better than the same wine at $10

IEEE Workshop on CVPR for Human Communicative Behavior
Deadline for paper submissions is March 25th 2008

Stephen Boyd teaches a Linear Dynamical Systems course at Stanford. The course webpage is

The lecture notes are excellent, and are recommended for anyone interested in a crash course in Linear Dynamical Systems topics (LQR, Kalman filter, etc.). Here is a list of topics covered:

1. Linear quadratic regulator: Discrete-time finite horizon
2. LQR via Lagrange multipliers
3. Infinite horizon LQR
4. Continuous-time LQR
5. Invariant subspaces
6. Estimation
7. The Kalman filter
8. The extended Kalman filter
9. Conservation and dissipation
10. Basic Lyapunov theory
11. Linear quadratic Lyapunov theory
12. Lyapunov theory with inputs and outputs
13. Linear matrix inequalities and the S-procedure
14. Analysis of systems with sector nonlinearities
15. Perron-Frobenius theory
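
As a taste of topic 1, here is the discrete-time finite-horizon LQR backward Riccati recursion in Python. This is a generic textbook sketch with made-up system matrices, not code from the course notes:

```python
import numpy as np

def lqr_gains(A, B, Q, R, Qf, T):
    """Discrete-time finite-horizon LQR via the backward Riccati recursion.
    Returns K_0..K_{T-1} so that u_t = -K_t x_t minimizes
    sum_t (x'Qx + u'Ru) plus the terminal cost x'Qf x."""
    P, gains = Qf, []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # gains[t] is K_t

# Toy double-integrator example (numbers are made up for illustration)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
gains = lqr_gains(A, B, np.eye(2), np.array([[1.0]]), np.eye(2), 50)

x = np.array([[1.0], [0.0]])
for K in gains:
    x = (A - B @ K) @ x  # closed-loop rollout drives the state to zero
print(float(np.linalg.norm(x)))
```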


These guys below presented a Kalman filter model of image motion. The amazing part is that the inputs were actual video, i.e., the observables were pixels. They had to learn the model parameters and then they used that to generate video. I was shocked by the results. There was video of a fountain, and the Kalman filter kept generating video that truly looked like the real thing. It only used 4 internal states. We really should look into this as a potential way to model expressions.

  • Here is the reference. A Constraint Generation Approach to Learning Stable Linear Dynamical Systems
    Sajid Siddiqi, Byron Boots, Geoffrey Gordon Download
  • Comparing Bayesian models for multisensory cue combination without mandatory integration
    Ulrik Beierholm, Konrad Kording, Ladan Shams, Wei Ji Ma Download
  • Experience-Guided Search: A Theory of Attentional Control
    Michael Mozer, David Baldwin Download
  • Sequential Hypothesis Testing under Stochastic Deadlines
    Peter Frazier, Angela Yu Download
  • The rat as particle filter
    Nathaniel Daw, Aaron Courville Download
  • Congruence between model and human attention reveals unique signatures of critical visual events
    Robert Peters, Laurent Itti Download
  • Random Features for Large-Scale Kernel Machines
    Ali Rahimi, Benjamin Recht Download
  • SpAM: Sparse Additive Models
    Pradeep Ravikumar, Han Liu, John Lafferty, Larry Wasserman Download
  • Bundle Methods for Machine Learning
    Alex Smola, S V N Vishwanathan, Quoc Le Download
  • Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion
    J. Zico Kolter, Pieter Abbeel, Andrew Ng Download
  • A Game-Theoretic Approach to Apprenticeship Learning
    Umar Syed, Robert Schapire Download
  • Adaptive Online Gradient Descent
    Peter Bartlett, Elad Hazan, Alexander Rakhlin Download

Winner of the best student paper award. The goal is to see if we can tease out an individual’s underlying distribution of objects in a specific category. For instance, if x is a vector of joint angles and limb lengths and c is a category such as giraffe, can we estimate p(x|c)? This paper takes a very original approach using an MCMC method based on Metropolis-Hastings. It turns out that a two-alternative forced choice (2AFC) model called the Luce decision rule, where subjects choose option x over option y with probability p(x)/(p(x) + p(y)), is identical to a particular Metropolis-Hastings acceptance rule due to Barker. Therefore, we can treat a series of successive 2AFC tasks as MCMC with a Metropolis-Hastings update and a Barker acceptance rule. The stationary distribution of this will be the underlying distribution we want to recover, p(x|c). Experiments were performed with trained distributions: subjects were able to recover Gaussian distributions of various means and variances. Subjects were then asked to form a model of stick figures for specific animal categories. The results show that the underlying distributions inferred look pretty similar to the animals in the category.
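
The Luce/Barker equivalence is easy to simulate. In the Python sketch below (my own toy, with a standard normal standing in for p(x|c)), a synthetic “subject” answers each 2AFC trial by the Luce rule, and the resulting chain recovers the target distribution:

```python
import math
import random

def target(x):
    # Unnormalized density the "subject" carries, p(x|c): standard normal.
    return math.exp(-0.5 * x * x)

def subject_chooses_proposal(x, y):
    # Luce choice rule == Barker acceptance: pick y with prob p(y)/(p(x)+p(y)).
    return random.random() < target(y) / (target(x) + target(y))

random.seed(0)
x, samples = 0.0, []
for _ in range(200_000):
    y = x + random.gauss(0.0, 1.0)      # propose a nearby stimulus
    if subject_chooses_proposal(x, y):  # each 2AFC trial is one MCMC step
        x = y
    samples.append(x)

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))  # ≈ 0.0 and 1.0
```

Because Barker acceptance with a symmetric proposal satisfies detailed balance, the chain’s stationary distribution is exactly the subject’s p(x|c).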

Tekkotsu is an open-source educational robotics platform developed at CMU. They design low-cost robot prototypes as well as a well-designed C++ robotics library and working environment, so that students can learn to program a robot by writing very high-level C++ code rather than dealing with vision, motor control, etc. themselves.

I asked the author’s opinion about MS Robotics Studio. He replied with two major drawbacks:

(1) closed source

(2) the controller needs to run on a PC, which is not convenient for mobile robots, and communication between the PC and the robot may take a substantial amount of time.

Very cool idea. This builds on the work of Andrew Ng, who first introduced the idea of apprenticeship learning. The idea is to learn from a teacher, but instead of simply imitating the teacher (which we can prove will give a similar reward under certain assumptions) we try to do better. If we consider the reward function to be unknown but simply a linear combination of a set of known features of the state, then we can formulate the problem of learning an optimal policy in a game-theoretic framework.

Our goal is to find a policy that maximizes the minimum reward over all possible weights on the state features. It turns out there is an algorithm based on multiplicative weight updates, due to Freund and Schapire, that after a finite number of iterations converges to a stationary distribution over policies that in expectation is as good as the optimal policy.

One cool thing about this work is that we can use it in the setting where we don’t have a teacher to learn from.
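
The multiplicative-weights (Hedge) update the algorithm is built on is itself tiny. Here is a generic sketch in Python with toy losses, not the paper’s full game solver:

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Multiplicative-weights (Hedge) update: keep a distribution over experts
    and exponentially down-weight each expert by its observed loss."""
    w = [1.0] * len(loss_rounds[0])
    for losses in loss_rounds:
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
        s = sum(w)
        w = [wi / s for wi in w]  # renormalize to a distribution
    return w

# Expert 1 consistently suffers the least loss, so weight concentrates on it.
weights = hedge([[1.0, 0.2, 0.9]] * 30)
print([round(wi, 3) for wi in weights])
```

In the game-theoretic setting, the “experts” are weight vectors on the reward features rather than toy losses, but the update is the same.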

Structured prediction is the class of classification problems where the labels can be structures (e.g., trees) rather than binary labels.

The traditional way to do structured prediction is to break the structure into small pieces and then feed these pieces to a classifier. However, such “breaks” also break the “structural relations” in the data. Structured prediction does take structure into account and thus achieves slightly higher accuracy.

In our CERT project, if we treat the whole-face expression as a structure rather than as individual AUs, that might fit into this framework.

software : SVM^{struct}

related model: conditional random field.

  • FilterBoost is a boosting approach designed to learn from a large population of examples, more than you can handle in memory. It combines a filter that selects samples from the population with a boosting-like algorithm. nips2007_0060.pdf

    Multiple Instance Active Learning

    – key idea: we have a series of bags that are either positive or negative. Each bag contains a series of examples. We know that each negative bag contains no positives and each positive bag contains at least one positive. We assume the bag-level labels are easy to obtain. This work gives several strategies for selecting individual examples in the positive bags to query for labels. These strategies are more or less heuristic, but the results are strong. This is the same setup as the weakly supervised object detection problem.

    Learning Monotonic transforms:

    – really cool. Simultaneously learn an SVM and a monotonic transformation on the features. These monotonic transforms can model saturation effects and other nonlinearities.

    Variational inference for network failures:

    – interesting application of variational inference. Very similar to the idea of predicting a set of diseases from a set of observed symptoms. The system is an expert system in that we use a noisy-OR model for the expression of a symptom given a disease, where the noisy-OR parameters are given. Some additional tricks are used, such as putting beta priors on the individual diseases.

    Learning Thresholds for Cascades using Backward Pruning (Work with Paul Viola)

    A cool idea for picking thresholds. Train a classifier on all examples. At the end, select all positives that are above a certain threshold and now train a series of cascades. The threshold selected at each level of the cascade should guarantee that none of the positives that would survive to the end are removed.

    Audio tags for music: similar to work at UCSD, except it uses a discriminative instead of a generative framework. Also, they test on a much harder dataset and use the tags to reproduce a notion of artist similarity induced by collaborative filtering. The people who did this work are aware of the work at UCSD.

    Language recognition from phones:

    They put a phone recognizer as a front end to an n-gram model for predicting which language the speech is from (multiclass: e.g. English, Spanish, German, etc.). A pruning algorithm is used to prevent combinatorial explosion in the number of features. Just thinking out loud, but is this a possible justification for the loss of discrimination of certain phones that are not part of your native language?


  • Receding Horizon DDP: Tom Erez gave a version of this talk to Emo’s lab just before Neuroscience. Yuval Tassa said he may be going to work for Emo. It seems like really good work. The situation is continuous state space. The idea is to use Emo-like methods to start with a trajectory (open loop policy) and iteratively estimate the value function around it, and improve it to a local optimum. Fill the space with a small number of open loop trajectories, and then for closed loop control, always figure out which learned trajectory you are near and follow that. They worked with 15 and 30 dimensional continuous spaces, which apparently is quite high. Receding Horizon is appropriate where you have an implicit value gradient, or at least the local value function is in line with the global value function. In practice I would guess that this is quite often the case. This is something I would want to replicate. The application was a swimming world, which we should look into as a test of algorithms, etc.
  • Random Sampling of States in Dynamic Programming: Chris Atkeson is apparently a famous control guy at CMU. Tom Erez was very respectful to him. Atkeson’s poster was about discretizing state space using randomly sampled points and some heuristics about whether to keep the points. The points define a Voronoi diagram, each cell with a local linear dynamics and quadratic value function estimator. This is something I would like to replicate. The application was up-to-4-link inverted pendulum and swing-up problems. Atkeson claims to be the only one he knows of with such success on 4-link inverted pendulums. This is another system we should test algorithms on.
  • Bayesian Policy Learning with Trans-Dimensional MCMC: “Trans-dimensional” means probability distributions defined on variable-length random variables through statistics of those random variables. These guys were using probability functions defined on trajectories to do trajectory sampling. It seemed like a fancy way to do policy improvement, without much benefit over other methods for doing the same thing.
  • Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs: “Regret” means “difference between optimal average reward and obtained average reward.” It is a common way to measure the loss incurred by exploration of the MDP, and is a currently important topic to RL people. Peter Bartlett showed that a good policy that yields low regret is to keep confidence intervals on the value function and to act as if you will receive the upper bound of your confidence interval. Based on your confidence interval and the amount of time spent acting, there are provable regret bounds.
  • The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information: The “contextual bandit problem” is a weird MDP where actions affect reward but not state transitions (you float around the world among states X, depending on x you have different reward distributions). Based on the complexity of the learning problem (how well and simply X predicts reward), things can be proven about a strategy “act randomly (uniform) / act greedily” based on how many experiences in this state you have.
  • Learning To Race by Model-Based Reinforcement Learning with Adaptive Abstraction: The Microsoft Applied Games group applied reinforcement learning to a car-driving game. To get it to work, they came up with fancy heuristic ways for discretizing state space on the fly, and combined it with fancy online dynamic programming methods (prioritized sweeping). Eric Wiewiora was very impressed. He said it was one of the first compelling applications of real reinforcement learning that he’d seen. I talked to these guys a bit about applying Emo-style continuous-state-space optimal control to their problem, and they were interested.
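
The “act as if you will receive the upper bound of your confidence interval” idea above is the same optimism behind UCB-style bandit algorithms. A minimal sketch (Python, Bernoulli arms as a stand-in; this is not Bartlett’s MDP algorithm):

```python
import math
import random

def ucb1(arm_probs, T, seed=0):
    """UCB1 on Bernoulli arms: always pull the arm with the highest
    optimistic index (empirical mean + confidence radius)."""
    rng = random.Random(seed)
    n = len(arm_probs)
    counts, sums = [0] * n, [0.0] * n
    for t in range(1, T + 1):
        if t <= n:
            a = t - 1  # pull each arm once to initialize
        else:
            a = max(range(n), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < arm_probs[a] else 0.0
        counts[a] += 1
        sums[a] += r
    return counts

counts = ucb1([0.2, 0.5, 0.8], 5000)
print(counts)  # the best arm (p=0.8) gets the lion's share of the pulls
```

Exploration is driven entirely by the confidence radius shrinking as an arm is pulled, which is what yields the logarithmic regret bounds.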

Check out these papers if you get a chance:

J. Zico Kolter, Pieter Abbeel, Andrew Ng, “Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion.”  Advances in Neural Information Processing Systems. 2007. [ps.gz] [pdf] [bibtex] [slide]

Umar Syed, Robert Schapire, “A Game-Theoretic Approach to Apprenticeship Learning.” Advances in Neural Information Processing Systems. 2007. [ps.gz] [pdf] [bibtex] [supplemental] [slide]



Luis Von Ahn gave a fresh talk about using computer games whose product is that humans label images and sounds for free. Lots of things to learn about what Luis is doing.

  • Here is a Wikipedia link to Luis
  • Here is a link to the ESP game
  • Cool relationship between directed sigmoid belief nets and undirected Boltzmann machines. In particular, the learning rule for sigmoid belief nets becomes the Boltzmann rule if you use an infinite set of directed layers with linked weights.
  • Keeps emphasizing the form of independence that restricted Boltzmann machines are good at: conditional independence of the hidden units given the observable vector. This reverses the standard ICA approach, where you have unconditional independence. He related this to Conditional Random Fields, calling these models generative conditional random fields and conditional RBMs.
  • Conditional RBMs from Hinton and Sutskever. Designed to deal with dynamical data. Used by Taylor, Roweis, and Hinton 2007 for generating skeletal animation models.
  • NIPS 2007 paper on learning the orientation of a face using deep belief networks combined with Gaussian Processes
    Here is the Paper
  • Bengio et al 2007 has a paper on how to extend RBM’s to the exponential family.
  • Hinton paper on Stochastic Embedding Using
  • Lots of deep autoencoders pretrained one layer at a time using the Boltzmann machine algorithm, then fine-tuned using backprop
  • Salakhutdinov and Hinton -> semantic hash mappings.

    Leon Bottou is pushing slightly sophisticated gradient descent methods over batch methods for the case in which you want to make use of tons of data. He shows that they can be much quicker without loss of accuracy, esp. in the cases people like these days (CRFs, convex optimizations, etc.). It may be worth looking into these current gradient methods.

  • This paper is useful to justify the multiple-view-based approach to multipose face detection. Logothetis, N. K., Pauls, J., and Poggio, T. (1995). Shape recognition in the inferior temporal cortex of monkeys. Current Biology, 5:552–563. It shows the existence of scale- and location-invariant but pose-dependent object detection neurons in IT.
  • The human brain has as many neurons as 1 million flies.
  • Anterior IT face cells paper: Desimone et al 1984. We should get a copy of this classic paper.
  • Hung, Kreiman, Poggio, and DiCarlo 2005: 90 msec after stimulus presentation, information appears in neurons in IT for object category detection. Information peaks by 125 msec.
  • Nature Neuroscience 2007 paper by Anzai, Peng, and Van Essen shows modern data on V2. It may justify the idea of orientation histograms.
  • Fabre-Thorpe has very cool experiments on human rapid recognition of visual categories Link to her lab
  • Knoblich, Bouvrie, Poggio 2007: shunting inhibition model of multidimensional Gaussians
  • Link to Poggio’s group’s PAMI 2007 paper on bio-inspired object recognition
  • Jhuang Serre Wolf and Poggio ICCV paper on recognizing activities. Get a copy.
  • Prefontal cortex may prime the features that are most important for a particular task. LIP would compute a saliency map.

    Roby Jacobs is doing some really cool work on motor control.

  • He described an experiment in which subjects had to control a joystick under different noise regimes: no noise, signal-proportional noise, and signal-inversely-proportional noise. The optimal trajectories are very different in these 3 cases, and humans seem to be quite good at learning these trajectories. Here is a link to the paper
  • He talked about an experiment by Julia something from Germany where the task is reaching under different payoff regimes.
  • He talked about experiments on combination of visual and proprioceptive information. Cited the work by a person at UCSF.
  • He talked about some computational work he is doing in which optimal trajectories are computed as linear combinations of a library of trajectories. This approach may simplify the optimization problem so it becomes a standard gradient descent like problem as opposed to a variational problem.
  • Here is a link to Roby’s work
  • Mike talked about the problem of why we have ganglion center-surround cells if Gabors happened to be the most efficient code. He showed that by adding a very small weight-connectivity constraint you end up with ganglion cells instead of Gabors. This begs the question of why we have Gabors in V1.
  • Mike presented a model that attempts to learn a universal dictionary of texture. It is akin to a hierarchical ICA model. Here is a link to the paper:
    karklin-lewicki-nc05-preprint.pdf It may be interesting to generalize this idea to auditory “textures”.
  • Mike presented his cool work on coding auditory signals and Gammatones as optimal encoders. He also seems to have extended the work to learning to detect auditory scenes. You can see his work by following this
    link to Mike’s Research