I just watched an interesting TED talk on "The Illusion of Consciousness". The speaker basically discusses optical illusions, and how our brains fill in detail without us realizing it.
As a commentary on the talk itself: using optical illusions to study consciousness is interesting, but the speaker didn't really make any solid argument that consciousness is an illusion; he simply seemed to show that our eyes interpolate detail during interpretation.
But that's not the focus of this post.
As an image analysis engineer, I have a different commentary. What's really important is the challenge that our brains' interpretation of our eyes' data presents for my field.
This is a screenshot from the talk:
Our brain somehow knows that these pixels represent a face, and a specific, well-known face at that. But try training a computer to recognize that. The image contains maybe a few hundred grey values in a specific order. Our brains organize that information, interpret it as a face at a given angle, rotate it mentally, and perform a content-based image retrieval against a mental database of known faces.
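To make that concrete, here is a minimal sketch of what "content-based image retrieval" looks like at its most naive: compare the raw grey values of a query patch against a tiny gallery of known faces and return the closest match. Everything here (the gallery names, the 16x16 patch size, the synthetic pixel data) is a placeholder assumption, not a real face recognition system.

```python
# Naive content-based retrieval on raw pixels: nearest neighbor in grey-value space.
# The gallery and query are synthetic stand-ins for tiny cropped face patches.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gallery: a handful of "known faces", each just 256 grey values.
gallery = {name: rng.integers(0, 256, size=(16, 16)).astype(float)
           for name in ["face_a", "face_b", "face_c"]}

# A query patch: one gallery face with a little noise added, standing in for
# "a few hundred grey values in a specific order".
query = gallery["face_b"] + rng.normal(0, 10, size=(16, 16))

def nearest_face(query, gallery):
    """Return the gallery entry whose pixels are closest in Euclidean distance."""
    return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - query))

print(nearest_face(query, gallery))  # -> "face_b"
```

Raw pixel distance like this falls apart the moment the face is rotated, lit differently, or seen from another angle, which is exactly the gap between what our brains do effortlessly and what we can currently engineer.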
Understanding how we interpret our eyes' data will be instrumental in taking image analysis technology to the next level.
This is why image analysis hasn't replaced human observers quite yet. A pathologist looking at a cell slide can immediately interpret that slide as tumor or non-tumor. We are trying to train computers to follow the steps the pathologist mentally performs. But the pathologist doesn't necessarily know how he interprets the image, which frustrates us engineers who need to train a computer to replicate those precise steps.
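A rough sketch of that "replicate the pathologist's steps" approach follows: we guess at a few features the pathologist might implicitly be using (here just crude intensity statistics, purely as an assumption), then train a classifier on labeled examples. The patches and labels below are synthetic placeholders, not real slide data.

```python
# Hand-crafted features plus a trained classifier: one common way engineers try
# to approximate the pathologist's unspoken decision process.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def features(patch):
    # Crude intensity statistics; a real pipeline would add texture,
    # nuclear shape, staining measurements, and so on.
    return [patch.mean(), patch.std(), np.percentile(patch, 90)]

# Synthetic "tumor" patches are brighter and noisier than "non-tumor" ones.
tumor = [rng.normal(150, 30, (32, 32)) for _ in range(50)]
normal = [rng.normal(100, 10, (32, 32)) for _ in range(50)]

X = np.array([features(p) for p in tumor + normal])
y = np.array([1] * len(tumor) + [0] * len(normal))

clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.predict([features(rng.normal(150, 30, (32, 32)))]))  # likely [1]
```

The hard part is not the classifier; it's choosing features that capture whatever the pathologist is actually doing in his head.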
When we look at an MRI brain scan, we can interpret it as a single object.
We even know that edges in different parts of the image represent the same concept.
However, different spots on those borders have very different shades of grey. Our brains know they represent the same type of object, but computers are not so good at that yet.
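Here is a toy illustration of that problem (an assumed setup with synthetic data, not a real MRI): a single boundary whose contrast fades along its length. One fixed threshold on the gradient finds the border in one region and loses it in another, even though our brains would see one continuous edge.

```python
# One boundary, varying contrast: a single global threshold can't follow it.
import numpy as np

img = np.zeros((100, 100))
# A vertical boundary at column 50 whose contrast fades from top to bottom.
for row in range(100):
    contrast = 200 - 1.8 * row          # strong at the top, weak at the bottom
    img[row, 50:] = contrast

grad = np.abs(np.diff(img, axis=1))     # simple horizontal gradient
edge_mask = grad > 100                  # one global threshold

print("edge found in top rows:   ", edge_mask[:10, 49].any())   # True
print("edge found in bottom rows:", edge_mask[-10:, 49].any())  # False
```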
That's just one example of how computers are trying to catch up to our brains.
That's not to say there aren't clever methods: sophisticated edge detection algorithms, intensity normalization, curvature flow algorithms, and so on (both proprietary algorithms in companies' R&D departments and methods in the public academic literature). But a lot of image analysis research amounts to teaching computers to do things our brains already do naturally, and quickly.
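For a flavor of two of those methods, here is a hedged sketch combining a simple intensity normalization with Sobel edge detection via SciPy. The input is a synthetic placeholder image, and the min-max rescaling stands in for the more elaborate normalization schemes used in practice.

```python
# Intensity normalization followed by Sobel edge detection on a synthetic image.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(2)
img = rng.normal(100, 20, (128, 128))
img[32:96, 32:96] += 60                 # a bright square standing in for an organ

# Intensity normalization: rescale grey values to [0, 1] so later thresholds
# don't depend on the scanner's arbitrary intensity range.
norm = (img - img.min()) / (img.max() - img.min())

# Edge detection: gradient magnitude from horizontal and vertical Sobel filters.
gx = ndimage.sobel(norm, axis=1)
gy = ndimage.sobel(norm, axis=0)
edges = np.hypot(gx, gy)

# Responses along the square's border should dwarf those in its flat interior.
print(edges[31:33, 32:96].mean() > edges[60:64, 60:64].mean())  # expect True
```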
And more importantly, these tools simply try to replicate specific steps our brains take. Deciding which steps to take, in what order, and how combining them leads to a proper interpretation of the image is another matter entirely, and in that area very little research even comes close. The closest we've come is Deep Learning, but interestingly enough, the most creative thing that our most talented engineers at Google could come up with was to try to replicate our existing biological neurons. At its essence, Deep Learning is still just another way we are trying to copy nature.
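At its core, the "artificial neuron" that Deep Learning stacks by the millions is a crude imitation of a biological one: weighted inputs, a bias, and a nonlinearity. Below is a minimal two-layer sketch in plain NumPy with random, untrained weights; it is only an illustration of the building block, not of how production systems actually look.

```python
# A bare-bones artificial neuron layer: weighted sum plus nonlinearity,
# stacked twice to form a tiny (untrained) network.
import numpy as np

rng = np.random.default_rng(3)

def neuron_layer(x, weights, bias):
    # Each output "neuron" fires on a weighted sum of its inputs,
    # squashed through a nonlinearity (here a ReLU).
    return np.maximum(0.0, x @ weights + bias)

x = rng.normal(size=4)                                               # e.g. four pixel intensities
h = neuron_layer(x, rng.normal(size=(4, 8)), rng.normal(size=8))     # hidden layer
out = neuron_layer(h, rng.normal(size=(8, 1)), rng.normal(size=1))   # output layer
print(out)
```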
Computer vision is an area of active research and will be for many years to come; it's something to which I have dedicated my career. But perhaps if we understood ourselves and how our minds work a bit better, it would pave the way for exciting new tools that could better thwart terrorism or heal the sick (the most typical uses of image analysis technology).