Prior to forecasting when a technology will succeed, we have to define the parameters of the test very precisely. We need to ask many very specific questions about specific technologies, and we need to develop maps of the dependencies of one technology on another.
Could you give three examples of “very specific questions about specific technologies”, and perhaps one example of a dependency between two technologies and how it aids prediction?
Getting them specific enough is pretty challenging; usually you have to go through several rounds of discussion to arrive at a well-formulated question. Let me try to do one in object recognition, and you can critique it.

So, suppose we just want to forecast the following: I place a really good camera, with pan, zoom, and a microphone, in the upper corner of a room. The feed goes to a server farm, which can analyze it. With no trouble, today we can have the camera photograph in infrared and some other wavelengths as well, so let's do that.
When we enter a room, we also already have some information: we know whether we're in a home, an office, a library, a hospital, a trailer, or an airplane hangar. For now, let's not have the system try to deduce that.
OK, now I want the server farm to be able to tell me exactly who is in the room, what all of the objects in it are, what the people are wearing, and what they are holding in their hands. Let's say I want that information to update correctly every ten seconds.
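To make that request concrete, here is a minimal sketch of the report such a system might emit every ten seconds. The interview specifies no schema, so every name below is a hypothetical illustration, not a proposal from the text.

```python
from dataclasses import dataclass, field


@dataclass
class PersonReport:
    identity: str            # "Alice", or "unknown-1" if unrecognized
    clothing: list[str]      # e.g. ["red sweater", "jeans"]
    holding: list[str]       # e.g. ["coffee mug"]


@dataclass
class RoomReport:
    timestamp: float         # seconds since the feed began
    room_type: str           # given, not deduced: "office", "hangar", ...
    people: list[PersonReport] = field(default_factory=list)
    objects: list[str] = field(default_factory=list)
```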
The problem as stated is still not fully specified, and we should assign some quantitative scales to the quality of the recognition results.
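One way to supply those quantitative scales, sketched below: score each ten-second report against a hand-labeled ground truth using precision (how much of what was reported is really there) and recall (how much of what is there was reported). This is one reasonable choice, not a metric proposed in the interview.

```python
def precision_recall(predicted: set[str], actual: set[str]) -> tuple[float, float]:
    """Score one report's object inventory against a hand-labeled ground truth."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 1.0
    recall = hits / len(actual) if actual else 1.0
    return precision, recall


# The system reports three objects; the room actually holds four.
p, r = precision_recall({"mug", "laptop", "plant"},
                        {"mug", "laptop", "phone", "plant"})
print(f"precision={p:.2f} recall={r:.2f}")  # precision=1.00 recall=0.75
```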
When people are trying to figure out what is in a room, they can also move around in it, pick up objects and put them down.
So, we have a relationship between object recognition and being able to path plan within a room.
People often cannot determine what an object is without reading the label. So, some NLP might be in the mix.
To determine what kind of leaf or white powder is sitting on a table, or exactly what is causing the discoloration in the grout, the system would require some very specialized skills.
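This is where a dependency map earns its keep for prediction: if a capability requires all of its prerequisites, its earliest plausible arrival is bounded by the latest prerequisite. A toy sketch of that propagation rule, with placeholder dates rather than real forecasts:

```python
# Dates are placeholders, not forecasts; the point is the propagation rule.
needs = {
    "room-scale object recognition": [
        "adequate sensors", "adequate compute",
        "recognition software", "label-reading NLP",
    ],
}
ready = {"adequate sensors": 2012, "adequate compute": 2014,
         "recognition software": 2020, "label-reading NLP": 2018}

for tech, deps in needs.items():
    # A technology cannot arrive before its last-arriving dependency.
    print(f"{tech}: not before {max(ready[d] for d in deps)}")
```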
Continuing the example:
Object recognition relies on sensors, computer memory and processing speed, and software.
Sensors:
Camera technology has run ahead very quickly. I believe that today the amount of input from cameras into the server farm can be made significantly greater than the amount of input from the eye into the brain.
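A back-of-the-envelope check of that claim. Both numbers below are loudly approximate assumptions: estimates of the optic nerve's data rate in the literature are on the order of 10 Mbit/s per eye, and a single uncompressed 4K camera already exceeds that by orders of magnitude.

```python
# ~10 Mbit/s per eye is a commonly cited order of magnitude, not a precise figure.
optic_nerve_bps = 10e6                 # bits per second, one eye
camera_bps = 3840 * 2160 * 24 * 30     # one raw 4K color camera at 30 fps, ~6e9 bits/s

print(f"camera / eye = {camera_bps / optic_nerve_bps:.0f}x")  # ~600x
```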
I only put a single camera into my scenario, but if we are trying to max out the room’s ability to recognize objects, we can put in many.
Likewise, if the microphone is helpful in recognition, then the room can exceed human auditory abilities.
Machines have already overtaken us in being able to accept these kinds of raw data.
Memory and Processing Power:
Here is a question that requires expert thinking. Machines today apparently record video at rates matching the data stream people use for visual object recognition, and computers can manipulate those images in real time. Which versions of the object recognition task require still more memory and still faster computers, and for which do we already have enough today?
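To ground that question, a rough sizing of the raw feed, under the same hypothetical camera as above: even before any analysis, each update window is several gigabytes, so the answer turns on what the recognition software must retain and compare.

```python
# Hypothetical sizing: one uncompressed 4K/30fps color camera
# over one ten-second update window.
bits_per_frame = 3840 * 2160 * 24
frames_per_window = 30 * 10
window_gb = bits_per_frame * frames_per_window / 8 / 1e9
print(f"{window_gb:.1f} GB per ten-second window")  # ~7.5 GB
```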
Software:
Google Goggles offers some general object recognition capabilities.
We also have voice and facial recognition.
One useful step would be to find ways to measure how successful systems like Google Goggles and facial recognition are now, and then plot that performance over time.
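A sketch of that measure-then-plot step, with made-up numbers, since the interview reports no data; the fitting method (a naive linear fit) is also an assumption, and real capability curves are rarely this tidy.

```python
import numpy as np

# Fabricated-for-illustration scores; a real study would measure systems
# like Google Goggles on a fixed benchmark, year by year.
years = np.array([2009, 2010, 2011, 2012, 2013])
accuracy = np.array([0.35, 0.42, 0.48, 0.55, 0.61])

slope, intercept = np.polyfit(years, accuracy, 1)
print(f"naive linear extrapolation to 2016: {slope * 2016 + intercept:.2f}")
```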
With that work in hand, we can begin to forecast.