“Copilot” type AI integration could lead to training data needed for AGI
TLDR: full-task-completion training data will soon be available, leading to much more capable agentic AI
Capabilities predictions:
Agentic AIs for all white collar tasks
creative problem solving might be solved
Business predictions:
MS/GOOG try to capture all the data (web browsing, web/native application interactions)
(maybe) creepy attempts to better un-black-box the humans doing the problem solving:
watching the user’s face
eye tracking
Caveats:
TOS/legal limits on data retention/use for training
security (EG: prompt injection) for finished systems
TLDR_END
Microsoft is introducing Copilot for Office 365. Google is integrating AI into its products. The economic motivation for automating white-collar labor hours is obvious, even if the tools are only sold directly to workers. Where can the needed training data be gathered? Companies using the Microsoft tech stack (Windows (OS), Edge (browser), Outlook (email), Office 365 (documents)) are fully set up to capture all human-computer interactions. Google can do the same via Chrome/ChromeOS for web-application-based workflows.
Ways to make better agentic AIs:
1. do the task unreliably and let a human select among the end results
Microsoft low-code / no-code
basic approval feedback system (EG: ChatGPT thumbs up)
2. watch people doing a task to generate a problem-solving log for that task (a sketch of such a log follows this list)
(1) leads to refinement and reliability improvements for existing capabilities. (2) leads to new capabilities altogether.
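To make (2) concrete, here is a minimal sketch of what one captured problem-solving log entry and a bundled session might look like. The field names and structure are illustrative assumptions on my part, not any vendor's actual telemetry format.

```python
# Hypothetical shape of one problem-solving log entry and a bundled session.
# Field names are illustrative assumptions, not a real telemetry schema.
from dataclasses import dataclass, field, asdict
from typing import Optional
import time


@dataclass
class InteractionEvent:
    """A single step a human takes while solving a task on their computer."""
    user_id: str                   # "tagged by user"
    app: str                       # e.g. "outlook", "excel", "browser"
    action: str                    # e.g. "click", "type", "paste", "run_macro"
    target: str                    # UI element, file, or URL acted on
    payload: Optional[str] = None  # text typed, formula entered, etc.
    timestamp: float = field(default_factory=time.time)


def bundle_session(session_id: str, events: list[InteractionEvent],
                   thumbs_up: Optional[bool] = None) -> dict:
    """Bundle an ordered, gap-free task attempt plus the approval signal."""
    return {
        "session_id": session_id,
        "events": [asdict(e) for e in events],  # every step, in real-time order
        "approved": thumbs_up,                  # filled in by thumbs-up/down feedback
    }
```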
Useful characteristics of problem-solving data:
homogeneous
real-time human-computer interaction logs
no missing steps (except those that occur inside the human brain)
tagged by user
can use a “user vector” to condition model output, as image models are conditioned on text (see the sketch after this list)
learn from the dumbest, act like the smartest
less RL fine-tuning required
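A minimal sketch (PyTorch; the architecture and sizes are my own assumptions) of conditioning a policy on a learned “user vector”, analogous to how image models are conditioned on text embeddings. Train on everyone's logs, then condition on the vectors of the most skilled users at inference time.

```python
# Minimal PyTorch sketch of user-vector conditioning; shapes and architecture
# are illustrative assumptions.
import torch
import torch.nn as nn


class UserConditionedPolicy(nn.Module):
    def __init__(self, n_users: int, obs_dim: int, n_actions: int, user_dim: int = 32):
        super().__init__()
        self.user_embedding = nn.Embedding(n_users, user_dim)  # one learned vector per user
        self.net = nn.Sequential(
            nn.Linear(obs_dim + user_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, obs: torch.Tensor, user_ids: torch.Tensor) -> torch.Tensor:
        # Concatenate the task observation with the user vector; swapping in the
        # best users' vectors at inference time is the "learn from the dumbest,
        # act like the smartest" idea.
        u = self.user_embedding(user_ids)
        return self.net(torch.cat([obs, u], dim=-1))
```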
Capability predictions:
This leads to capable narrow-domain agents
creativity is still hard, and 99%+ of humans aren’t doing creative problem solving in day-to-day life
capturing the 1% who are could have huge impacts
can do RL or conditioning to elicit creative problem-solving behavior
curation is easier than creation, allowing lower-skill humans to give feedback
EG: “Rate the automated tech support assistant”
This approach tops out somewhere unless RSI (recursive self-improvement)/takeoff occurs somehow
skill synergies may arise (EG: the model writes code to automate tasks rather than completing them one at a time)
this would be the “creative” part that leads to true AGI when fully generalized.
real-world example:
tech support
need to troubleshoot software/hardware issues
data collection:
support agent (actions/chat)
“did they solve my problem” user ratings.
supplement by extracting info from chat logs
gather training data from existing remote support techs doing remote-desktop-style support
train the capability into the model via imitation learning (a sketch follows this example)
deploy the model to a small fraction of users and gather KPIs to determine readiness
gradually replace human agents with AI
switch from imitation to reinforcement learning
synergies with all sorts of other skills
programming, CLI usage, network exploration, etc.
synergies with access to other corporate data:
internal network information
server logs
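A hedged sketch of the imitation-learning stage and the KPI gate: behavior cloning on the logged support-agent actions, then a readiness check on the small-fraction deployment. The tensor interface, policy, and thresholds are all assumptions for illustration, not a real pipeline.

```python
# Behavior cloning on logged support-agent steps, plus an invented KPI gate.
# PyTorch; observation/action tensors and thresholds are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def behavior_clone(policy: nn.Module, observations: torch.Tensor,
                   actions: torch.Tensor, epochs: int = 3, lr: float = 1e-4) -> nn.Module:
    """Fit the policy to predict the human agent's next action at each logged step."""
    loader = DataLoader(TensorDataset(observations, actions), batch_size=64, shuffle=True)
    optimizer = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for obs_batch, action_batch in loader:
            optimizer.zero_grad()
            loss = loss_fn(policy(obs_batch), action_batch)  # imitate the human's choice
            loss.backward()
            optimizer.step()
    return policy


def ready_for_wider_rollout(solved_rate: float, csat: float) -> bool:
    """KPI gate for the small-fraction deployment; thresholds are made up here."""
    return solved_rate >= 0.80 and csat >= 4.0  # e.g. 80% "problem solved", CSAT >= 4/5
```

Once the cloned policy clears the gate, the rollout widens and training can switch from imitation to reinforcement learning on the "did they solve my problem" signal.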
Copilot-type systems will get more capable until they become “autopilot” systems. Jobs switch to curating AI outputs, with supervision dropping off over time. Even if the machine can’t “pilot” at all, it can still learn.
Business predictions:
Changes to tech giants’ TOSes to allow much more data collection.
Tech giants move to capture business segments, providing full solutions to generic problems (EG: accounting, customer service, IT support) or creating a commoditised market for solutions to placate regulators. Privacy/security concerns lead to centralisation (see Zvi’s AI post for more).