Problem: Automatic planting
Action space: the agent reads data from the sensors and decides how to drive the actuators (temperature, humidity, exposure to sunlight, or other modifiers) to maximize specific crop characteristics.
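To make the action space more concrete, here is a minimal sketch of what one sensor reading and one actuator command could look like. All names, units, and the clipping limit (`Observation`, `Action`, `clip_action`, `max_delta_c`) are assumptions chosen for illustration; the real set would depend on the hardware in the box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One sensor reading (hypothetical fields and units)."""
    temperature_c: float
    humidity_pct: float
    light_on: bool
    plant_height_m: float


@dataclass(frozen=True)
class Action:
    """One actuator command (hypothetical fields)."""
    temperature_delta_c: float  # requested change in temperature
    humidifier_on: bool
    light_on: bool


def clip_action(action: Action, max_delta_c: float = 2.0) -> Action:
    """Limit the temperature change per step so the agent cannot shock the plants.

    The 2.0 degree limit is an assumed safety bound, not a measured value.
    """
    delta = max(-max_delta_c, min(max_delta_c, action.temperature_delta_c))
    return Action(delta, action.humidifier_on, action.light_on)
```

Clipping the command before it reaches the actuators is one simple way to let the agent explore without damaging the crop.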
The reward: the agent is performing better when it minimizes the time the plants need to reach specific characteristics. For example, when trying to minimize the time required for three plants to reach a height of 0.2 m, a higher score would go to the action policy under which the plants reached the intermediate heights (0.02, 0.05, 0.10, 0.15, 0.20) m sooner. Or take a watermelon plantation: the policy mapping conditions of temperature, humidity, etc. that led to the largest watermelon (given a threshold) in the shortest possible time would earn the agent higher scores.
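The height example could be scored roughly like this. It is only a sketch under assumptions: heights are measured once per day, a milestone that is never reached is charged a fixed horizon (`horizon_days`, an invented parameter), and the score is the negative sum of days needed to reach each milestone, so faster growth gives a higher score.

```python
from typing import Optional, Sequence

# Intermediate heights from the example above, in metres.
MILESTONES_M = (0.02, 0.05, 0.10, 0.15, 0.20)


def milestone_days(heights_by_day: Sequence[float],
                   milestones: Sequence[float] = MILESTONES_M) -> list[Optional[int]]:
    """Return, for each milestone, the first day index on which it was reached (or None)."""
    return [
        next((day for day, h in enumerate(heights_by_day) if h >= m), None)
        for m in milestones
    ]


def episode_score(heights_by_day: Sequence[float],
                  milestones: Sequence[float] = MILESTONES_M,
                  horizon_days: int = 60) -> int:
    """Higher is better: negative total of days taken to reach every milestone.

    Milestones never reached count as `horizon_days` (an assumed penalty).
    """
    days = milestone_days(heights_by_day, milestones)
    return -sum(horizon_days if d is None else d for d in days)


def mean_score(plants: Sequence[Sequence[float]]) -> float:
    """Average the score over several plants, e.g. the three plants in the example."""
    return sum(episode_score(p) for p in plants) / len(plants)
```

With daily heights `[0.0, 0.03, 0.06, 0.11, 0.16, 0.21]` the plant hits the milestones on days 1 to 5 and scores -15, while a plant that needs ten days to reach 0.20 m scores lower, so the policy behind the first plant would be preferred.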
If it is possible to achieve high efficiency in food production using RL agents that control cheap sensors in a simple wooden box with cheap inputs (earth, seeds, water), we could mass-produce the boxes and distribute them to end users with the embedded agent and a few rules. Users of this system would get enough food to pay back the cost of the system itself. They could buy more boxes by selling the surplus food, and they could share boxes with neighbours, which would have a substantial positive impact on the world.
I really believe we should decentralize food production, and it would be easier with low-cost systems that automate practically the whole process, leaving only easy tasks to the user. People would get healthier food, they would spend less money on food (leaving more money for other needs), and they would develop fewer diseases associated with the consumption of highly industrialized products or products with high amounts of herbicides.