I’ve been working fully remotely and have meaningfully contributed to global organizations without physical presence for over a decade. I see parallels with anti-remote and anti-safety arguments.
I’ve observed the robust debate regarding ‘return to work’ vs ‘remote work,’ with many traditional outlets proposing ‘return to work’ based on a series of common criteria. I’ve seen ‘return to work’ arguments assert remote employees are lazy, unreliable or unproductive when outside the controlled work environment. I would generalize the rationale as an assertion that ‘work quality cannot be assured if it cannot be directly measured.’ Given modern technology allows us to measure employee work product remotely, and given the distributed work of employees across different offices for many companies, this argument seems fundamentally flawed and perhaps even intentionally misleading. My belief in the arguments being misleading is compounded by my observations that these articles never mention related considerations like cost of rental/ownership of property and the handling of those costs, nor elements like cultural emphasis on predictable work targets or management control issues.
In my view, the reluctance to embrace remote work often distills to a failure to see beyond immediate, egocentric concerns. Along the same lines, I see failure to plan for or prioritize AI safety as stemming from a similar inability to perceive direct, observable consequences to the party promoting anti-safety mindsets.
Anecdotally, I came across an article that proposed a number of cultural goals for successful remote work. I shared the article with my company via our Slack. I emphasized that it wasn’t the goals themselves that were important, but rather adopting a culture that made those goals critical. I suggested that Goodhart’s Law applied here- once a measure becomes a target, it ceases to be a good measure. A culture that values and principals beyond the listed goals would succeed, not just a culture that blindly pursues the listed goals.
I believe the same can be said for AI Safety. Focusing on specific risks, or specific practices won’t create a culture of safety. Instead, as the post (above) suggests, a culture that does not value the principals behind a safety-first mentality will attempt to merely meet the goals, or work around the goals, or undermine the goals. Much as some advocates for “return to work” are egocentrically misrepresenting remote work, some anti-safety advocates are egocentrically misrepresenting safety. For this reason, I’ve been researching the history of adoption of a safety mentality, to see how I can promote a safety-first culture. Otherwise I think we (both my company, and the industry as a whole) risk prioritizing egocentric, short-term goals over societal benefit and long-term goals.
Observations on the history of adopting “Safety First” mentalities
I’ve been looking at the human history about adoption of safety culture, and invariably, it seems to me that safety mindsets are adopted only after loss, usually loss of human life. It is described anecdotally in the paper associated with this post.
The specifics of how safety culture is implemented differ, but the broad outlines are similar. Most critical for the development of the idea of safety culture were efforts launched in the wake of the 1979 Three Mile Island nuclear plant accident and near-meltdown. In that case, a number of reports noted the various failures, and noted that in addition to the technical and operational failures, there was a culture that allowed the accidents to occur. The tremendous public pressure led to significant reforms, and serves as a prototype for how safety culture can be developed in an industry.
Emphasis added by me.
NOTE: I could not find any indication of loss of human life attributed to Three Mile Island, but both Chernobyl and Fukushima happened after Three Mile Island, and both did result in loss of human life. It’s also important to note that both Chernobyl and Fukushima were both classed INES Level 7, compared to Three Mile Island which was classed INES Level 5. This evidence is contradictory to what was in the quoted part of the paper. (And, sadly, I think supports an argument that Goodhart’s Curse is in play… that safety regressed to the mean… that by establishing minimum safety criteria instead of a safety culture, certain disasters not only could not be avoided but were more pronounced than previous disasters.) So both of the worst reactor disasters in human history occurred after the safety cultures that were promoted following Three Mile Island.[1][2] The list of nuclear accidents is longer than this, but not all accidents result in loss.[3][2:1] (This is something that I’ve been looking at for a while, to inform my predictions about the probability of humans adopting AI safety practices with regards to pre- or post- AI disasters.)
Personal contribution and advocacy
In my personal capacity (read: area of employment) I’m advocating for adversarial testing of AI chatbots. I am highlighting the “accidents” that have already occurred: Microsoft Tay Tweets[4], SnapChat AI Chatbot[5], Tessa Wellness Chatbot[6], Chai Eliza Chatbot[7].
I am promoting the mindset that if we want to be successful with artificial intelligence, and do not want to become a news article, that we should test expressly for ways that the chatbot can be diverted from the chatbots primary function, and design (or train) fixes for those problems. It requires creativity, persistence and patience… but the alternative is that one day, we might be in the news if we fail to proactively address the challenges that obviously face anyone who is trying to use artificial intelligence.
And, like my advocacy about looking at what values a culture should have that wants to adopt a pro-remote culture and be successful at it, we should look at what values a culture should have that wants to adopt a pro-safety-first culture and be successful at it.
I’ll be cross posting the original paper to my work. Thank you for sharing.
DISCLAIMER: AI was used to quality check my post, assessing for consistency, logic and soundness in reasoning and presentation styles. No part of the writing was authored by AI.
Cultural norms and egocentricity
I’ve been working fully remotely and have meaningfully contributed to global organizations without physical presence for over a decade. I see parallels with anti-remote and anti-safety arguments.
I’ve observed the robust debate regarding ‘return to work’ vs ‘remote work,’ with many traditional outlets proposing ‘return to work’ based on a series of common criteria. I’ve seen ‘return to work’ arguments assert remote employees are lazy, unreliable or unproductive when outside the controlled work environment. I would generalize the rationale as an assertion that ‘work quality cannot be assured if it cannot be directly measured.’ Given modern technology allows us to measure employee work product remotely, and given the distributed work of employees across different offices for many companies, this argument seems fundamentally flawed and perhaps even intentionally misleading. My belief in the arguments being misleading is compounded by my observations that these articles never mention related considerations like cost of rental/ownership of property and the handling of those costs, nor elements like cultural emphasis on predictable work targets or management control issues.
In my view, the reluctance to embrace remote work often distills to a failure to see beyond immediate, egocentric concerns. Along the same lines, I see failure to plan for or prioritize AI safety as stemming from a similar inability to perceive direct, observable consequences to the party promoting anti-safety mindsets.
Anecdotally, I came across an article that proposed a number of cultural goals for successful remote work. I shared the article with my company via our Slack. I emphasized that it wasn’t the goals themselves that were important, but rather adopting a culture that made those goals critical. I suggested that Goodhart’s Law applied here- once a measure becomes a target, it ceases to be a good measure. A culture that values and principals beyond the listed goals would succeed, not just a culture that blindly pursues the listed goals.
I believe the same can be said for AI Safety. Focusing on specific risks, or specific practices won’t create a culture of safety. Instead, as the post (above) suggests, a culture that does not value the principals behind a safety-first mentality will attempt to merely meet the goals, or work around the goals, or undermine the goals. Much as some advocates for “return to work” are egocentrically misrepresenting remote work, some anti-safety advocates are egocentrically misrepresenting safety. For this reason, I’ve been researching the history of adoption of a safety mentality, to see how I can promote a safety-first culture. Otherwise I think we (both my company, and the industry as a whole) risk prioritizing egocentric, short-term goals over societal benefit and long-term goals.
Observations on the history of adopting “Safety First” mentalities
I’ve been looking at the human history about adoption of safety culture, and invariably, it seems to me that safety mindsets are adopted only after loss, usually loss of human life. It is described anecdotally in the paper associated with this post.
Emphasis added by me.
NOTE: I could not find any indication of loss of human life attributed to Three Mile Island, but both Chernobyl and Fukushima happened after Three Mile Island, and both did result in loss of human life. It’s also important to note that both Chernobyl and Fukushima were both classed INES Level 7, compared to Three Mile Island which was classed INES Level 5. This evidence is contradictory to what was in the quoted part of the paper. (And, sadly, I think supports an argument that Goodhart’s Curse is in play… that safety regressed to the mean… that by establishing minimum safety criteria instead of a safety culture, certain disasters not only could not be avoided but were more pronounced than previous disasters.) So both of the worst reactor disasters in human history occurred after the safety cultures that were promoted following Three Mile Island.[1][2] The list of nuclear accidents is longer than this, but not all accidents result in loss.[3][2:1] (This is something that I’ve been looking at for a while, to inform my predictions about the probability of humans adopting AI safety practices with regards to pre- or post- AI disasters.)
Personal contribution and advocacy
In my personal capacity (read: area of employment) I’m advocating for adversarial testing of AI chatbots. I am highlighting the “accidents” that have already occurred: Microsoft Tay Tweets[4], SnapChat AI Chatbot[5], Tessa Wellness Chatbot[6], Chai Eliza Chatbot[7].
I am promoting the mindset that if we want to be successful with artificial intelligence, and do not want to become a news article, that we should test expressly for ways that the chatbot can be diverted from the chatbots primary function, and design (or train) fixes for those problems. It requires creativity, persistence and patience… but the alternative is that one day, we might be in the news if we fail to proactively address the challenges that obviously face anyone who is trying to use artificial intelligence.
And, like my advocacy about looking at what values a culture should have that wants to adopt a pro-remote culture and be successful at it, we should look at what values a culture should have that wants to adopt a pro-safety-first culture and be successful at it.
I’ll be cross posting the original paper to my work. Thank you for sharing.
DISCLAIMER: AI was used to quality check my post, assessing for consistency, logic and soundness in reasoning and presentation styles. No part of the writing was authored by AI.
https://www.processindustryforum.com/energy/five-worst-nuclear-disasters-history
https://en.wikipedia.org/wiki/Nuclear_and_radiation_accidents_and_incidents
https://ieer.org/resource/factsheets/table-nuclear-reactor-accidents/
https://en.wikipedia.org/wiki/Tay_(chatbot)
https://www.washingtonpost.com/technology/2023/03/14/snapchat-myai/
https://www.nytimes.com/2023/06/08/us/ai-chatbot-tessa-eating-disorders-association.html
https://www.complex.com/life/father-dies-by-suicide-conversing-with-ai-chatbot-wife-blames
Thanks, this is great commentary.
On your point about safety culture after 3MI, when it took hold, and regression to the mean, see this article: https://www.thenation.com/article/archive/after-three-mile-island-rise-and-fall-nuclear-safety-culture/ Also, for more background about post-3MI safety, see this report: https://inis.iaea.org/collection/NCLCollectionStore/_Public/34/007/34007188.pdf?r=1&r=1