Intuitively, you might think that Amazon cares about people being able to enter their banking details as easily as possible. In reality, at least Amazon.de doesn’t seem to care. Instead of instantly validating with JavaScript that an entered IBAN is valid, the form withholds feedback until the user clicks okay.
I just used Uber for the first time, and the task of entering street addresses is awful.
There’s no autocomplete for street names that allows me to add the street number.
Berlin is a big city, yet the first hits in the proposed list are streets with the same name in other cities.
The suggestions that Uber gives don’t include postal codes, which are important because multiple streets in Berlin share the same name.
There seems to be no obvious way to select a street name and then choose on a map where on the street you want to be picked up. After a bit of searching, I find that I can tap the street name to set the pickup point, but the touchable area is very small and could easily be expanded into the empty whitespace above it. For the destination, there still seems to be no way to add more detail to the autocompleted street name.
If I type the name of a contact (someone in my contacts), their address isn’t shown.
How is it that those companies employ thousands of software developers yet manage to do so badly at providing basic functions to users? What are all those engineers doing?
You may be confused about what the term “core functionality” means. I’ve worked on a couple of pretty major sites, and universally they carefully measure user behavior and any blockers to revenue. And then make cost/benefit tradeoffs about how to improve those measurements.
My suspicion is that the features you wish were implemented are BOTH harder than they appear AND imperfect implementations are worse (in terms of measured usage) than the current lack. For mapping, I don’t know anyone at Uber, but I do know a few geolocation and maps engineers at other companies, and I have some sense of the crazy diversity of confusion that users show when a UI is optimized for even a slightly different use case than they have in mind. I’d bet that many/most destinations are copy/paste or direct link from another app, rather than entered by hand.
That said, even with thousands of engineers (perhaps especially!), there are always blind spots which “everyone knows” we should fix, but it’s just a bit too small for management to notice and slightly too big for anyone to do it unilaterally. These always suck, and the best engineers and managers can recognize it and use their personal influence to get a few of them done.
I think entering payment information is a basic function of any online shop, and thus I have a hard time not seeing it as part of the core functionality.
On the other hand, getting European IBANs entered easily might not be a core issue for an engineer at Amazon.com, and there may be no team at Amazon.de, which has far fewer developers, that’s responsible for it.
Even if that’s true, some people might copy-paste street name + street number, and Uber will deliver them to the wrong address without giving them any warning beforehand. In that case, showing the user that there are multiple possible addresses seems even more important.
It might be that bringing people to the wrong address isn’t captured in the metrics, and that reducing the amount of information shown makes people book slightly faster.
The basic functions from your perspective are different from the basic functions from the company’s perspective, and the company’s perspective is necessarily limited.
I’ve been considering a set of thoughts related to this for a while, which summarize, only slightly misleadingly, as “Caring is expensive.” By caring, I mean in a very general way, rather than in a specific way; the desire to make or do something that is better than what is required. It’s a very fuzzy concept, as I’m still working on trying to identify more precisely what it is I mean, so be warned that the ideas here are relatively unrefined.
Suppose you own a company, and you need to hire somebody to deal with office supplies; it’s expensive to hire somebody who actually cares whether the company pays $1.00 or $10.00 per stapler without that being a specific thing you expect of them; odds are, the person making that decision cares more about the ergonomics of the stapler, or the color. What’s very expensive is somebody who cares about all of that; it’s cheaper to hire somebody who will pay $1.00 for staplers, regardless of anything else, than to hire somebody who will actually make the decision you would make yourself, which takes into account both the price and the ergonomics (and maybe even the style) to make a decision that reflects a broad range of concerns.
The problem scales with the size of your company, as well; if you aren’t making the hiring decisions, you need to hire somebody who cares about whether the people they hire themselves care.
(Not to mention that the incentive structures, as the size scales up, start to work against caring.)
The approach a lot of companies take, instead, is to try to de-emphasize “caring” as much as possible, and emphasize metrics and goals, which are attached to incentives; that is, you maybe get paid more if you do certain things, and since you care about getting paid more, you by association care about the things you’re assigned to do. And these are invariably incomplete. Dagon mentions use cases; if somebody doesn’t define the use-case that includes what you are trying to do, the effect is that nobody is assigned to actually ensure that that use-case is handled. Nobody is assigned to “care” about that, so nobody does.
Couldn’t you make them care by making their pay dependent on how well they predict what you would decide, as measured by you redoing the decision for a representative sample of tasks?
Ability to predict is distinct from actual caring. It doesn’t get you around people trying to Goodhart metrics.
How would you Goodhart this metric? To be clear: you want to map x to f(x), but this takes a second of your time per x. You pay them to map x to f(x), but they map x to g(x). After they’re done mapping 100 x to g(x), you select a random 10 of those 100, spend 10 seconds calculating the corresponding f(x), and pay them more the smaller the absolute difference g(x) − f(x).
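To make the scheme concrete, here’s a minimal sketch in TypeScript; all names are hypothetical, and it assumes numeric outputs so the deviation is just an absolute difference:

```typescript
// Toy sketch of the audit scheme above (all names hypothetical).
// The agent maps each task x to g(x); the principal re-derives f(x)
// for a small random sample and pays more the closer g is to f.
function auditPay(
  tasks: number[],                  // the 100 inputs x
  g: (x: number) => number,         // what the agent actually did
  f: (x: number) => number,         // what the principal would have done
  sampleSize: number,
  basePay: number
): number {
  // Sample with replacement, for simplicity.
  const sample = Array.from({ length: sampleSize }, () =>
    tasks[Math.floor(Math.random() * tasks.length)]
  );
  const meanError =
    sample.reduce((sum, x) => sum + Math.abs(g(x) - f(x)), 0) / sample.length;
  return basePay / (1 + meanError); // pay shrinks as the deviation grows
}
```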
I’m trying to think through your point using the above stapler and office supplies example.
If you hired an aggressively lazy AI to buy office supplies for you and you told it: “Buy me a stapler.”
Then it might buy a stapler from the nearest trash can in exchange for one speck of dirt.
Then, you would go to Staples and buy a new ergonomically friendly stapler that can handle 20 sheets (model XYZ from Brand ABC) using your credit card with free five day shipping.
You proposed that we would reward the AI by calculating the distance between the two sets of actions.
I don’t see a way for you to avoid having to invent a friendly AI to solve the problem.
Otherwise, you will inevitably leave out a metric (e.g. Oops, I didn’t have a metric for don’t-speed-on-roads so now the robot has a thousand dollar speeding ticket after optimizing for shipping speed).
I suppose for messy real-world tasks, you can’t define distances objectively ahead of time. You could simply check a random 10 (x,f(x)) and choose how much to pay. In an ideal world, if they think you’re being unfair they can stop working for you. In this world where giving someone a job is a favor, they could go to a judge to have your judgement checked.
Though if we’re talking about AIs: You could have the AI output a probability distribution g(x) over possible f(x) for each of the 100 x. Then for a random 10 x, you generate an f(x) and reward the AI according to how much probability it assigned to what you generated.
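A minimal sketch of that scoring step, assuming discrete answers and using the logarithmic scoring rule (one standard proper choice; the comment just says “reward according to the probability assigned”):

```typescript
// The AI reports a probability distribution over possible answers for x;
// after the principal generates their own f(x), the AI is scored by how
// much probability it put on that answer.
type Distribution = Map<string, number>; // answer -> probability (sums to 1)

function logScore(reported: Distribution, principalAnswer: string): number {
  const p = reported.get(principalAnswer) ?? 0;
  // Higher (less negative) when more probability was on the principal's answer;
  // the small floor avoids -Infinity for answers assigned zero probability.
  return Math.log(Math.max(p, 1e-12));
}
```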
Then I have a better answer for the question about how I would Goodhart things.
Let U_Gurkenglas (U_G) = the set of all possible x that may be randomly checked, as you described.
Let U = the set of all possible metrics in the real world (a superset of U_G).
For a given action A, optimize for U_G and ignore the metrics in U that are not in U_G.
You will be unhappy to the degree that the ignored subset contains things you wish were in U_G. But until you catch on, you will be completely satisfied by the perfect handling of every x in U_G.
To put this in concrete terms, if you don’t have a metric for “nitrous oxide emissions” because it’s the 1800s, then you won’t have any way to disincentivize an employee who races around the countryside driving a diesel truck that ruins the air.
That’s a bad example. In the 1800s, no amount of caring would have resulted in the person choosing a vehicle with lower nitrous oxide emissions, because that simply wasn’t among the things people thought about.
A big piece is that companies are extremely siloed by default. It’s pretty easy for a team to improve things in their silo, it’s significantly harder to improve something that requires two teams, it’s nearly impossible to reach beyond that.
Uber is particularly siloed: they have a huge number of microservices with small teams, at least according to their engineering talks on YouTube. Address validation is probably a separate service from anything related to maps, which in turn is separate from contacts.
Because of silos, companies have to make an extraordinary effort to actually end up with good UX. Apple was an example of this, where it was literally driven by the founder & CEO of the company. Tumblr was known for this as well. But from what I heard, Travis was more of a logistics person than a UX person, etc.
(I don’t think silos explain the bank validation issue)
It might very well be that there’s no team responsible for IBAN validation at Amazon, as the team responsible for payment information might be an Amazon.com team that doesn’t care about IBANs, which only matter for Amazon in Europe.
Also, they apparently did write code to validate it. And if almost everyone puts in either a correct IBAN or a correctly formatted but typoed wrong IBAN, it might be that no one has ever complained to Amazon about having to wait for server-side verification. There’s probably a low-priority bug ticket written by some tester sitting on their backlog, but without customer complaints or measurable loss of business it won’t get worked on.
There are two levels of IBAN validation. The simple one is a checksum: rearrange the IBAN, convert letters to numbers, and the result taken modulo 97 should be 1. The more complex one involves querying a database of valid IBANs that is multiple MB big. You can easily do the mod-97 check client-side for instant feedback and then do the more complex check on your server.
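For illustration, a minimal client-side sketch of the mod-97 check in TypeScript (the function name and the length regex are my own; the heavier database check would still run server-side):

```typescript
// Checksum validation per the standard mod-97 scheme: move the first four
// characters (country code + check digits) to the end, map letters to
// numbers (A=10 ... Z=35), and a valid IBAN leaves remainder 1 mod 97.
function isValidIbanChecksum(iban: string): boolean {
  const s = iban.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(s)) return false;
  const rearranged = s.slice(4) + s.slice(0, 4);
  const digits = rearranged.replace(/[A-Z]/g, c =>
    (c.charCodeAt(0) - 55).toString()
  );
  // Compute mod 97 digit by digit to avoid overflowing Number.
  let remainder = 0;
  for (const d of digits) {
    remainder = (remainder * 10 + Number(d)) % 97;
  }
  return remainder === 1;
}

console.log(isValidIbanChecksum("DE89 3704 0044 0532 0130 00")); // true
```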
I’m relatively certain (for non-public reasons) that this produces at least 100 customer complaints/year and there’s a good chance that it produces 1000s. But there’s no tracking.
If you type a 20-digit number and make a single mistake, it’s unfun to submit the form and then have to retype everything.
It’s also not as if this is the only thing wrong with the IBAN page. For example, it’s easier to type 20-digit numbers correctly when there’s a little space after every fourth digit.
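A tiny sketch of such a display helper (hypothetical; it groups the IBAN into the usual blocks of four):

```typescript
// Hypothetical display helper: insert a space after every fourth character,
// e.g. "DE89370400440532013000" -> "DE89 3704 0044 0532 0130 00".
function formatIban(iban: string): string {
  return iban.replace(/\s+/g, "").replace(/(.{4})/g, "$1 ").trim();
}
```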
There’s also something less visible wrong with the page that leads to problems. I don’t want to discuss it here, but in case anyone at Amazon reads this and wants to do something about it, I’m happy to share.