On interviews, I had a great deal of success hiring for clerical assistant positions by simply getting the interviewees to do a simple problem in front of us. It turned out to be a great, reliable and easy-to-justify sorter of candidates.
But, of course, it was neither unstructured nor much of an “interview” as such.
Again, test not interview. Their GPA is an average measure of maybe thousands of such simple problems—probably on average more rigorously produced, presented, and corrected than your problem presented in the interview.
Deciding based on a test in person instead of deciding on a number that represents thousands of such individual tests smacks of anecdotal decision-making.
Unfortunately, GPAs can lie. You cannot be certain of the quality of the problems and evaluation that was averaged to produce the GPA. So running your own test of known difficulty works well to verify what you see on the resume.
For example, I have to hire programmers. We give all incoming programmers a few relatively easy programming problems as part of the interview process because we’ve found that no matter what the resume says, it’s possible that they actually do not know how to program.
Good resume + good interview result is a much stronger indicator than good resume alone.
A significant problem is the weighting of certain courses, particularly Advanced Placement ones. A GPA of 3.7, which looks quite respectable to the unaware, can be obtained with work averaging 83%, and that’s assuming the class didn’t offer extra credit.
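To make the arithmetic concrete, here is a minimal sketch under an assumed (but common) weighting scheme: regular A/B/C map to 4/3/2 points, AP courses get a one-point bonus, and 83% counts as a B. The exact cutoffs and bonus vary by school, so this is purely illustrative.

```python
def grade_points(percent, ap=False):
    # Assumed grade cutoffs: 90+ = A, 80+ = B, 70+ = C; schools vary.
    if percent >= 90:
        points = 4.0
    elif percent >= 80:
        points = 3.0
    elif percent >= 70:
        points = 2.0
    else:
        points = 1.0
    return points + (1.0 if ap else 0.0)  # assumed +1.0 AP/Honors bonus

# Hypothetical schedule: 7 weighted AP courses and 3 regular ones, all at 83% (a B).
schedule = [(83, True)] * 7 + [(83, False)] * 3
gpa = sum(grade_points(p, ap) for p, ap in schedule) / len(schedule)
print(gpa)  # 3.7 -- a "respectable" GPA from uniformly 83% work
```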
I don’t think he is likely to hire programmers straight out of high school.
Giving IB/AP/Honors classes extra weight in high school is necessary to offset the additional difficulty of these classes. Otherwise, high school students would have a direct disincentive to take advanced classes.
A swift googling brings up this forthcoming study of about 900 high schools in Texas:

Despite conventional wisdom to the contrary, grade weighting is not the primary factor driving students to increase their AP course-taking. Moreover, a lack of institutional knowledge about the importance of grade-weighting does not have a practically significant adverse impact on students with low historical participation rates in AP, although low income students are marginally less responsive to increases in the AP grade weight than others. The minimal connection between AP grade weights and course-taking behavior may explain why schools tinker with their weights, making changes in the hopes of finding the sweet spot that elicits the desired student AP-taking rates. The results presented here suggest that there is no sweet spot and that schools should look elsewhere for ways to increase participation in rigorous courses.
But there’s still the additional incentive of prestige and signalling, isn’t there? That should be enough for the serious scholar. It’s a significant problem when non-AP-labelled courses are often passed over for the purpose of a cheap grade boost.
Since when did greater rigour and averaging of more problems imply greater degree of correlation with performance at one specific job?
I call halo effect here. Greater rigour, bigger number, more accurate, more corrected: all these really ‘good’ qualities of the GPA value spill over into your feeling of how well it’ll correlate with performance at a specific job, versus a ‘bad’, ill-measured value.
Truth is, say, hand size ill-measured by eyeballing can easily correlate better with measured finger length than body weight measured on ultra-high-precision scientific scales with an accuracy of a milligram (microgram, nanogram, whatever). Just because a hammer is a tool you build things with, and a butter knife is a kitchen utensil, doesn’t make the hammer better than the butter knife as a screwdriver.
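A toy simulation of that point, with all numbers invented purely for illustration: a crudely eyeballed measurement of a relevant quantity can out-correlate a precisely measured but only weakly related one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
finger_length = rng.normal(8.0, 0.5, n)                             # the target quantity (cm)
hand_size_eyeballed = 2.2 * finger_length + rng.normal(0, 1.0, n)   # crude but relevant proxy
body_weight_precise = 5.0 * finger_length + rng.normal(0, 15.0, n)  # weakly related, measured "precisely"

print(np.corrcoef(hand_size_eyeballed, finger_length)[0, 1])  # roughly 0.7
print(np.corrcoef(body_weight_precise, finger_length)[0, 1])  # roughly 0.15
```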
Well, actually...
But more to the point, you’d need to justify that the test you give is more correlated with performance than GPA is—this is why I support simple programming tests (they demonstrably are more correlated than academic indicators), but for a ‘clerical assistant’ position as described above no specific test immediately springs to mind, and so it’s suspect.
You aren’t usually looking for ‘correlation’; you’re looking to screen out the serial job applicant who can’t do the job they’re applying for (and keeps re-applying to many places)… just ask ’em to do some work similar to what they will be doing, as per LorenzofromOz’s method, and you’ll at least be assured they can do the work. With GPA you won’t be assured of anything whatsoever.
For programming, the simplest, dumbest check works to screen out those entirely incapable, where screening by PhD would not.
http://www.codinghorror.com/blog/2007/02/why-cant-programmers-program.html
A PhD might correlate better with performance than fizzbuzz does (the latter being a binary test of extremely basic knowledge), but a PhD does not screen out those who will just waste your time, and fizzbuzz (your personal variation of it) does.
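For reference, the task from the linked Coding Horror post is roughly: print the numbers 1 to 100, but print ‘Fizz’ for multiples of three, ‘Buzz’ for multiples of five, and ‘FizzBuzz’ for multiples of both. A minimal solution:

```python
# FizzBuzz: the floor-level screen being discussed.
for i in range(1, 101):
    if i % 15 == 0:
        print("FizzBuzz")
    elif i % 3 == 0:
        print("Fizz")
    elif i % 5 == 0:
        print("Buzz")
    else:
        print(i)
```

The point of using ‘your personal variation of it’ is that a candidate can memorise this exact problem; a slight twist forces them to actually reason it out.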
Holy crap… I think I had read about the FizzBuzz thing a while ago, but I didn’t remember about the 199 in 200 thing… Would it be possible to sue the institutions issuing those PhDs or something? :-)
Well, I don’t know what % of the CS-related PhDs can’t do FizzBuzz, maybe the percentage is rather small. (Also, sue for what? You are not their client. The incapable dude that was given a degree, that’s their client. Your over-valuation of this degree as evidence of capability is your own problem)
The issue is that, as Joel explains, the job applicants are a sample extremely biased towards incompetence:
http://www.joelonsoftware.com/items/2005/01/27.html
[Though I would think that the incompetents with degrees would be more able to find an incompetent employer to work at. And PhDs should be able to find a company that hires PhDs for signalling reasons.]
The issue with the hiring methods here is that we easily confuse “more accurate measurement of X” with “stronger correlation to Y”, and “stronger correlation to Y” with hiring better staff (the kind that doesn’t sink your company), usually out of some dramatically different population than the one on which the correlation was found.
Furthermore, a ‘correlation’ is such an inexact measure of how a test relates to performance. Comparing correlations is like comparing apples to oranges by weight. The ‘fizzbuzz’ style problems measure performance near the absolute floor level, but with very high reliability. Virtually no-one who fails fizzbuzz is a good hire. Virtually no-one who passes fizzbuzz (a unique fizzbuzz, not the popular one) is completely incapable of programming. Degrees correlate with performance at a higher level, but with very low reliability—there are brilliant people with degrees, there are complete incompetents with degrees, and there are brilliant people and incompetents without degrees.
edit: other example:
http://blog.rethinkdb.com/will-the-real-programmers-please-stand-up
Reversing a linked list is a good one unless the candidate already knows how. See, the issue is that educational institutions don’t teach how to think up a way to reverse a linked list, nor do they test for that. They might teach how to reverse the linked list, then test whether the person can reverse the linked list. Some people learn to think up a way to solve such problems; some don’t. It’s entirely incidental.
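For what it’s worth, here is a minimal sketch of the standard iterative reversal (in Python, with a throwaway Node class used only for illustration); the sort of answer one hopes a candidate can derive rather than recall:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def reverse(head):
    prev = None
    while head is not None:
        nxt = head.next      # remember the rest of the list
        head.next = prev     # point this node backwards
        prev, head = head, nxt
    return prev              # new head of the reversed list

# build 1 -> 2 -> 3, reverse it, print 3 2 1
head = Node(1, Node(2, Node(3)))
head = reverse(head)
while head:
    print(head.value)
    head = head.next
```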