Subject: Practical machine learning, taking you from the 101 level to knowing the practical edge cases before you use it in a business setting. The parts of MLE you wouldn’t get from coding tutorials or math.
Recommend:
Chip Huyen’s Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications. Very well structured and complete. The GitHub repo also has links to practical production blogs.
Over:
- Machine Learning Engineering by Andriy Burkov. Not quite as complete or as well written as Huyen’s, but it did cover slightly different topics (e.g. you always need cohort plots; use Kaplan-Meier to estimate cohorts; more discussion of agreement metrics; how to calibrate models, and how calibrated models are easier to monitor).
- Machine Learning Design Patterns by Michael Munn, Sara Robinson, and Valliappa Lakshmanan. It was basically only an introduction to BigQuery syntax and GCP offerings, though the section on data splits was a little better than in the other two.
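For anyone who hasn't met the Kaplan-Meier idea mentioned under Burkov's book, here is a minimal sketch of the standard product-limit estimator in plain Python. It is not code from the book. The function name `kaplan_meier` and the toy cohort are assumptions made up for illustration, and a real project would more likely use a survival analysis library.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: time until the event (e.g. churn) or until observation stopped.
    observed:  True if the event happened, False if the record is censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    leaving = Counter(durations)  # events plus censored records leaving at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Multiply by the share of at-risk records that survive time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy cohort (made-up numbers): months until churn, False = still a customer.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
for t, s in kaplan_meier(durations, observed):
    print(f"month {t}: {s:.2f} of the cohort still active")
```

The point of the estimator is that censored records (customers who haven't churned yet) still count in the at-risk denominator until they drop out of observation, so young cohorts don't look artificially healthy.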
I think you’re asking the VLLM to do too much in a single call.
I was trying to get the VLLM to extract GD&T data from the NIST standardized renders of the ASME Y14 tolerancing standards (e.g. screenshotting one page at a time from https://www.nist.gov/system/files/documents/noindex/2022/04/06/nist_ftc_06_asme1_rd.pdf ).
Asking “List all GD&T and their matching element IDs” would totally fail, but the results started to be reasonable when I only asked for 5-8 specific element IDs at a time, and asking for the data of a single element ID at a time with few-shot examples was ~90% accurate.
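Roughly, the narrowing-down looked like the sketch below. This is a reconstruction of the idea rather than the code I ran: `ask_model` is a hypothetical callable standing in for whatever VLLM client you use (it would also attach the page screenshot), and the prompt wording, `chunked`, `build_prompt`, and `extract` are all assumed names for illustration.

```python
def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_prompt(element_ids, examples):
    """Few-shot prompt that asks only about the given element IDs."""
    shots = "\n".join(f"Element {eid}: {answer}" for eid, answer in examples)
    wanted = ", ".join(element_ids)
    return (
        "Extract the GD&T callouts from the attached drawing page.\n"
        f"Examples of the expected format:\n{shots}\n"
        f"Report only these element IDs: {wanted}"
    )


def extract(element_ids, examples, ask_model, batch_size=1):
    """One model call per batch; batch_size=1 means one element ID per call."""
    results = {}
    for batch in chunked(element_ids, batch_size):
        results[tuple(batch)] = ask_model(build_prompt(batch, examples))
    return results


# Stand-in model so the sketch runs without any API; swap in a real client.
def fake_model(prompt):
    return "stub answer for: " + prompt.splitlines()[-1]


# Placeholder few-shot example; use real labelled callouts in practice.
examples = [("E1", "<example answer in the format you want>")]
print(extract(["E2", "E3", "E4"], examples, fake_model, batch_size=1))
```

Setting `batch_size` to something like 5-8 gives the middle ground, and `batch_size=1` is the slow but most accurate setting.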
While right now you have to build the scaffold for them to perform well, eventually an agent could realize it needs to build its own scaffold, or the model could become able to effectively use all parts of the context window.
I’d be curious how well the models do if you wrote out a 7-step list, asked about each step in isolation, and then asked a model to summarize the results of the 7 separate calls.
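A minimal sketch of that experiment, assuming a hypothetical `ask_model` callable that takes a prompt string and returns the model's text. The function name `ask_steps_separately` and the prompt wording are made up for illustration, and the stub model is only there so the example runs on its own.

```python
def ask_steps_separately(steps, ask_model):
    """Ask about each step in its own call, then summarize in a final call."""
    answers = []
    for number, step in enumerate(steps, start=1):
        prompt = f"Consider only this step and nothing else.\nStep {number}: {step}"
        answers.append(ask_model(prompt))
    numbered = "\n".join(f"{i}. {a}" for i, a in enumerate(answers, start=1))
    summary = ask_model("Summarize the results of these separate calls:\n" + numbered)
    return answers, summary


# Stub model that just counts calls; replace with a real client to try it.
call_count = 0


def stub_model(prompt):
    global call_count
    call_count += 1
    return f"answer {call_count}"


steps = [f"placeholder step {i}" for i in range(1, 8)]
answers, summary = ask_steps_separately(steps, stub_model)
print(len(answers), "step answers, then one summary call:", summary)
```

Seven steps cost eight calls this way, so the comparison against a single all-in-one call is also a cost and latency comparison, not just an accuracy one.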