Nathan Helm-Burger comments on The Geometry of Feelings and Nonsense in Large Language Models

Nathan Helm-Burger 28 Sep 2024 0:12 UTC
6 points
The popular, well-known similarity/distance metrics and clustering algorithms are not nearly as good as the best ones. I think it’d be interesting to see what the results look like using some better, newer, lesser-known metrics.
Examples:
- PaCMAP: a better UMAP
- DIEM: a better cosine similarity (see also the video explanation)
- cosine similarity with cut initialization: a better cosine similarity
- Technique for Order Performance by Similarity to Ideal Solution (TOPSIS): another better cosine similarity
- TS-SS Similarity: yet another better cosine similarity
- Vector Space Model
- Fusion-based semantic similarity
- Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization
- Comparing in context: Improving cosine similarity measures with a metric tensor
- Improved sqrt-cosine similarity measurement
- Improved Heterogeneous Distance Functions
- Encyclopedia of Distances: in case you just can’t get enough and want a whole book of distance measures!
I don’t actually know if any of these would perform better, or how they rank relative to each other for this purpose. Just wanted to give some starting points.
In case you want to google for ‘a better version of x technique’, here’s a list of a bunch of older techniques: https://rapidfork.medium.com/various-similarity-metrics-for-vector-data-and-language-embeddings-23a745f7f5a7
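For anyone who wants to try swapping metrics, here is a minimal sketch of how a comparison could be set up. It ranks a few toy vectors by plain cosine similarity and by the improved sqrt-cosine measure as I understand it (sum of sqrt(x_i * y_i), divided by sqrt(sum x) * sqrt(sum y)). That formula is my reading of the measure and should be checked against the paper. It also assumes non-negative entries, so it would not apply directly to embeddings with negative components. The function names and toy data are hypothetical, purely for illustration.

```python
import math


def cosine_similarity(x, y):
    """Standard cosine similarity: dot(x, y) / (|x| * |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def improved_sqrt_cosine(x, y):
    """Improved sqrt-cosine, as assumed here (verify against the paper):
    sum(sqrt(x_i * y_i)) / (sqrt(sum(x)) * sqrt(sum(y))).
    Only defined for vectors with non-negative entries.
    """
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("improved_sqrt_cosine assumes non-negative entries")
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    denominator = math.sqrt(sum(x)) * math.sqrt(sum(y))
    return numerator / denominator


def rank_neighbors(query, candidates, similarity):
    """Return candidate names sorted from most to least similar to query."""
    return sorted(
        candidates,
        key=lambda name: similarity(query, candidates[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy, made-up vectors; real use would substitute actual feature vectors.
    query = [0.7, 0.2, 0.1]
    candidates = {
        "near": [0.6, 0.3, 0.1],
        "far": [0.1, 0.1, 0.8],
    }
    for metric in (cosine_similarity, improved_sqrt_cosine):
        print(metric.__name__, rank_neighbors(query, candidates, metric))
```

Because `rank_neighbors` takes the similarity function as an argument, any of the other measures could be dropped in the same way, and the resulting rankings compared side by side.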