Maybe it was too hard.
Here’s another problem that might be easier. Make an O(n log n) sorting algorithm that’s simple, stable, and in place. Today you can only get two out of three (merge sort isn’t in place, heap sort isn’t stable, and block sort isn’t simple).
I’ve read some papers (Trabb Pardo, Huang-Langston 1, Huang-Langston 2, Munro-Raman-Salowe, Katajainen-Pasanen 1, Katajainen-Pasanen 2) and there seems to be a “wall” at sqrt(n) extra space. If we have that much space, we can write a reasonable-looking merge sort or quicksort that’s stable, in place, and O(n log n). But if we have strictly O(1) space, the algorithms become much more complicated, using a sqrt(n) buffer inside the array to encode information about the rest. Breaking through that wall would be very interesting to me.
This claims to be a stable in-place sort and doesn’t seem outrageously complicated. I haven’t verified that it is stable, in-place, or in fact a sort at all.
It’s O(n log^2 n) because it merges subarrays with something like STL’s inplace_merge, which is O(n log n) when no extra buffer is available. Devising an O(n) in-place merge, and thus an O(n log n) merge sort, is much harder. GrailSort and WikiSort have working implementations; both are over 500 lines.
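For concreteness, here’s a minimal sketch of that style of sort: a bottom-up merge sort whose merge step is std::inplace_merge (the function name is mine). One caveat: typical STL implementations let inplace_merge allocate a temporary buffer when it can get one, so the strictly O(1)-space, O(n log n)-per-merge behavior is really the no-buffer fallback path.

```cpp
#include <algorithm>
#include <cstddef>

// Stable bottom-up merge sort; O(n log^2 n) comparisons when
// inplace_merge runs without scratch memory. Each pass doubles the
// width of the already-sorted runs and merges adjacent pairs.
template <typename T>
void inplace_mergesort(T* a, std::size_t n) {
    for (std::size_t w = 1; w < n; w *= 2)
        for (std::size_t lo = 0; lo + w < n; lo += 2 * w)
            std::inplace_merge(a + lo, a + lo + w,
                               a + std::min(lo + 2 * w, n));
}
```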
Ah, good catch.
I spent a day on this problem and now I understand why block sort has to be so complex. Here are two simpler problems that are still surprisingly hard:
1) Given an array [a1, a2, …, an, b1, b2, …, bn], rearrange it into [a1, b1, a2, b2, …, an, bn] using O(n) time and O(1) space.
2) Given an array containing only zeroes and ones, sort them stably in O(n log n) time and O(1) space.
Both are special cases of sorting, and I have beautiful solutions to both, but they are completely different!
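I don’t know if these are the same beautiful solutions, but here are sketches of the approaches I’m aware of (helper names are mine). For 1), there’s a cycle-leader trick which I believe is due to Peiyush Jain: a block of size 3^k − 1 can be shuffled in place by following the permutation p → 2p mod 3^k, whose cycles are led by the powers of 3, and one rotation reduces the general case to such blocks.

```cpp
#include <algorithm>
#include <cstddef>

// Cycle-leader pass for the permutation p -> 2p mod (len + 1), valid
// when len + 1 is a power of 3. The cycles are led by 1, 3, 9, ...
// because 2 generates the multiplicative group mod 3^k.
template <typename T>
void cycle_leader(T* a, std::size_t len) {
    for (std::size_t leader = 1; leader < len; leader *= 3) {
        std::size_t p = leader;
        T v = a[p - 1];
        do {
            p = 2 * p % (len + 1);
            std::swap(v, a[p - 1]);  // drop the carried value, pick up the next
        } while (p != leader);
    }
}

// [x1..xm, y1..ym] -> [y1, x1, y2, x2, ..., ym, xm] in O(m) time and
// O(1) space: repeatedly peel off a self-contained block of size 3^k - 1.
template <typename T>
void in_shuffle(T* a, std::size_t m) {
    while (m > 0) {
        std::size_t h = 1;  // largest h <= m such that 2h + 1 is a power of 3
        while ((2 * h + 1) * 3 <= 2 * m + 1)
            h = ((2 * h + 1) * 3 - 1) / 2;
        std::rotate(a + h, a + m, a + m + h);  // bring y1..yh next to x1..xh
        cycle_leader(a, 2 * h);                // shuffle the leading 2h elements
        a += 2 * h;                            // the rest is a smaller instance
        m -= h;
    }
}

// Problem 1 as stated: [a1..an, b1..bn] -> [a1, b1, a2, b2, ..., an, bn].
// a1 and bn are already in place, and the inner 2(n-1) elements form
// exactly an in-shuffle instance.
template <typename T>
void interleave(T* a, std::size_t n) {
    if (n > 1) in_shuffle(a + 1, n - 1);
}
```

For 2), a bottom-up structure works: after the pass for width w, every block of w elements is stably sorted (zeroes before ones), and two adjacent sorted blocks can be merged with a single rotation, giving O(n log n) time and O(1) space overall.

```cpp
#include <algorithm>
#include <cstddef>

// Stably sort an array of zeroes and ones in O(n log n) time and O(1)
// space. "Stable" is meaningful once you imagine satellite data on the
// elements: rotations preserve the relative order within the zeroes
// and within the ones.
void stable_binary_sort(int* a, std::size_t n) {
    for (std::size_t w = 1; w < n; w *= 2) {
        for (std::size_t lo = 0; lo + w < n; lo += 2 * w) {
            std::size_t mid = lo + w;
            std::size_t hi = std::min(lo + 2 * w, n);
            std::size_t p = lo;   // first one in the left block
            while (p < mid && a[p] == 0) ++p;
            std::size_t q = mid;  // one past the last zero in the right block
            while (q < hi && a[q] == 0) ++q;
            std::rotate(a + p, a + mid, a + q);  // slide the ones past the zeroes
        }
    }
}
```

And indeed the two sketches share almost nothing beyond std::rotate.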
If an array of length n contains k distinct values, I can do this:
1) Gather the first occurrence of each value into a contiguous buffer.
2) Divide the array into subarrays of length k and merge sort them, using the buffer as swap space.
3) Merge the subarrays with inplace_merge.
The first two steps can always be done in O(n log n). The third step seems to be okay when either k or n/k is bounded by a constant, but I’m not sure about the general case. Thinking some more.
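To spell out the merge primitive behind step 2 (a sketch; the helper name is mine): a Kronrod-style internal-buffer merge uses only swaps, so the buffer’s contents are permuted but never duplicated or destroyed, and sorting the buffer afterwards restores it exactly because its k values are pairwise distinct.

```cpp
#include <algorithm>
#include <cstddef>

// Stably merge the sorted runs [lo, mid) and [mid, hi) using only
// swaps, with buf a disjoint scratch region of length >= mid - lo.
template <typename T>
void buffered_merge(T* a, std::size_t lo, std::size_t mid, std::size_t hi,
                    T* buf) {
    std::size_t len = mid - lo;
    std::swap_ranges(a + lo, a + mid, buf);  // park the left run in the buffer
    std::size_t i = 0, j = mid, w = lo;
    while (i < len && j < hi) {
        if (buf[i] <= a[j])             // ties take the left element: stability
            std::swap(a[w++], buf[i++]);
        else
            std::swap(a[w++], a[j++]);  // w < j here, so nothing unread is clobbered
    }
    while (i < len)                     // right run exhausted: flush the rest
        std::swap(a[w++], buf[i++]);
    // If the left run ran out first, w == j and the tail of the right run
    // is already in place. Either way buf ends up holding the same multiset
    // of values it started with, in some order.
}
```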
This is an EXCELLENT problem. The industry needs this kind of algorithm very much. Three quarters of the industry is too stupid to use it, but one quarter would be delighted.
Unfortunately, the most likely resolution of this problem is a proof that this kind of sorting algorithm doesn’t exist.
What do you mean by “in place”? Can’t you make the merges be in place, at the cost of more writes?
What does “simple” mean here?
Just use any definition that feels reasonable to you. If you have two solutions that are simple under different definitions, I want to see both!
I’m not seeing how it’s an issue if an algorithm isn’t simple, so I’m interested in why you consider a certain simplicity to be desirable.
A simpler algorithm runs faster, and since these things occasionally get used for practical purposes, that might matter. And some people have an aesthetic preference for simpler models, maybe related to having learned Occam’s razor too early.
No, not necessarily.
This is going nowhere; I came to buy chocolate, not to argue its merits over vanilla. If you haven’t got chocolate, let’s wrap this up.