It’s not a novel algorithm type, just a learning project I did while learning ML frameworks: a fairly simple LSTM plus one dense layer, trained on the predictions and resolutions of about 60% of the resolved predictions from PredictionBook as of September last year (which doesn’t include any of the ones in the contest). The remaining resolved predictions were used for cross-validation or set aside as a test set. An even simpler RNN performs only very slightly worse, though.
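For concreteness, here’s a minimal Keras sketch of what “LSTM + one dense layer” over a sequence of probability assignments could look like; the layer sizes, sequence cap, and padding scheme are placeholders I’ve made up for illustration, not the values the actual repo uses:

```python
# Minimal sketch of an "LSTM + one dense layer" calibration model in Keras.
# Shapes and hyperparameters are illustrative placeholders, not the repo's values.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

MAX_ASSIGNMENTS = 20  # hypothetical cap on probability assignments per proposition

model = keras.Sequential([
    # Each timestep is one probability assignment; shorter sequences are padded with -1.
    layers.Masking(mask_value=-1.0, input_shape=(MAX_ASSIGNMENTS, 1)),
    layers.LSTM(16),
    # Single sigmoid output: predicted probability that the proposition resolves true.
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Example input: a proposition with three assignments (0.6, 0.7, 0.9), padded out.
x = np.full((1, MAX_ASSIGNMENTS, 1), -1.0)
x[0, :3, 0] = [0.6, 0.7, 0.9]
print(model.predict(x))  # untrained output; illustrative only
```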
The details of how the algorithm works are thus somewhat opaque, but from observing how it reacts to input, it seems to lean on the average, weight predictions later in the sequence more heavily (so order matters), and grow more confident as the number of predictions increases, while treating propositions with only one probability assignment as probably being heavily overconfident. It seems to have more or less learnt on its own the insight Tetlock pointed out. Disagreement might also matter to it; I’m not sure.
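Roughly the kind of poking at it I mean, continuing the hypothetical sketch above (with an untrained model the outputs are meaningless; this just shows the mechanics of the probe):

```python
# Feed the model hand-constructed sequences and compare the outputs to see how
# order, sequence length, and lone assignments shift its prediction.
def pad(assignments, max_len=MAX_ASSIGNMENTS):
    x = np.full((1, max_len, 1), -1.0)
    x[0, :len(assignments), 0] = assignments
    return x

probes = [
    [0.9, 0.7, 0.6],  # confident estimate early, lower ones later
    [0.6, 0.7, 0.9],  # the same estimates in rising order
    [0.9],            # a lone, possibly overconfident assignment
]
for probe in probes:
    print(probe, float(model.predict(pad(probe), verbose=0)[0, 0]))
```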
It’s on GitHub at https://github.com/jbeshir/moonbird-predictor-keras; this doesn’t include the data, which I downloaded using https://github.com/jbeshir/predictionbook-extractor. It’s not particularly tidy though, and still includes a lot of unused functionality for additional input features (the words of the proposition, the time between a probability assignment and the due time, etc.) which I didn’t end up using because the dataset was too small for the model to learn any signal in them.
I’m currently working on making the online frontend to the model retrain it automatically at intervals using freshly resolved predictions, mostly for practice building a simple “online” ML system before I move on to trying to build things with more practical application.
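Something like this loop, very roughly; fetch_resolved_predictions() and build_dataset() here are hypothetical placeholders, not functions from the actual repo:

```python
# Sketch of periodic retraining on freshly resolved predictions.
import time

RETRAIN_INTERVAL_SECONDS = 24 * 60 * 60  # e.g. once a day

def retrain_forever(model):
    while True:
        # Hypothetical helpers: pull newly resolved predictions and turn them
        # into padded sequences of probability assignments plus outcomes.
        records = fetch_resolved_predictions()
        x, y = build_dataset(records)
        model.fit(x, y, epochs=5, verbose=0)    # fine-tune on the fresh data
        model.save("moonbird-latest.keras")     # frontend picks up the new weights
        time.sleep(RETRAIN_INTERVAL_SECONDS)
```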
The main reason I ran figures for it against the contest was that some of its individual confidences seemed strange to me, and while the cross-validation results said it was good, I suspected I was getting something wrong somewhere in the process.
PredictionBook itself has a bunch more than three participants and functions as an always-running contest for calibration, although it’s easy to cheat since it’s possible to make and resolve whatever predictions you want. I also participate in GJ Open, which has an eternally ongoing prediction contest. So there are already places where people who want to compete on a running score can do so.
The objective of the contest was less to bring such an opportunity into existence than to see whether it would incentivise some people who had been “meaning” to practice prediction-making but not gotten to it yet to do so on one of the platforms, by offering a kind of “reason to get around to it now”; the answer was no, though.
I don’t participate much on Metaculus because for my actual, non-contest prediction-making practice I tend to favour predictions that resolve within about six weeks: the longer the time between prediction and resolution, the slower the iteration process for improving calibration. If I predict on 100 things that happen in four years, it takes four years for me to learn whether I’m over- or under-confident at the 90% or so mark, and then another four years to learn whether my reaction to that was an over- or under-reaction. Metaculus seems to favour predictions two to four or more years out, and requires sticking to private predictions if you want to create your own short-term ones in any number, which is interesting for getting a crowd read on the future but doesn’t offer me as much of an opportunity to iterate and improve. It’s a nice project, though.