On Scaling Mental Models

DmitryOlshansky · on March 22, 2020

The problem with most great projects written by small group of brilliant people in any language:

- it introduces its own vocabulary, it looks like gibberish until you build up the context

- since the group is small and gifted they have no problem creating and using the power tools of their own

- there is no documentation so your only way is through experimentation and direct communication

- the folks who have built it had to move on to solve another (bigger) problem, leaving no traces of what the vocabulary is, where the design was headed, what should be changed with the new requirements and again next to zero documentation

erik_seaberg · on March 23, 2020

I don't see how learning a vocabulary could ever be harder than continually trying to get the same concepts across without having vocabulary.

hinkley · on March 22, 2020

Circular reasoning is hard to manage out of a project. Circular reasoning that has never been articulated is much worse.

Some of the best ergonomic improvements to my own code have come from trying to explain why my code is so stupid and figuring out that there is a fix that’s easier than this apology. When it’s three people there are no apologies even offered, and almost no feedback.

amboo7 · on March 22, 2020

3. You can actually try and understand most of what's been written through reading, given that you're allowed the time for that. It's an investment, prevents technical debt and promotes progress. No one is born with the mediocre skills of big teams, one can (almost) always improve.

TuringTest · on March 22, 2020

The problem with powerful languages is they're great for allowing you to create new abstractions, but they're terrible at piercing that abstraction and let you know what's going on inside when you need it (such as, when learning what the abstraction means when you find it for the first time).

Current modern languages are built around the false dicotomy that you have to choose between one of two representations, like those in the provided example:

1. mydict1.merge(mydict2, handle_duplicates=(key, v1, v2) => v1 + v2)

2. mydict1 + mydict2

A mid-level language will use the first style so that programmers will know what's going on, but then the team will be stuck to using that style everywhere even when they already understand that it's "oh, just adding two dictionaries". Powerful languages let you use the second style but then, as the article explains, it's hard for a new programmer to tell what's the mental model behind that complex operation.

A good powerful Lisp-like language should allow you to build the second abstraction in terms of the first, and use the simple syntax everywhere, but then switch between both representations easily.

I.e. you should be able to inspect the meaning of the more abstract syntax by expanding its definition in place and see how it works in context. Programmers typically address this shortcoming with support of powerful (yet ad-hoc) IDE's runtime debugging tools. But why that approach is not integrated into the language itself, I have no idea.

hinkley · on March 22, 2020

My most recent Orwellian moment was reading about an old heuristic algorithm with high complexity that has been abandoned as the middle ground between a simpler heuristic and much more complex ones because of its computational complexity.

The problem as I see it is in how they represent the data in a way that makes one think of linear algebra. Refinements brought the naive original implementation down to cubic time, and the literature, as far as I can tell, stops there.

But here’s the thing, when I tried to unpack the logic to figure out what is actually going on under the hood, I realized that a quadratic subproblem could be restated as a sorting problem. Sorting is very much nlogn these days, if not lower due to radix sort being an option.

Which means that this heuristic has the same computational complexity as the naive solution. I’m not saying I’ve discovered anything new, except perhaps that this algorithm is homomorphic to another one. It’s just that it’s so stuck in one representation that you can’t see it for what it is unless you stare pretty hard at it.

wyager · on March 22, 2020

Another strategy: Instead of creating new abstractions willy-nilly (which will probably be "wrong" in some sense you won't discover until much later, when it's too late to drop it), you base your abstractions off something that selects for "correct" low-entropy abstractions, e.g. by stealing ideas from math.

_8ljf · on March 22, 2020

By all means, steal ideas from math. (Great artists, etc.) Math has had a couple thousand years practice in learning how (and how not to) express abstract ideas as formal written text.

But don’t abuse math. Like redefining summation to mean any old thing under the sun. Least of all when it’s a fricking union, for which the math symbol is ‘∪’, not ‘＋’.

noelwelsh · on March 22, 2020

Both set union and number addition are examples of monoids, a maths concept from the field of abstract algebra. So there is a maths abstraction that unites these ideas.

xyelos · on March 22, 2020

Uhh quick limiting thing here (although I agree that the addition symbol is fine): the existence of a unifying math concept alone does not justify using the "+" symbol. By that argument, notating permutation conjugation with a "+" is fine, but that would be a notational sin as the "+" operator is reserved for commutative operations.

However, with that disclaimer, noting that both operations form monoids and both are commutative justifies the "+".

wyager · on March 23, 2020

Haskell doesn't use the + symbol for monoids, it uses <>.

melling · on March 22, 2020

Union symbol is not on the keyboard.

Sure, make it easy to type then we can use it instead of +

However, anyone who has been programming for a while should be able to puzzle out what + is doing.

jchook · on March 22, 2020

Emojis aren’t on the keyboard either but folks don’t seem to have a problem typing them, even on desktop.

Funny that software engineers can figure out complex parallelism and algorithms, but typing ∪ as a function name is next to blasphemy.

carapace · on March 22, 2020

"Computer Science could be called the post-Turing decline in the study of formal systems."

_8ljf · on March 22, 2020

Bad example is bad example. Programmers are awful for using algebraic symbols to describe non-algebraic semantics. Algebra is already its own language; a lingua franca of math. If you’re abusing it for a different purpose then you’re doing both of them wrong.

Otherwise, article is basically arguing we should drag everyone down to the lowest-common denominator “to be safe”, rather than allowing individuals and groups to raise themselves up as appropriate in order to tackle their given problems more effectively.

That’s group-think, authoritarianism, and reducing expert knowledge workers to anonymous plug-n-play “human resources”. What’s the software equivalent to “Churnalism”? Because that’s how you get the software equivalent to “Churnalism”.

Now, if you want to talk about the criticality of Human Communications, and how utterly atrocious many programmers are at it, even amongst their own (never mind with “outsider” stakeholders like users and management), then I’m all on board with that discussion. Good and bad use of abstraction (trade jargon), effective knowledge “transfer” (really reconstruction), and successful team collaboration (especially heterogeneous teams, which are vital to solving the actual problem at hand). This is epically important stuff, and this profession has so much yet to learn.

But rationalizing away “this is how we’ve always done it so this is how we will always do it”? Don’t be surprised when the rest of the world treats you as cheap, easy, disposable code-monkeys, paying you peanuts and zero respect. That’s the deal you bought into yourselves, and you have no-one to blame but yourselves.

topologistics · on March 22, 2020

> What’s the software equivalent to “Churnalism”?

Basically anything consumer-facing from the past 15 years, especially if it's "mobile"

jasode · on March 22, 2020

>you should be able to inspect the meaning of the more abstract syntax by expanding its definition in place and see how it works in context. [...] But why that approach is not integrated into the language itself, I have no idea.

Are you asking why _all_ programming languages don't make their own Abstract Syntax Tree a 1st-class concept for self-reflection and manipulation of its own syntax like Lisp?

Because there are tradeoffs to that approach. If one thinks there are zero tradeoffs to Lisp's philosophy of "code is data", I contend one doesn't fully understand Lisp. Yes, self-inspection is a powerful device but even Lisp programmers[1] keep inventing new languages for others to use that do not have Lisp-like ASTs as 1st-class concepts. I ask people to really think deeply on why that's been happening for decades and comment with any insights.

[1] e.g. Guy Steele a Lisp expert, works on creating the new Java language at Sun in 1994.

TuringTest · on March 23, 2020

> Are you asking why _all_ programming languages don't make their own Abstract Syntax Tree

No, I'm asking why there isn't _any_ programming language (that I'm aware of) with an integrated IDE where you can make static code rewrite expansions (replace an function call instance with its definition) at writing time, and instead depend on waiting until runtime to have an in-memory stack and navigate between functions to see how a function call will run.

kazinator · on March 23, 2020

The main reason is that designers are not even aware of the techniques. You can get CS degree that includes compiler work, and have a head full of LALR(1), without knowing anything about the Lisp approach.

In my Alma Mater, at the time, there was a third year course in programming languages that had Scheme exposure. The follow-up fourth year course in compiler construction was all C hacking with Lex and Yacc, and that was that.

I was able to wriggle out of taking the prerequisite, too.

AzzieElbab · on March 22, 2020

Curious what happens when V1 and V2 cannot be "plused"

mason55 · on March 22, 2020

In a statically-typed language you’d just get a compile error. The compiler knows the types of the values in the map and whether or not the (+) operation is valid for them. You’d likely need a little more code than what’s in the GP post, something restricting the values in your map to types that implement “addable” or whatever the terminology is for your language.

In a dynamically-typed language you’d get some kind of runtime error.

Just think of (+) like any other function and what would happen in your language if you call a function with types that the function doesn’t support.

AzzieElbab · on March 22, 2020

Well, in Scala with default predef it would depend on the order or the arguments. If first one is a string it would just append string representation of the second one. One could use a random generator to guess what would happen in JavaScript. I do not know what Python does in these cases

AzzieElbab · on March 22, 2020

Sorry for being cryptic, the point was that I prefer appenders being explicitly documented in types or method signatures instead of some built-in magic

Terr_ · on March 22, 2020

Two more possibilities to add onto the pile: Transpilers (like Typescript to Javascript) and custom Domain Specific Languages.

mdrachuk · on March 22, 2020

Swift extensions are a perfect way to achieve this.

noelwelsh · on March 22, 2020

The problem is programmers crafting the language to their own mental model. One solution is languages that don't allow much abstraction. The other solution is to craft a shared mental model. This latter approach is the more powerful in my opinion.

Taking the example of adding maps, you (where "you" means the language designers or language community) just need to define what `+` means for primitive types and the composition follows naturally. What does adding maps mean then? It means adding together the primitive types found under each key. If everyone agrees 1) there is a `+` operation and 2) what it means for primitives, there isn't much room for confusion. At some point you might want to give this concept a name and then you have basically reinvented monoids.

This is one of the differences between my experience in untyped functional languages (primarily Scheme) and inexpressive languages (primarily Java) and typed functional languages (primarily Scala). In the former each abstraction was a perfect snowflake crafted to the specific situation. In the latter we just reuse existing abstractions. The problem with the former is you have to learn the meaning and idiosyncrasies of each new abstraction. In the latter you can leverage your existing knowledge in new domains.

hinkley · on March 22, 2020

I’ve heard one of the early Agile luminaries say that if the terminology in your code is different than the terminology of your domain that it’s a code/design smell. Some day, if not already, that impedance mismatch will bite you in the ass.

Architectural astronauts are fond of their own terminology. It’s nuts. Especially if it’s from someone like my most recent perpetrator who tries to use big words all the time and either gets them wrong or uses obscure meanings that nobody else ever uses conversationally. You may be smart pal, but you’re the biggest idiot I know.

At least the last astronaut was personable and had a hint of humility.

Terr_ · on March 22, 2020

> if the terminology in your code is different than the terminology of your domain that it’s a code/design smell.

Hence the idea/label of Domain Driven Design [0].

IMO like unit-testing it's valuable but you can't hope for 100%. There will always be some concepts that are unique to how the data is being packaged or calculated which aren't part of the business-domain, but hopefully they'll be safely locked down beneath domain-centered abstractions.

[0] https://en.wikipedia.org/wiki/Domain-driven_design

pron · on March 22, 2020

> One solution is languages that don't allow much abstraction.

I think we need to be careful here with what you mean by "abstraction." One kind, which we can call "syntactic abstraction" means the following: given a specific language, what portion of its syntactic patterns can be factored into a new construct that "abstracts" all of its instantiations. Languages with macros score very highly here. Another kind, which has been more studied mathematically, is what we can call "algorithmic abstraction" or "behavioral abstraction", which means that given a certain specification of a desirable observable behavior, what portion of the implementations of that behavior can be exposed with the same syntactic construct (the same API). On this front, Haskell has worse abstraction than Java (e.g. if you want to change the implementation of a map to support big data with disk writes, you have to change the syntactic construct), and a language like Rust has terrible algorithmic abstraction, as even different implementation details like memory lifetimes require different constructs. Every additional piece of implementation detail that is exposed in the syntax level (e.g. in the type) hurts abstraction. I don't think that proponents of exposing as much as possible in the type think that's a bad thing, though.

Both of these kinds of abstraction allow you to "leverage existing knowledge in new domains", and both have a certain "mental" cost. I don't think it is possible to relate "abstraction" in general to any bottom-line result, especially as no language choice seems to show any big impact one way or another, but mostly cater to different personal aesthetic preferences.

chrisweekly · on March 22, 2020

I can't parse your last paragraph, in which "Scheme and Java and Scala" (3 things) are compared as "former vs latter" (2 things).

noelwelsh · on March 22, 2020

My bad, I missed a comma.

The former is Scheme and Java. The latter is Scala.

marmaduke · on March 22, 2020

> craft a shared mental model. This latter approach is the more powerful

Isn't the crux of why programming is a technical activity second and a social one first?

AzzieElbab · on March 22, 2020

Wrt to code sample in the original article, are you saying live would be so much easier if everyone knew what monoids are?

noelwelsh · on March 22, 2020

Saying life would be "so much easier" seems like a strong statement, but I expect programming would be. :-)

When I read that example the semantics were obvious to me: adding a map (applying a monoid operation on a map) should add the elements (apply the monoid operation for the elements).

oddthink · on March 22, 2020

But one of the examples of adding maps would be for it to mean "upsert", rather than "use the monoid for the value types". I think that's still a perfectly good monoid over maps (it's not commutative, but that's not a requirement).

And yes, you can argue that using "+" implies commutative, and I can see that, but I think it's still a plausible meaning in this context.

(I'm probably rabbit-holing here, but it's interesting.)

noelwelsh · on March 22, 2020

How you handle this gets into the details of the programming language used for implementation.

The usual implementation technique is type classes, and the issue of having multiple type classes (implementations of, say, monoid) for a given type is known as "type class coherence".

In Scala you could just pass a type class instance explicitly. If Haskell you'd have to change the type.

jart · on March 23, 2020

Notation is the word you want. For example, we can design our language to say ht[x]=y or HashInsert(hTable,key,value), but at the end of the day they both abstract computer science toil don't want to think about, e.g. should it cuckoo probe pascal triangle bucket tree big theta under the hood yadda.

My opinion is that terse notation only gains consensus when it's divinely inspired. That's the word we've traditionally used to describe the np-complete social phenemonon that happens every so often, where some guy's arbitrary mental model manages to allure a lasting self-evident appeal, e.g. LISP, C, and Python. So in some ways, we can think of guys like John McCarthy as modern programming prophets.

One interesting thing I noticed, reading John McCarthy's original paper, is he was actually trying his darndest to not introduce a new notation. He wanted LISP code to look normal, since it's usually not a good thing to come across to other smart people as claiming divine inspiration (see Dunning Kruger).

I disagree with the conventional wisdom though. If you'd want to take the more powerful approach of crafting a shared mental model, my advice would be one of encouragement. You can absolutely be the next John McCarthy. Folks only use wordy notation like HashInsert(hTable,key,value) because they want to go the safer route of being a small successful part of a large corporate machine.

jb1991 · on March 22, 2020

This article also supports why the Go language has been so widely adopted and successful, despite its young age. The language can be rather limiting and somewhat disappointing for an individual developer, but offers great benefits to the agility of a team.

wyager · on March 22, 2020

> why the Go language has been so widely adopted and successful

That's because it has tons of funding and support from Google, not because people did experiments and found that it worked better.

msla · on March 22, 2020

If marketing was all it took, .Net would have destroyed Java back when Sun was dying and Microsoft was still the monopolistic gorilla of the corporate universe.

AzzieElbab · on March 22, 2020

Well, this logic should also apply to Dart which is not seeing same level of adoption as Go.

Disclaimer - I do not use Go.

amboo7 · on March 22, 2020

How about this? http://www.linusakesson.net/programming/kernighans-lever/ You can try with a smaller team and make everyone improve rather than complain.

karmakaze · on March 22, 2020

I'd read the 'not clever enough to debug' bit before but not this description. Like the diagram on 'flow state' though it usually a combination of lack of time as well as proficiency that leads to frustration.

drewcoo · on March 22, 2020

So language/tooling choice should be a sort of a Harrison Bergeron* affair, to make sure that the least able to program will still be able? I'm not so sure I agree with that. Nor do I agree with its obvious counter-proposal, meritocracy.

I have heard of teams using "communication" to bridge these gaps but I know little of that technology.

* https://en.wikipedia.org/wiki/Harrison_Bergeron

erik_seaberg · on March 22, 2020

You can't let anyone work at the top of the learning curve (where the payoff is, where professionals belong) unless the whole team is capable of getting there. This is part of what makes false positive signals in hiring so damaging.

throwaway287391 · on March 22, 2020

> I didn’t use [vim] in 2017. That’s because my employer started doing more pairing, and nobody could pair with me. It was bad enough for the Atom users, but even the other vimmers couldn’t pair with me. They’d press something expecting the vanilla vim action and get something completely unexpected. It’d drive them crazy.

I've never done pair programming (other than one project when I was in school and it was required -- bit of a nightmare IMO), but I thought the way it typically worked was you switch off who "drives" (i.e. actually types the code) in long intervals. So why couldn't each coder just use their own preferred programming environment? Do they not each have their own machines?

modernerd · on March 22, 2020

That's how it traditionally works, but some see this as a limitation. For example, the pitch for Live Share for VS Code says:

> Each of you can open files, navigate, edit code, highlight, or refactor - and changes are instantly reflected. As you edit you can see your teammate’s cursor, jump to the location of your teammate’s carat, or follow their actions.

https://code.visualstudio.com/blogs/2017/11/15/live-share

I've found that letting both participants inspect and edit code together collaboratively is useful when one participant is acting in more of a teaching role. When the student gets stuck, you can jump in and demo right in the editor instead of dictating code or ideas. It's useful if it's not overused, like the second brake on the instructor's side in a driving lesson.

But if both participants are of equal ability, it just becomes annoying to have dual controls in my experience. Hell is watching other people use computers, but jumping in to effectively grab your colleague's keyboard and mouse and say, “look, I'll do it” can be irritating.

The other main advantage is that, because you're streaming actions and not video, you get a high definition view at all times instead of an image that sometimes breaks up.

I think the OP is possibly talking about pairing in the same physical location on the same machine, though, where custom vim setups can definitely get in the way if you're swapping driver every hour or two.

carapace · on March 22, 2020

> “powerful languages don’t scale”.

The "scale" he's talking about doesn't seem to be "of traffic" or "of data" but rather "of programmers required to work on it".

> it’s hard for other people to work with you. They don’t share your mental model, and they don’t come in with all your initial assumptions. This is somewhat addressable if you all start working on the project together but falls apart when people join on later. The expressivity doesn’t scale.

So that's a failure of documentation, eh?

hinkley · on March 22, 2020

No, it’s often a failure of architecture or design.

People who build systems that only make sense to them are never going to provide good documentation, no matter how often you ask for it. And it’s always describing something that is already ‘done’, so there is very little value as a feedback loop. You just stop asking when you either realize this is all you’re gonna get or figure out how batshit what they’ve described is.

Sunlight is the best disinfectant, they say. Ex post facto documentation provides almost none.

fjfaase · on March 22, 2020

Articles like this are missing an important point. Namely, that generic data structures in programming languages are used to represent rich semantic data models. This is done by a process that I would call 'implementing'. It is a mapping of a certain semantic model to, often a rather limited, set of primitives. Some languages, are completely optimized for one type of data structure. Take for example the relational database, in which everything is a relation.

During the implementation process some knowledge is lost. What the author is describing is a kind of merge operation. If you would know the semantic meaning of the data that is being represented, it is rather obvious what choice should be made. When a number represent a quantity, it is obvious they should be added, but when a number is a kind of identifier, like an article number, it is obvious that they should not be added. And yes, maybe in that case the number should have been better represented as a string. In the process of implementing it is often smart to represent it like a number, because it is much easier to deal with a 64-bit number than a variable length string.

noelwelsh · on March 22, 2020

Your comment in a good example of the very different world views that exist within programming. I'm not trying to bash on your comment, just illustrate the difference and show how this makes communication trickier than we sometimes acknowledge.

In a modern typed language you wouldn't represent an ID as an integer or a string, you'd represent it is an ID. Then define addition as whatever makes sense for IDs. This can mean not defining the operation at all, in which case you can't compile code that tries to add IDs. This is the viewpoint that statically typed FP people take, and the viewpoint that languages like Rust take.

Your view is more inline with untyped languages or languages with weak type systems like Java or Go. (Arguably Java is changing as it adopts more modern features but I don't think the culture is changing as rapidly as the language.)

fjfaase · on March 22, 2020

I thought I was trying to communicate just the opposite, that during implementation all kind of semantic information is lost. One could define a type as a semantic property, which goes much further than specifying it range of valid values. Think for example about the unit of a double value. I have not seen programming languages where units are an essential part of the language. Several billion dollar disasters could have been prevented if such a language had been used. And this is apart from where a language is statically or dynamically typed. Types can also be viewed as annotations that can be used to verify the correctness of an executable specification (program). One could had a type stating that a function terminates with less than C.n^2 operations of a certain type, where n is one of the parameters of the function. These are often things that are reasoned with at the semantic domain, but are lost when the actual implementation is made.

I realize that this requires a different mode of thinking about software engineering than is common among software engineers. I also wonder whether most software engineers could be called engineers, because it looks like it is more a craftsmanship than engineering, like mechanical engineering where they are able to accurately predict the properties of a certain object before it is actually being made with the help of standard methods, such as finite element analyses.

I have to admit that we are working in a far more complex domain, but it seems to me that not much progress is made in the field of software engineering. Probably also because technologies are moving too fast to build a solid foundation and that the demand for solutions is too high.

I am surprised how few software engineer understand that main problem boils down to the fact that computers are too slow for our demands. See: https://www.iwriteiam.nl/AoP.html

msla · on March 22, 2020

Very few use untyped languages anymore, and only to the extent people program in assembly language or modify Deep Legacy stuff in Bliss. It's possible to lose information in a dynamically-typed language, like Python, but that's a matter of not using the language facilities, such as objects, and if you go down that route, you can write stringly-typed Haskell.

noelwelsh · on March 22, 2020

My mistake. I was using "untyped" in the sense it is used in programming language theory, which is the same as the term "dynamically typed" in colloquial programmer speak.

msla · on March 22, 2020

And I'm pointing out how useless of a definition that is, because it renders us unable to distinguish between some languages which do have and use type information and other languages which do not. If a language has enough type information to automatically convert a number into a string, it isn't the same as one where everything is simply a machine word.

Plus, my comment about stringly-typed Haskell demonstrates that it's a property of programs more than languages anyway.

harperlee · on March 22, 2020

This is an interesting hypothesis, and I've also heard something similar said of clojure: it makes lisp just a little bit less opaque (by way of macros not being so popular, introducing {} and [] as a little bit of extra syntax, emphasizing a very targeted philosophy on several aspects, etc.

trabant00 · on March 22, 2020

Just taking on the vim custom keys thing: that was a problem of standardisation. That is why it did not scale. Nothing to do with mental models.

mjfisher · on March 22, 2020

I think that was more or less still the point - more expressive power (including custom key bindings) allows you to build more complicated abstractions with less obvious behaviours. It's more difficult to maintain a mental model of those less obvious behaviours.