The situation is clear. There is a great risk to the livelihoods and bargaining power of workers everywhere, driven by a race dynamic that is accelerating. In tech we can see this earlier than others because we are close to the technology at the heart of it.
This is quickly becoming one of the largest threats to the public in history, and the concentration of power along this trajectory threatens democracy. Irreversible shifts in the structure of power are on the table.
> Literally the 3rd or 4th thing you learn about ML is that for any given problem, there is an ideal model size.
From my understanding this is now outdated. The deep double descent research showed that although performance drops as you increase model size past a certain point, if you keep increasing it there is another threshold where it paradoxically starts improving again. From that point onwards, increasing the parameter count only further improves performance.
That isn't what that research says at all. What that research says is that running the same training data through the model multiple times improves training. There is still an ideal model size, though; it is just affected by the total volume of training data.
https://arxiv.org/pdf/1912.02292
"We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better."
That is the first sentence of the abstract. The first graph shown in the paper backs it up.
Looking into it further, it seems that typical LLMs are in the first descent regime anyway, so my original point is not that relevant for them. Also, the second descent doesn't always reach a lower loss than the first; it appears to depend on other factors as well.
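For anyone who wants to see the shape of the curve without reading the paper, here is a toy sketch (my own construction, not the paper's setup): minimum-norm least squares on random Fourier features, where test error typically spikes as the number of features approaches the number of training samples and then falls again past that point.

```python
# Toy illustration of the double-descent shape (not the paper's exact setup):
# minimum-norm least squares on random Fourier features. Test error typically
# rises as the feature count approaches the sample count (the interpolation
# threshold) and falls again past it.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, noise=0.1):
    x = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(2 * np.pi * x[:, 0]) + noise * rng.standard_normal(n)
    return x, y

def featurize(x, n_features, freqs, phases):
    # Random Fourier features: cos(w * x + b)
    return np.cos(x @ freqs[None, :n_features] + phases[:n_features])

x_train, y_train = make_data(40)
x_test, y_test = make_data(500)

max_features = 200
freqs = rng.normal(0, 5, size=max_features)
phases = rng.uniform(0, 2 * np.pi, size=max_features)

for p in [5, 10, 20, 35, 40, 45, 60, 100, 200]:
    Phi_train = featurize(x_train, p, freqs, phases)
    Phi_test = featurize(x_test, p, freqs, phases)
    # lstsq returns the minimum-norm solution when p > n (overparameterized)
    w, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
    test_mse = np.mean((Phi_test @ w - y_test) ** 2)
    print(f"{p:4d} features  test MSE {test_mse:.3f}")
```

Again, this is only meant to show the qualitative U-then-down shape; the paper's experiments are on real deep networks.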
I can somewhat understand people developing AGI, but directly working on superintelligence is on extremely shaky ethical ground. A good proportion of AI researchers and philosophers believe superintelligence stands a significant chance of displacing humanity, and it is widely regarded as one of the most, if not the most, dangerous technologies yet to be created.
> OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …)
As far as I can tell o1 was based on Q-star, which may well be Quiet-STaR, a CoT RL technique developed at Stanford that OpenAI may have learned about before it got published. Presumably that's why they never used the Q-Star name even though it had garnered mystique and would have been good for building hype. This is just speculation, but since OpenAI haven't published their technique, we can't know whether it really was their innovation.
The author's example for x^2 + x could be written with the first two symbols swapped; written that way it looks fine to me. Putting the 2 first here is like putting the x first in "2x" so that it becomes "x2". Also, if the lines above and below had curved ends so you could see clearly where they start and end, this might not be such bad notation.
>When a person "has the concept `7`" they can reason thus: "6 eggs would be fewer", "7 is an odd number", "7 is a whole quantity", "with 7 less of 10, i'd have 3" etc.
I just fed this into GPT-3. Its responses are in italics; this is the first try, with no rewriting or retries:
This is a test of understanding of the concept of "the number 7".
Question: Is 7 odd?
Answer: *Yes, 7 is an odd number.*
Question: Is 6 eggs fewer than 7 eggs?
Answer: *Yes, 6 eggs is fewer than 7 eggs.*
Question: With 7 less of 10, what would I have?
Answer: *I would have 3.*
Question: Is 7 a whole quantity?
Answer: *Yes, 7 is a whole quantity.*
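For anyone who wants to try the same test, here is a rough sketch using the openai Python client's completions endpoint as it existed for GPT-3; the engine name and parameters are my assumptions, not necessarily what I used above.

```python
# Hedged sketch: querying GPT-3 with the (pre-1.0) openai Python client.
# The engine name and sampling settings are guesses, shown for illustration.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = """This is a test of understanding of the concept of "the number 7".

Question: Is 7 odd?
Answer:"""

response = openai.Completion.create(
    engine="davinci",      # base GPT-3 engine; exact choice is an assumption
    prompt=prompt,
    max_tokens=32,
    temperature=0,         # keep it deterministic-ish, no cherry-picking
    stop=["Question:"],    # stop before the model invents the next question
)
print(response["choices"][0]["text"].strip())
```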
This is mostly a joke, because I think I understand where you are coming from (you are hypothesising that GPT-3's responses are an elaborate trick of sorts). But I don't believe AI has to take the same route as human intelligence. I also don't think we really understand what a concept is or how it behaves from a signal/data perspective, but that may be inconsequential for creating general AI.
Also, people can be really stupid sometimes and have their own failures, and the concepts people hold can be incorrect or flawed. So it may be useful to compare human failures with AI failures, rather than just AI failures with human successes.
I think the failures of people spouting hype and failing to deliver in ML have absolutely nothing to do with the real and immense progress happening in the field concurrently. I don't understand how one can look at GPT-3, DALL-E 2, AlphaGo, AlphaFold, etc. and think: hmm, this is evidence of an AI winter. A balanced reading of the season imo suggests that we are in the brightest AI summer and there is no sign of even autumn coming. At least on the research side of things.
The difference between the two views could be summarized in a textbook intro from twenty years ago: here is a list of problems that are not (now) AI. Back then it would have included chess, checkers and other games that were researched for their potential to lead to AI. In the end they all fell to specific methods that did not provide general progress. While the current progress on image related problems is great, if it does not lead to general advances then an AI winter will follow.
I disagree. If we find that one architecture is good for chess and another for image generation, then so be it; we would still have solved important problems. We are seeing both general and specific approaches improving rapidly. I don't think the AI winter was defined by a failure to reach AGI, but rather by reaching a plateau and producing nothing of great commercial or even intellectual value for some years, while other computer science fields thrived. I would say the situation right now is the exact opposite.
> Back then it would have included chess, checkers and other games that were researched for their potential to lead to AI.
20 years ago (2002), Deep Blue having beaten reigning world chess champion Kasparov was already old news.
Unsolved problems were things like unconstrained speech-to-text, image understanding, open question answering on text etc. Playing video games wasn't a problem that was even being considered.
I was working in an adjacent field at the time, and at that point it was unclear if any of these would ever be solved.
> In the end they all fell to specific methods that did not provide general progress.
In the end they all fell to deep neural networks, with basically all progress being made since the 2012 ImageNet revolution, which showed it was possible to train deep networks on GPUs.
Now, all these things are possible with the same NN architecture (Transformers), and in a few cases they are done in the same NN (e.g. DALL-E 2 understands both images and text; it's possible to extract parts of the trained NN and get human-level performance on both image and text understanding tasks).
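As a concrete illustration of reusing parts of a jointly trained image/text model, here is a sketch of zero-shot classification with CLIP via the Hugging Face transformers library. CLIP is related to DALL-E 2's internals but this is my own example, not DALL-E 2's actual code; the image path and labels are placeholders.

```python
# Hedged sketch: zero-shot image classification by reusing the image and text
# encoders of a jointly trained model (CLIP), via Hugging Face transformers.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # placeholder path
labels = ["a photo of a cat", "a photo of a dog", "a photo of a pikachu"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Similarity between the image embedding and each text embedding
probs = outputs.logits_per_image.softmax(dim=-1)[0]
for label, p in zip(labels, probs):
    print(f"{label}: {p.item():.2f}")
```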
> While the current progress on image related problems is great, if it does not lead to general advances then an AI winter will follow.
"current progress on image related problems is great" - it's much more broad than that.
"if it does not lead to general advances" - it has.
A very telling example, since we now have methods like Player of Games which apply a single general method to solve chess, checkers, ALE, DMLab-30, poker, Scotland Yard... And the diffusion models behind DALL-E apply to generative modeling of pretty much everything, whether audio or text or image or multimodal.
There was an abomination of a live-action Pikachu movie some time ago. When I google "realistic pikachu" I get images from the movie exactly like this, but not gross.
In fact this photo is exactly what you get when you photoshop the face of an ugly chihuahua onto a Pikachu plushie head and add a yellow brushed hamster body. And a cape. Literally that is what you're looking at.
It understood your prompt and amalgamated the right source photos into this nightmare fuel. Jesus wept.
Yeah, it's still impressive to be able to imitate those styles and add a blue cape that didn't exist in the movies, along with chihuahua eyes. It also appears to be higher definition than Detective Pikachu CG. I'm curious if you could do the same for all 150 original Pokemon, even those for which realistic CG representations don't exist. Would it be able to take the cartoon version of Farfetch'd or Psyduck or a more obscure one and achieve the same realism, without the reference from the deep dataset?
Well, to my eye it's realism beyond anything I could find. Mind you, I didn't search for that long, so there might be something out there if I were to dig deeper.
I am pretty familiar with photoshop, and while I'm not an expert, I would find making something like this really difficult. Anything is possible with photoshop, but some things are very hard.
> In fact this photo is exactly what you get when you photoshop the face of an ugly chihuahua onto a Pikachu plushie head and add a yellow brushed hamster body. And a cape. Literally that is what you're looking at.
I guess some people are overhyping it, but it's cool that this can do that. Previously, it took a trained human.
If this is the exact image you wanted and you are entirely satisfied with it, great. But what people are reacting to is that it is outputting interesting images at all.
What are you going to do with this cape wearing realistic Pikachu that is actually a picture of a hamster?
Typically the trained human has something specific in mind. And if the client isn't satisfied they will torture them with countless requests for adjustments. So right now this is of limited use.
To me what is far, far, far more interesting is that DALL-E possibly understands the concept of what a Pikachu is supposed to be. That is downright creepy, and fascinating. I suspect that, once people get over the clip-art generation, this visual side of things might find more functional utility as a way to see through the "model's eyes", so to speak: to visualize the model itself. That could unlock a lot of doors in how training is done.
Maybe in the future you could train it on textbooks and prompt it for a picture of a molecule. Now that would be something. Especially if you start feeding it data from experiments.
> Typically the trained human has something specific in mind. And if the client isn't satisfied they will torture them with countless requests for adjustments. So right now this is of limited use.
Confused as to why you think you cannot do this with DALL•E?
Human artists also do a whole lot of mimicry. One could look at art produced by many artists and say that it is just things stitched together from pre-existing art.
For example the “enterprise vector people” graphics you see on every corporate website. Most human art is extremely repetitive.
AI art seems to be coming from the opposite direction to human artists - from a starting position of maximum creativity and weirdness (e.g. early AI art such as Deep Dream looked like an acid trip) and advancements in the field come from toning it down to be less weird but more recognizable as the human concept of “art”.
And DALL-E is impressive exactly because it has traded some of that creativity/weirdness away. But it’s still pretty damn weird.
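For what it's worth, the "acid trip" look falls straight out of how Deep Dream works. Here is a minimal toy version of the idea (my own sketch, not the original implementation): gradient ascent on an input image to amplify whatever one layer of a pretrained CNN responds to; the file paths and the choice of layer are arbitrary.

```python
# Hedged Deep-Dream-style sketch: boost one layer's activations by gradient
# ascent on the pixels of an input image, using a pretrained VGG16.
# (ImageNet normalization is skipped for brevity.)
import torch
from torchvision import models, transforms
from PIL import Image

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()

img = Image.open("input.jpg").convert("RGB")  # placeholder path
x = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
])(img).unsqueeze(0).requires_grad_(True)

layer = 20  # which VGG feature layer to amplify; arbitrary choice
for _ in range(30):
    activ = x
    for i, module in enumerate(model):
        activ = module(activ)
        if i == layer:
            break
    loss = activ.norm()  # "dream" objective: make this layer fire harder
    loss.backward()
    with torch.no_grad():
        x += 0.02 * x.grad / (x.grad.abs().mean() + 1e-8)
        x.grad.zero_()

transforms.ToPILImage()(x.detach().squeeze(0).clamp(0, 1)).save("dream.jpg")
```

Because the objective is "maximize whatever features this layer detects", the network hallucinates eyes, fur and textures everywhere, which is exactly the weirdness that later systems like DALL-E have toned down.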