> With superior efficiency in terms of latency and size, MobileDiffusion has the potential to be a very friendly option for mobile deployments given its capability to enable a rapid image generation experience while typing text prompts. And we will ensure any application of this technology will be in-line with Google’s responsible AI practices.
So I'm interpreting this that it won't ever get released.
I'm interpreting it as they will be adding a layer of safety restrictions. Understandable given the furore over the recent AI-generated Taylor Swift images.
Everyone needs to do this and probably is already doing this. Search for "ChatGPT lobotomized" and you'll see plenty of complaints about the safety filters added by OpenAI.
I'm much more comfortable with the idea of AI watermarking images it creates instead of refusing to create images because of "safety", which in practice more often means not wanting to offend anyone. Imagine if word processors like Google Docs refused to write things you wanted to write because of mature themes. The important thing, in my opinion, is to make it a lot more difficult to pass off AI generated content as being authentic and to make provenance traceable if you were to do something like create revenge porn with AI, but not to make AI refuse to create explicit material at all.
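For the curious, the crudest possible version of the watermarking idea looks something like the sketch below. This is a naive least-significant-bit scheme for illustration only; the schemes vendors actually deploy (e.g. Google's SynthID) are far more robust to cropping, compression, and re-encoding. The function names and the pixel representation here are purely illustrative.

```python
# Naive LSB watermark sketch: embed a provenance tag into the blue
# channel of an image represented as a flat list of (r, g, b) tuples.
# Illustrative only - real watermarks survive re-encoding; this one
# would not.

def embed(pixels, tag_bits):
    """Overwrite the LSB of the blue channel with tag bits."""
    out = []
    for (r, g, b), bit in zip(pixels, tag_bits):
        out.append((r, g, (b & ~1) | bit))
    out.extend(pixels[len(tag_bits):])  # remaining pixels unchanged
    return out

def extract(pixels, n):
    """Read back the first n embedded bits."""
    return [b & 1 for (_, _, b) in pixels[:n]]

img = [(10, 20, 30), (40, 50, 61), (70, 80, 90), (1, 2, 3)]
tag = [1, 0, 1]
assert extract(embed(img, tag), len(tag)) == tag
```

The point is that the tag is invisible to a viewer (each blue value changes by at most 1) but mechanically recoverable, which is the property a provenance system needs.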
It being authentic or not isn't actually important in a lot of cases though. Consider someone like Mia Janin, who recently took her own life after being harassed using deepfakes. Everyone understood that the images weren't "authentic", but their power to cause distress was very real.
Ease of use and accessibility. Think of how we control access to guns even though a baseball bat could also be used to kill or maim someone.
I agree that we should legislate against the aggressors, that's why I'm pointing out the limitations of technical solutions like watermarks. We need extensions to things like revenge pornography laws, if we're talking about legislation, and I don't see any harm in outlawing services that automate the creation of deepfakes.
Of course the only "solution" is that we would universally get behind teaching young boys that they are not entitled to women's bodies or their sexuality, but so many grown men apparently disagree that I can't see it happening quickly enough.
I'm one of the grown men who disagree. I don't think treating half the population as pre-criminals, when in reality it's an extremely tiny minority who act in this way, is a particularly good solution. If we were to apply this kind of "solution" to all undesirable behaviours exhibited by deviant minorities of both men and women I doubt there'd be any time for any actual K-12 formal education.
I suggest we clearly specify the class of problem we're trying to solve and come up with a principled solution that would make sense when applied consistently and universally to all problems in that class. I prefer this over coming up with knee-jerk moral panic patches (e.g. "censor generative models so they can't generate this particular thing I find distasteful") or with overly abstract and tangential problem-solutions (e.g. "just teach men not to be big jerks").
I think the central issue here is: what restrictions, if any, should be placed around creating and distributing a likeness of another person? Are we just looking to prohibit pornographic likenesses, or do you think the restrictions should be broader? What's the threshold rule you would apply? Should these rules be medium-specific, or should we also prohibit people from, say, painting a likeness of another person without their consent?
I guess in a US context you'd also have to consider whether it's constitutional to restrict freedom of expression, even the distasteful ones, in this manner.
Edit: Just saw your edit suggesting that I think "men are entitled to women's bodies" (whatever that means). I think I'll end my participation here, not interested in having a bad faith discussion.
I'm not in the US but I understand that there are already laws which limit your expression, such as in cases of CSAM or revenge pornography, which might be the closest analogue.
Personally, the limits I'd draw are similar to that, as I'm most interested in fighting sexual harassment. Legislation against revenge pornography already faces and tackles the questions of what constitutes pornography and when it becomes illegal to disseminate pornographic images of others, so it's not an intractable problem.
Indeed, we also have precedents for limiting the use of tools for certain purposes. Using deepfake technology to generate images akin to CSAM would already be illegal in the UK, but other broader and everyday examples exist like speed limits for cars.
Edit to respond to yours: I said above that we should teach boys they're not entitled to women's sexuality, but that many men disagree. You said you were one of them. I meant that the disagreement was with the entitlement itself, but I now see you may have taken it as disagreement with teaching about that entitlement. It was a misunderstanding, and I was responding in good faith. I didn't suggest anything about you; I asked whether my interpretation of your response was correct.
Fair enough on your edit, I accept it was a misinterpretation and appreciate the clarification.
The precedents you raise are worth considering. They're related but not completely analogous to deepfake porn of real people in my view. CSAM is criminalised due to the direct harm its production inflicts on minors and the deep injury to society that follows. Deepfake CSAM, I presume, has more of an 'obscenity' rationale as there is no actual direct harm inflicted on minors in that case. I suppose you could have a similar obscenity rationale for criminalising deepfake porn but you would then have to accept that pornography in general should be outlawed. An obscenity rationale would also be more supportive of criminal sanctions, as acts of obscenity injure society in addition to individual subjects.
I think revenge pornography is the best analogy here. I assume the policy rationale / theory of harm for criminalising 'revenge porn' (i.e. distributing true intimate private images of another person) is one of two things: (1) violation of the subject's privacy or (2) infliction of psychological harm on the subject. If the policy rationale is (1) then I don't think there's a sound analogy to deepfake porn - deepfakes are fictional and so do not violate the privacy of the subject.
If the rationale is (2), psychological harm, then I could see a similar policy rationale for legislating against deepfake porn. But if psychological harm is your policy rationale then wouldn't it make more sense to directly criminalise the infliction of psychological harm on others regardless of the method used? If we were regulating on a principled and universal basis we should pass a law that criminalises any act, publication or utterance that inflicts psychological harm on another person, rather than using the law to solve single instances of this class of offences. Although I'd strongly disagree with such a law due to the chilling effect it would have on all forms of speech, expression and public commentary I think there's at least a principled argument to be had.
But if you legislate on this principle then you have to grapple with the far reaching implications of such a law - if someone writes some smutty (but fictional) erotic story about me that I find psychologically distressing should they then be thrown in jail? What if they say hurtful things to me that I find psychologically harmful? What if they insult a religion or political candidate, party or ideology that I strongly identify with? We all inflict psychological harm on others from time to time - what should the minimum harm threshold be?
Personally, I don't think the criminal law is the answer in either the deepfake or revenge porn cases if the rationale is 'psychological harm'. Although I'm not sure where I stand on the following, I think a civil tort for infliction of psychological harm would be the sanest option if we feel the need to regulate against infliction of psychological harm. It would be analogous to defamation and libel torts, but instead of having to prove economic harm the plaintiff would have to prove some minimum threshold level of psychological harm to become entitled to compensation from the defendant in proportion to the actual provable injury sustained.
My thoughts aside, what is your general theory of harm / principled policy rationale here and, on that basis, what do you think the state's response should be to regulate?
Good information. Then the solution would be to improve harassment legislation rather than limiting the availability of tools. Just as assault is illegal but we don’t require all hammers to be made out of foam.
The number of people who committed suicide after being harassed with memes or emoji must be higher than those harassed with deep fakes. Too bad nobody is interested enough in banning emoji to do a study.
Why stop there? I think we'd all agree that "mean words", either written or spoken, have immense power to "cause distress" and have driven many a person to suicide. We should ban those.
We do. Incitement to violence or "true threats" for example already fall outside of 1st amendment protections. I personally see deepfakes created or disseminated for harassment purposes as an act of violence.
I'm neither referring to "true threats" nor making any kind of argument about the US constitution, so I'm unsure why you're bringing these up. I thought it would have been pretty clear that in the context of driving people to suicide, I was suggesting banning "insulting words". Hope that clarifies things for you.
It seemed like you were making a sarcastic comment about the policing of harmful communications. I mostly hear such arguments from US citizens so wanted to point out that even the US has precedent for limiting such expressions to deter violence.
Can you link to an article where emojis have been pivotal in a harassment case leading to suicide?
They've been around for 30 years, deepfakes as they are today have been around for less than a year. I'm not sure absolute numbers are the best thing to look at either way.
The emoji and meme industry has been hiding these stats for years. Facebook, X, TikTok, Apple, etc have the data you want but they won't give it to you.
We've already seen this progression - they debuted Magic Eraser as a cloud feature, then with the Pixel 8 they got it running locally on the device. But they also introduced Magic Editor with the Pixel 8, running on the cloud, and the Pixel 9 or 10 will probably run it on-device.
1. they made a lot of careful tweaks to the UNet architecture - it seems like they ran many different ablations here ("In total, our endeavor consumes approximately 512 TPUs spanning 30 days").
2. the model distillation is based on previous UFOGen work from the same team https://arxiv.org/abs/2311.09257 (hence the UFO graphic in the diffusion-gan diagram)
3. they train their own 8-channel latent encoder / decoder ("VAE") from scratch (similar to Meta's Emu paper) instead of using the SD VAEs like many other papers do
4. they use an internal dataset of 150m image/text pairs (roughly the size of laion-highres)
5. they also reran SD training from scratch on this dataset to get their baseline performance
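For some intuition on point 3, here's a back-of-envelope sketch of why the latent channel count matters, assuming the standard 8x spatial downsampling used by latent-diffusion VAEs (the 4-channel figure for the SD VAE and the 8-channel one for MobileDiffusion/Emu come from the respective papers; the helper function is mine):

```python
import math

# Latent-shape arithmetic for a latent-diffusion VAE with the usual
# 8x spatial downsampling. SD's VAE encodes to 4 latent channels;
# MobileDiffusion (like Meta's Emu) trains an 8-channel one.

def latent_shape(height, width, channels, downsample=8):
    """Shape (C, H/downsample, W/downsample) a VAE encoder would produce."""
    return (channels, height // downsample, width // downsample)

sd_latent = latent_shape(512, 512, channels=4)  # (4, 64, 64)
md_latent = latent_shape(512, 512, channels=8)  # (8, 64, 64)

# Doubling the channels doubles latent capacity at the same spatial
# resolution, letting the decoder recover finer detail per location.
print(math.prod(sd_latent), math.prod(md_latent))  # 16384 32768
```

The trade-off is that the diffusion UNet then has to model an 8-channel input, but for a mobile model that's cheap compared to the win of keeping the spatial grid small.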
Kind of funny that they show the iPhone 15 Pro and the Samsung S24 in the comparison chart, but not their own phone, the Google Pixel 8. (I know it would perform worse than both phones.)
Pixel marketing touts the NPU chip, which sounds ideal for accelerating a model like this. They might have reasons for leaving it out. Perhaps it's planned to ship with a new model - announcing a new phone in a model paper would be weird. I remember them advertising some Pixel-only photo editing features in the past.
It's great in theory until you have to pierce through several organizational boundaries to actually leverage it. Too sclerotic for it to happen regularly at scale.
Not entirely true either. If it thinks it has a network connection but it's flaky, it won't fall back to offline translation; it will report a network error and give you a retry button, with no option to translate offline.
Additionally, in airplane mode it's very reluctant to translate; in my case I have to go to saved translations, as otherwise it won't even let me type what I need to translate.
I just tried airplane mode on my pixel 7 pro and it seemed to be able to translate from the camera without problems
It doesn't seem to do it "live" in the preview without network access, though.
And the translation app seemed to get into a bad state: it said I needed to download the language pack even though the language list showed it was already installed, and it couldn't download the pack until I cleared the app's data. Though I hadn't opened it since transferring it from my old phone, so if there's some phone-specific state involved, that might have got messed up.
I don't see it as a disadvantage, since Google markets services on both devices you mentioned. Hardly anyone will abandon their iPhone in favor of a Pixel just for a Google service.
So I think what Google did marketing-wise is fine.
Google has fallen so far. Both Inception and MobileNet were released openly and changed the entire AI world.
Nowadays we just get blog posts about results that were supposedly achieved, an accompanying paper that can’t be reproduced (because of Google’s magical “private datasets”), and some screencaps of a cool application of the tech that is virtually guaranteed to never make it to product.
It did work, I used it myself. A quick search shows others who had my experience. This was late 2019 for me. Here's the first link and the Google post on rolling out in summer 2019:
"Now, you can use it on all Pixel phones in 43 U.S. states.
All it takes is a few seconds to tell your Assistant where you'd like to go. Just ask the Assistant on your phone, “Book a table for four people at [restaurant name] tomorrow night.” The Assistant will then call the restaurant to see if it can accommodate your request. Once your reservation is successfully made, you’ll receive a notification on your phone, an email update and a calendar invite so you don’t forget."
Are people at Google Research not embarrassed that none of this stuff ever makes it to real life?
Google AI internally needs a huge culture change: stop acting like academics making things for academics and start working like developers making products for customers.
I'd say in 10 years we'll be looking back and seeing the wasted potential, but actually you can already look back around 10 years and see the wasted potential of all the things Google demoed or papered and never shipped.
People search and remember things visually, even if they're not consciously aware of it. So on the Cybershow [0] we decided to jump in and use AI images as a quick way to visually tag episodes with something meaningful and fun.
We did that despite some moral ambivalence/uneasiness around AI "art".
For example: give me a "young and exciting Dana Meadows in front of a board of systems theory".
I'm not awful at photoshopping things, and sometimes that's the only way to get a specific image one has in mind. But AI saves time and lets us concentrate on writing and researching instead.
TBH if an artist/illustrator came along and said "Let me do the episode icons even though you can't pay me yet" I'd feel inclined to ask the AI to step aside.
Google may very well be the first to create AGI, but it will be wrapped in so many "safety" layers that it will be effectively lobotomised. Let's just hope a Google AGI never gets to watch A Clockwork Orange.
AI researchers can make any claim; the risk of getting busted is close to none.
Didn't work?
Well, the dataset was different.
Didn't work?
Well, the code was different.
Can I try your work?
Well, it's proprietary / I don't have access / we shut down the cluster.
But the result is a guaranteed increase in salary and job opportunities.
Since these companies are publicly listed, they are by design incentivized to make grandiose claims to appear more attractive to investors, and they can blame individual employees if the claims are ever exposed.
My favorite was the claim that Bard (PaLM version) is sentient, but that one was too big this time.
Imagine a large pharmaceutical company claiming it can cure very important diseases while the results can be neither independently verified nor audited. That's fine in the short term, but not when you keep making the claim for years.
The point is probably the implication that it'll be pushed to Android as a native feature (in the Photos app or similar), making it sensible for investors reading this to put money into Google rather than, e.g., Stability or OpenAI. The people writing the article are likely shareholders.