Mind-Blowing AI Image Generator - Give Visual Representation to C's Concepts?

i liked reading these considerations, but could not apprehend them to their full extent.

what still disturbs me in ai is that the ai representations can look like unadulterated photos from reality, but are NOT photos. to me, therefore, ALL ai outputs are fakes which dilute our real world even more into a fake world.

it appears that i have the same apprehension of ai products as painters had when photography was introduced. maybe i am too old to welcome progress, just as i hate mac updates which perturb my habits... or, maybe, i should join the amish...
Hey, don't get me wrong, AI might overall be "a bad thing" for humanity in its current state. But AI isn't so powerful that it can stop the future from being "open", as the C's say. The lessons will take their course, and if we choose to make use of our knowledge, we are exactly where we need to be.

And even if AI overall is bad or goes bad and gets used in bad ways, "the law of exception" is one idea to keep in mind. If we can properly anticipate the dangers and "dance with the universe", there might be possibilities and opportunities to be of service hidden in the cracks along the road! Or not. But in any case, humble knowledge about this strange thing that's emerging can only help us realize its proper place, I tend to think. And a little lighthearted play along the way probably won't hurt the learning process, if awareness can be brought to bear as well.

Probably no need to join the Amish, though if you'd like a fascinating read that mentions them, try "What Technology Wants" by Kevin Kelly. I don't entirely agree with the author's philosophy and conclusions, and his comments about the Amish end up being a bit of a self-contradiction, as if he hadn't thought through what he was saying deeply enough, but that book gave me exactly what I had hoped to get from it: a few new useful concepts to use as tools when thinking about information, learning, "evolution", and technology.

On the practical level, the way I see it is: the emergence of this technology is part of a powerful current, and we don't know how it's going to twist and turn and change. Trying to stop it won't work, but if one can learn to "surf the wave", there's no telling what kind of butterfly effects we might be able to contribute, so maybe fewer might be pulled under. Who knows?

I'm not decided on all of this by any means. But it's fair to note that I'm part of the millennial generation (and test high in trait "openness"), so I might be more likely to respond with curiosity. Hopefully I'm not too credulous!

I think Jordan Peterson is taking an interesting approach. Based on what he has said elsewhere (I follow him to a fair degree), I think he has been deliberately engaging with AI tech because he considers that he can only determine and exercise his responsibility in relation to it if he understands it. So it seems like he has been "playing" with it, but while trying to maintain a serious awareness of its unknown potential danger. I do have to wonder if he's a little too "gung-ho", but I tend to trust that he's thinking about things from more angles than he can always convey in a given conversation.

This clip is interesting:

And this entire talk is really something else:

It seems to me he's trying to get a handle on things because he sees the emergence of AI as a significant juncture that will tend to magnify our individual and collective orientations. "Be the change you want to see in the world", and AI might magnify that. Or not. But it seems he is "playing with it" with an interesting mixture of detailed seriousness and lighthearted playfulness. I mean, here he is giving talks on the Bible on one side, and training an LLM on the Bible on the other! Who has the gall to do that?! :-P But I think in his case specifically it might be a testament (pardon pun) to a deeper faith that he has in the learning process compared to both dogmatic religious types AND materialist "nihilizers". One can hope. 🤷‍♂️

It's also possible comets come and wipe out all "high technology" and that unknown "holistic technologies" involving sound and stone and mind are the ways of future evolution. BUT, we only weaken ourselves if we make unnecessary assumptions based on 3D thinking and fears. Presumably the path from where we are to the "healthy" future we are trying to envision in this forum will be no less complicated and unexpected than the rest of our lives have been! "Seek ye first the kingdom of Heaven..."

A little abstract in places, but hopefully I've captured my thoughts well enough...
 
i like your detailed assessment, and i simply want to point out that we are at 2 extremes of life: you are young, i am old. the future belongs to you. thanks for the kelly hint. kind regards.
 
okay so this is interesting... this is a kind of covert expansion, but mostly used for the "simplification" of human plasticity and of the capability to think beyond linear 3D time and space... This is how AI can better control and "cage" the cosmic mind potential within humans and humanoid creatures capable of connecting to cosmic knowledge.

 
Interesting! Return of the megaliths and the mother tree? :-P

Also, the first image has a faint "artifact" that looks like the reflection of a glasses lens. No doubt such reflections exist in the training data somewhere. "Eyes to see" is what comes to mind.

In terms of positive uses of this technology, I'm hoping maybe it will help us find new insights into the "concept space" of mankind.

Yesterday I had an interesting thought. When we form healthy attachments to another person, one part of what happens -- which no doubt has a physical-level reflection/aspect in our brain -- is that we receive a sort of internal "duplication" or "imprinting" of their essence, as we see it at least. The voice that stays with us becomes part of our wisdom. I suspect there is a part of our bodily "machine" brain that is not that different from GPT text-generation AI. A "predictive modeler" of the world. Our "frozen map", which must periodically be updated by the influx of creative/destructive renewal.

It's possible that on the physical level of the brain, part of our "imprint" of another person is akin to a neural-network "model" of them, from our experience. Unconscious, multidimensional, holographic.

It's possible that, to some degree, the GPT text AIs that are coming out might represent something like that kind of "imprint", but for the entirety of the human productions used to train the AI. The same concept could apply to Stable Diffusion image models.

Just a wild thought, and it doesn't imply any assumptions about whether AI is "good" or not. An imprint can be positive, or negative, or both/neither.

Thought that was an interesting thought. 🤔

[Edit: corrected spacing.]
 
As I understood it, it uses existing images from around the web, and then combines them into new images (that's maybe where the glasses lens comes from). I start with one sentence and then see what it makes of it. Then another, and another... and some "style" at the end. It's interesting that it gets better with each try, as if it learns in the process.
 
I've read a decent amount about the tech, at least for a layperson, so I can help clarify some things about how it works.

Basically, behind the scenes, before it shows your picture, it starts from a random pixel noise image, just color pixel noise. And it "denoises" that picture to try and create a picture that seems "plausible", guided by the text, i.e. to the best of its ability it recreates the mathematical information "patterns" that it "learned" from images in its training database.

It doesn't really combine images from the internet. It's more like it combines the concepts it learned from them, the concepts it learned to connect the training image tag text with the images themselves.
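In case anyone wants to poke at this on their own machine, here's a minimal sketch of what running it looks like with the open-source diffusers library. The checkpoint name, prompt, and settings are just examples I'm assuming for illustration, and you'd need a graphics card with enough memory:

```python
# Minimal text-to-image sketch with the open-source diffusers library.
# The checkpoint id and prompt are only examples; assumes a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example Stable Diffusion checkpoint
    torch_dtype=torch.float16,          # load the weights in half precision
).to("cuda")

# Behind the scenes the pipeline starts from random latent noise and runs the
# denoising network a few dozen times, each step nudging the noise toward an
# image that fits the text prompt.
image = pipe(
    "bush of red roses in an old terracotta pot",
    num_inference_steps=30,
).images[0]
image.save("roses.png")
```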

You can run one of these AIs on a decent graphics-oriented home computer with no internet connection, and the AI "model" file is only about 4 GB, sometimes 2 GB. However, this model was trained on about 2.3 billion image/text pairs, each image being a 512x512 .png.

Sadly, I wasn't able to find data on the actual total file size of that training database, but if I guess that the average 512x512 .png file is ~500 kilobytes and multiply that by 2,300,000,000 (2.3 billion), the figure I get is 1,150,000 gigabytes.

So the AI file is roughly 270,000 times smaller than the original database. There's not enough room in there to store the actual original images; they've been reduced to some kind of lossy "holographic"/conceptual representation, or something sort of like that.
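For anyone who wants to check my back-of-the-envelope math, here it is spelled out. The 500 KB average and the ~4.3 GB model file size are my own assumptions, not measured values:

```python
# Rough compression ratio of training data to model file, using assumed figures.
pairs = 2_300_000_000                 # ~2.3 billion image/text pairs
avg_image_kb = 500                    # assumed average size of a 512x512 .png
dataset_gb = pairs * avg_image_kb / 1_000_000   # KB -> GB (decimal units)
model_gb = 4.3                        # assumed approximate size of the model file

print(f"dataset ~ {dataset_gb:,.0f} GB")           # ~1,150,000 GB
print(f"ratio   ~ {dataset_gb / model_gb:,.0f}x")  # ~267,000x smaller
```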

Fun fact! The 2 GB models are just compressed versions of the 4 GB ones, but they only perform about 10% worse. Some people run the 2 GB ones because they fit on a graphics card with less memory.
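If I understand correctly, the usual way those smaller files are made is by storing the same weights as 16-bit instead of 32-bit numbers. A rough sketch of the idea, assuming a plain PyTorch .ckpt file with a "state_dict" entry, where the file names are just placeholders:

```python
# Sketch: make a "half-size" copy of a checkpoint by casting 32-bit float
# weights down to 16-bit. File names are placeholders.
import torch

state = torch.load("sd-v1-4.ckpt", map_location="cpu")["state_dict"]
half = {k: (v.half() if v.dtype == torch.float32 else v) for k, v in state.items()}
torch.save({"state_dict": half}, "sd-v1-4-fp16.ckpt")
```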


Also, with reference to your last sentence about "learning": some models perform better with longer prompts, some perform better with shorter ones. Most give better results the more specific you get. The AI model itself is what is called a "frozen" network. Unless someone's pulling off a truly incredible lie, these networks don't "learn" while you use them; they just react in complex ways to the text you give them (more specific prompts giving them more direction). There's a very active open-source community working with and training these models, so if the AI files were changing while people were using them, it would be noticed pretty fast. But the model files don't change: they keep the exact same "hash" code before and after use, which means that zero bits in the AI file have changed.
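If anyone wants to verify that "frozen" claim for themselves, it's easy: hash the model file, generate some images, and hash it again. A small sketch, where the file name is just a placeholder:

```python
import hashlib

def file_sha256(path, chunk_size=1 << 20):
    """Hash a large model file in 1 MB chunks so it needn't fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

before = file_sha256("sd-v1-4.ckpt")
# ... run the image generator with this model file ...
after = file_sha256("sd-v1-4.ckpt")
print("unchanged:", before == after)   # True -> not a single bit has changed
```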

That doesn't rule out wild hyperdimensional soul imprint stuff going on, I just want to hopefully clear up some confusion, since there is a lot of mythology flying around about how these AIs work, leading to a lot of magical thinking in the general public. And this is being taken advantage of by the "anti-AI" folk, who I think are partly being leveraged by corps who want to become the sole arbiters of AI's legal use.

[Edit: spacing error]
 
Thanks for this great clarification. But it makes it altogether more interesting. I could swear that the images look better, more precise, more true with each new "generate", even with the same prompt.

On the other hand, it is really not that intelligent. It can understand and use only a couple of concepts per image. For example, for "bush of red roses in an old terracotta pot under the old staircase with the iron railing" you get a very nice picture, but if you add to it, "ceramic jug full of coffee in front of a bush of red roses in an old terracotta pot under the old staircase with the iron railing", you get a really silly, mixed-up image.
 
Yes. I think part of the reason is that the neural network is a "feed-forward" type. That is, to my knowledge there is little or no recursion in its network; information does not move from later steps back to an "earlier" point in the network to further refine or "re-think" parts of the image. I think that, as a consequence, it has a limited "bandwidth" for how many concepts it can juggle at a time. Give it more, or give it concepts it doesn't have well separated in its "mind", and it tends to fuse them, or use them more like implicit/unconscious influences rather than something to represent concretely.

I remember once someone gave one of these modern image generators -- I forget if it was SD specifically -- a prompt along the lines of "salmon in a river", and it generated multiple images of salmon steaks and cutlets floating through the water. LOL!
 