Artificial Intelligence News & Discussion

I use Whisper regularly and I have never had such a problem. But the model was not trained specifically on medical vocabulary, so I'm not surprised. This is using a tool beyond its capacity. The people who did that are not very smart.
The problem with all these generative models is that they never know when they're unsure of something. This includes audio transcription models like Whisper. If you say something incomprehensible, it will often make up something it "thinks" you said. Humans do this too, to some degree, when they mishear something. But people will often say "that made no sense," or "I don't know what I heard," or "can you spell that for me and explain what this term is," while the AI will just roll with its best guess, no matter how wildly incorrect. Kids often do stuff like this, and certainly many adults as well. But not all, and what makes some people question things while others just accept them at face value isn't fully explained by science. Could it be an IQ thing? Could it be past life experiences and lessons? Why are some people so naive and make stuff up while others aren't?
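For what it's worth, the open-source Whisper package does expose per-segment statistics that can serve as a rough confidence signal, even though the model itself never says "I'm not sure." A minimal sketch, assuming the `openai-whisper` Python package; the file name and thresholds here are placeholders, not recommendations:

```python
# Rough sketch: flag low-confidence Whisper segments instead of trusting every word.
# Assumes the open-source "openai-whisper" package; "interview.wav" and the
# thresholds below are placeholder values.
import whisper

model = whisper.load_model("base")
result = model.transcribe("interview.wav")

for seg in result["segments"]:
    # Low average log-probability or high no-speech probability = worth a human check.
    suspicious = seg["avg_logprob"] < -1.0 or seg["no_speech_prob"] > 0.5
    flag = "CHECK" if suspicious else "ok"
    print(f"[{flag}] {seg['start']:.1f}-{seg['end']:.1f}s: {seg['text'].strip()}")
```

It doesn't make the model honest about uncertainty, but it at least marks the segments most likely to be made up.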

Right now their entire bet is on one thing: bigger AI (more parameters) fed more data. They're hoping that by giving it a bigger brain and exposing it to more data, some critical threshold will be reached and things like wisdom and common sense will suddenly show up. And that idea isn't entirely unfounded, because it is supported by something called "emergent abilities": by scaling up the models, the data, and the compute, researchers found that new abilities suddenly appeared that weren't explicitly trained into the models and didn't exist in smaller models. There are several research papers on this topic.

And supposedly you can't predict when (at what model scale, etc.) any given ability will emerge, or how strong it will be when it does emerge. But not everyone agrees that this is a real thing either - some argue that the abilities only *appear* to emerge because the tests for those abilities aren't sensitive enough to pick up gradual progress toward them - kind of like a teacher who never gives partial credit, even when an answer is 99% correct, and only scores it as correct when it's 100%. That kind of test can't distinguish students who got no part of the answer right from those who got it almost entirely right.
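A toy calculation makes that measurement argument concrete: if per-token accuracy improves smoothly with scale, a strict all-or-nothing exact-match score over a multi-token answer can still look like a sudden jump. All the numbers below are invented for illustration.

```python
# Toy illustration of the "no partial credit" argument: per-token accuracy
# improves smoothly with scale, but exact match on a 10-token answer only
# rises once every token is right at once, so it looks like a sharp emergence.
import numpy as np

answer_len = 10                            # hypothetical answer length in tokens
scales = np.logspace(0, 4, 20)             # pretend model scale (arbitrary units)
per_token_acc = scales / (scales + 100)    # smooth, gradual improvement

exact_match = per_token_acc ** answer_len  # all tokens must be correct simultaneously

for s, p, e in zip(scales, per_token_acc, exact_match):
    print(f"scale={s:9.1f}  per-token acc={p:.3f}  exact-match={e:.4f}")
```

The underlying skill improves steadily the whole time; only the strict metric makes it look like nothing happens and then everything happens.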

But can anything and everything "emerge" in such a way? Probably not, but it will be interesting to see where all this goes.
 
Yeah, once, by accident, I fed Italian audio into Whisper with the French option selected. To my amazement, it output something perfectly intelligible. I'm not sure the translation was entirely correct, but it more or less was. The two languages are not that far from each other, so I guess that played a role. In any case, the AI did not stop.
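For reference, the open-source Whisper package lets you force the decoding language exactly like that, so the experiment is easy to reproduce; the file name below is a placeholder:

```python
# Sketch of forcing a decoding language in the open-source "openai-whisper"
# package; "italian_clip.wav" is a placeholder file name.
import whisper

model = whisper.load_model("base")

# Let Whisper auto-detect the language...
auto = model.transcribe("italian_clip.wav")
print(auto["language"], auto["text"][:100])

# ...versus forcing French decoding on Italian audio, as described above.
forced = model.transcribe("italian_clip.wav", language="fr")
print(forced["text"][:100])
```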

About bigger models: today I was reading an article about the increasing "learning power" needed by each new AI generation to make fewer errors than the previous one. The growth is exponential, and the curve fits the prediction perfectly.
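If the article was describing the usual power-law scaling results, the arithmetic behind that exponential cost is easy to sketch. The exponent below is an assumed ballpark (roughly the order of magnitude reported in neural scaling-law papers), not a figure from the article.

```python
# Back-of-the-envelope for why error reduction gets so expensive.
# Assumes a power-law scaling of loss with compute, L(C) ~ (C0 / C)**alpha,
# with alpha around 0.05; the exact value is an assumption for illustration.
alpha = 0.05

def compute_multiplier(loss_reduction_factor: float, alpha: float = alpha) -> float:
    """How much more compute is needed to divide the loss by this factor."""
    return loss_reduction_factor ** (1.0 / alpha)

for k in (1.1, 1.5, 2.0):
    print(f"cut loss by {k}x  ->  ~{compute_multiplier(k):.3g}x more compute")
```

With an exponent that small, even halving the loss works out to roughly a million times more compute, which is why each generation's training budget balloons.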

There was an interesting comment about this problem by a physicist:

I'm a physicist. This is just the intersection of linear algebra and information encoding. LLMs = data compression & lookup. But they do it all via maps 100% isomorphic to basic linear algebra... pretty dumb stuff. No real learning, just a trivial map of concepts expressed in language (your brain uses highly nonlinear chemical and feedback systems in its encoding schemes; it's possible the microtubules may even make use of QM processes). For non-trivial applications of statistical mechanics to encoding problems, see the work on genetic codes by T. Tlusty (on arXiv).

The language encodes non-trivial, nonlinear concepts in syntax + vocabulary; the approximate linear fit to that system is not the same thing. This scaling is trivial and not anything special or meaningful, i.e. they are not using their entropy calculations in an intelligent way here. It's actually very sad what's going on. From a physical perspective, AI is 100% hype and 300% BS... well into "not even wrong" territory, just weaponized overfitting in excessively high-dimensional models.

This is frustrating because they could use this compute to really learn something quantitative about our species' use of language and how it works with our biology... instead they use vast resources to learn nothing, creating a system that randomly chooses a context-thread through a train of thought without any internal mapping to an underlying, evolving narrative of sequenced thought. In short, they squash the time axis into a flat map of ideas or concepts (like a choose-your-own-adventure with a random, data-dependent map providing interpolation between the various choices... a fixed percentage of which will be "hallucinations", because unlike real error-correcting codes or naturally occurring encoding schemes, this has no built-in dynamic feedback mechanism for error detection and correction).

The structure of our language and the meaning it represents allows you to formulate absurd sentences and concepts, so we don't "hallucinate" unless we want to... we even tolerate absurdity as a meaningful subclass of encodings, i.e. humorous language. The way the neural networks are trained precludes any of these reflexive or error-correcting representations, as their complexity would necessarily grow exponentially with the data set. We cheat because we have hardwired physical laws into the operation of our neural networks that serve as calibration and as objective, precise maps to ground truth (your brain can learn to throw and catch without thinking, or learn to solve predictive, anticipatory Lagrangian dynamics on the fly: aka energy management in a dogfight, even defeating systems that follow optimal control laws and operate with superior initial energy states, aka guided missiles).

You can even train systems like LLMs (i.e. deep learning) to solve some pretty hard equations on specific domains, but the mathematics places hard limits on the error of these maps (like an abstract impedance mismatch, but worse)... you can even use this to make reasonable control laws for systems that satisfy specific stability constraints... but Lyapunov will always win in the end. This isn't a case of trying to map SU(2) onto SO(3). It's like trying to map the plane onto a sphere without explicitly handling the topological defect and saying you don't really care about it anyway. With this approach you're going to end up dealing with unpredictable errors at all orders and have no way of estimating them a priori... unfortunately, enthusiasm and resources exceed the education in both physics and math for these efforts. The guys doing this stuff simply don't know what they don't know... but they should. The universities are failing our students.
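A toy analogy for that last point about error limits: a smooth least-squares fit can look excellent on the domain it was fit on and then degrade without warning the moment you leave it. Everything in this sketch (the function, the polynomial degree, the ranges) is an arbitrary illustration, not the commenter's own example.

```python
# Toy analogy: a least-squares polynomial approximates sin(x) well where it
# was fit, then its error grows rapidly outside that domain, with nothing in
# the fit itself warning you about it.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 2 * np.pi, 200)
y_train = np.sin(x_train) + rng.normal(0, 0.01, x_train.size)

coeffs = np.polyfit(x_train, y_train, deg=7)   # a smooth "compressed" fit

for x in (np.pi, 2 * np.pi, 3 * np.pi, 4 * np.pi):
    err = abs(np.polyval(coeffs, x) - np.sin(x))
    region = "inside fit range" if x <= 2 * np.pi else "outside fit range"
    print(f"x = {x:5.2f} ({region}): |error| = {err:.3g}")
```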
 