Artificial Intelligence News & Discussion

I also feel unmotivated because of AI's ability to come up with diagnoses just as good as a doctor's.
Based solely on a patient's words, AI can figure the disease out.
AI may be great for detecting diseases and helping with diagnoses, but it is not always right! AI confabulates and that is where it’s still crucial for you to have that knowledge base. POB posted a great article about this on the Does ChatGPT know how to lie? thread. Sure AI can help you refine your diagnoses and help with differentials, but it is still only a tool and not a replacement. As Gaby so eloquently wrote above, it is no replacement for human connection, empathy and care. That’s your role as a physician.

And speaking of McGilchrist, there is an excellent article featuring him titled Raging Against the Machine, some of which I quote below (emphases mine):
McGilchrist said:
He tells me that he will say that “the opposite of life is not death, the opposite of life is mechanism … We are embracing the idea that we are machine-like … in the process, we’re losing our sense of wonder, we’re losing our sense of humility, we’re becoming hubristic.” And this, he reminds me, has been “from time immemorial in all the cultures of the world the fable, the myth of how we will destroy ourselves, through hubris, as Lucifer became Satan”. […]

He argues that another potential source of society’s self-destruction is individualism, which he links to secularisation, high levels of unhappiness and the ecological crisis. “Almost everything we’re taught … effectively, it’s about me, me, me … and we now find that we are the most miserable people that have ever lived. Surprise, surprise, we’ve cut ourselves off from the roots of fulfilment, which is oneness with nature, with the divine and with one another.” Solving ecological ills, he says, will follow when people see themselves as caretakers, not exploiters, of the natural world.
 
I watched the highlights of the latest Jensen Huang conference (the boss of Nvidia). The important point is that they announced a 40x efficiency gain for the next generation of their chips dedicated to running AI in the cloud.


To achieve this, they managed to build chips with integrated electrical-to-optical signal converters.


So the cost, power, and physical space needed to interconnect a bunch of chips are greatly reduced. No more external converters.

With all their money they are advancing very quickly. I think Nvidia being a fairly mature company also helps with scaling up their organisation. They have gone through rough times in the past.

The end of the conference was about robotics. The audience was mostly quiet, I found. I guess everyone is asking themselves where we are really headed.

The video.
 
I would say that this is quite impressive for a phone chip. It can run the DeepSeek R1 8B model.

 
Small Language Models Are the New Rage, Researchers Say

Large language models work well because they’re so large. The latest models from OpenAI, Meta, and DeepSeek use hundreds of billions of “parameters”—the adjustable knobs that determine connections among data and get tweaked during the training process. With more parameters, the models are better able to identify patterns and connections, which in turn makes them more powerful and accurate.
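
To make "parameters" concrete (a sketch of my own, not from the article): in a neural network they are just the trainable weights, which you can count directly. A toy example, assuming PyTorch:

```python
import torch.nn as nn

# A toy two-layer network: every weight and bias is a "parameter",
# i.e. one of the adjustable knobs tuned during training.
model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
)

n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # ~8.4 million here; frontier LLMs have hundreds of billions
```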

But this power comes at a cost. Training a model with hundreds of billions of parameters takes huge computational resources. To train its Gemini 1.0 Ultra model, for example, Google reportedly spent $191 million. Large language models (LLMs) also require considerable computational power each time they answer a request, which makes them notorious energy hogs. A single query to ChatGPT consumes about 10 times as much energy as a single Google search, according to the Electric Power Research Institute.

In response, some researchers are now thinking small. IBM, Google, Microsoft, and OpenAI have all recently released small language models (SLMs) that use a few billion parameters—a fraction of their LLM counterparts.

Small models are not used as general-purpose tools like their larger cousins. But they can excel on specific, more narrowly defined tasks, such as summarizing conversations, answering patient questions as a health care chatbot, and gathering data in smart devices. “For a lot of tasks, an 8 billion–parameter model is actually pretty good,” said Zico Kolter, a computer scientist at Carnegie Mellon University. They can also run on a laptop or cell phone, instead of a huge data center. (There’s no consensus on the exact definition of “small,” but the new models all max out around 10 billion parameters.)

To optimize the training process for these small models, researchers use a few tricks. Large models often scrape raw training data from the internet, and this data can be disorganized, messy, and hard to process. But these large models can then generate a high-quality data set that can be used to train a small model. The approach, called knowledge distillation, gets the larger model to effectively pass on its training, like a teacher giving lessons to a student. “The reason [SLMs] get so good with such small models and such little data is that they use high-quality data instead of the messy stuff,” Kolter said.
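
As an aside (my own sketch, not from the article): the classic logit-matching form of knowledge distillation can be written in a few lines. The article describes a data-generation variant, where the teacher writes a clean training set for the student, but the teacher-to-student idea is the same. Assuming PyTorch, with toy tensors standing in for real model outputs:

```python
import torch
import torch.nn.functional as F

# Hinton-style knowledge distillation loss: the student is trained to match
# the teacher's softened output distribution instead of hard labels.
teacher_logits = torch.randn(4, 32)                      # large model's predictions (toy)
student_logits = torch.randn(4, 32, requires_grad=True)  # small model's predictions (toy)
temperature = 2.0                                        # softens both distributions

soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
log_student = F.log_softmax(student_logits / temperature, dim=-1)

# KL divergence pulls the student's distribution toward the teacher's.
distill_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature**2
distill_loss.backward()  # gradients flow only into the student
print(distill_loss.item())
```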

Researchers have also explored ways to create small models by starting with large ones and trimming them down. One method, known as pruning, entails removing unnecessary or inefficient parts of a neural network—the sprawling web of connected data points that underlies a large model.

Pruning was inspired by a real-life neural network, the human brain, which gains efficiency by snipping connections between synapses as a person ages. Today’s pruning approaches trace back to a 1989 paper in which the computer scientist Yann LeCun, now at Meta, argued that up to 90 percent of the parameters in a trained neural network could be removed without sacrificing efficiency. He called the method “optimal brain damage.” Pruning can help researchers fine-tune a small language model for a particular task or environment.
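
A minimal sketch of what pruning looks like in practice (my own illustration, using simple magnitude pruning rather than LeCun's second-order method), assuming PyTorch's built-in pruning utilities:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Magnitude (L1) pruning: zero out the 90% of weights with the smallest
# absolute value, echoing the "up to 90 percent removable" figure above.
layer = nn.Linear(4096, 4096)
prune.l1_unstructured(layer, name="weight", amount=0.9)

sparsity = (layer.weight == 0).float().mean().item()
print(f"fraction of zeroed weights: {sparsity:.2f}")  # ~0.90
```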

For researchers interested in how language models do the things they do, smaller models offer an inexpensive way to test novel ideas. And because they have fewer parameters than large models, their reasoning might be more transparent. “If you want to make a new model, you need to try things,” said Leshem Choshen, a research scientist at the MIT-IBM Watson AI Lab. “Small models allow researchers to experiment with lower stakes.”

The big, expensive models, with their ever-increasing parameters, will remain useful for applications like generalized chatbots, image generators, and drug discovery. But for many users, a small, targeted model will work just as well, while being easier for researchers to train and build. “These efficient models can save money, time, and compute,” Choshen said.


LLMs No Longer Require Powerful Servers: Researchers from MIT, KAUST, ISTA, and Yandex Introduce a New AI Approach to Rapidly Compress Large Language Models without a Significant Loss of Quality

Previously, deploying large language models on mobile devices or laptops involved a quantization process that took anywhere from hours to weeks and had to be run on industrial servers to maintain good quality. Now, quantization can be completed in a matter of minutes right on a smartphone or laptop without industry-grade hardware or powerful GPUs.

HIGGS lowers the barrier to entry for testing and deploying new models on consumer-grade devices, like home PCs and smartphones, by removing the need for industrial computing power.

The innovative compression method furthers the company’s commitment to making large language models accessible to everyone, from major players, SMBs, and non-profit organizations to individual contributors, developers, and researchers. Last year, Yandex researchers collaborated with major science and technology universities to introduce two novel LLM compression methods: Additive Quantization of Large Language Models (AQLM) and PV-Tuning. Combined, these methods can reduce model size by up to 8 times while maintaining 95% response quality.

Breaking Down LLM Adoption Barriers

Large language models require substantial computational resources, which makes them inaccessible and cost-prohibitive for most. This is also the case for open-source models, like the popular DeepSeek R1, which can’t be easily deployed on even the most advanced servers designed for model training and other machine learning tasks.

As a result, access to these powerful models has traditionally been limited to a select few organizations with the necessary infrastructure and computing power, despite their public availability.

However, HIGGS can pave the way for broader accessibility. Developers can now reduce model size without sacrificing quality and run them on more affordable devices. For example, this method can be used to compress LLMs like DeepSeek R1 with 671B parameters and Llama 4 Maverick with 400B parameters, which previously could only be quantized (compressed) with a significant loss in quality. This quantization technique unlocks new ways to use LLMs across various fields, especially in resource-constrained environments. Now, startups and independent developers can leverage compressed models to build innovative products and services, while cutting costs on expensive equipment.

Yandex is already using HIGGS to prototype products and accelerate development and idea testing, as compressed models enable faster testing than their full-scale counterparts.

About the Method

HIGGS (Hadamard Incoherence with Gaussian MSE-optimal GridS) compresses large language models without requiring additional data or gradient descent methods, making quantization more accessible and efficient for a wide range of applications and devices. This is particularly valuable when there’s a lack of suitable data for calibrating the model. The method offers a balance between model quality, size, and quantization complexity, making it possible to use the models on a wide range of devices like smartphones and consumer laptops.

HIGGS was tested on the LLaMA 3.1 and 3.2-family models, as well as on Qwen-family models. Experiments show that HIGGS outperforms other data-free quantization methods, including NF4 (4-bit NormalFloat) and HQQ (Half-Quadratic Quantization), in terms of quality-to-size ratio.
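
For context (my own sketch, not part of the announcement): the NF4 baseline mentioned above is also data-free and is available today through the bitsandbytes integration in Hugging Face transformers. This is not HIGGS itself, just the kind of one-shot, calibration-free quantization it is compared against; the model name is only an example:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 (4-bit NormalFloat) quantization applied at load time.
# No calibration data or gradient descent is needed, which is
# what "data-free" means in this context.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # example model, swap for any causal LM
    quantization_config=nf4_config,
    device_map="auto",
)
```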

Developers and researchers can already access the method on Hugging Face or explore the research paper, which is available on arXiv. At the end of this month, the team will present their paper at NAACL, one of the world’s top conferences on AI.

Continuous Commitment to Advancing Science and Optimization

This is one of several papers Yandex Research presented on large language model quantization. For example, the team presented AQLM and PV-Tuning, two methods of LLM compression that can reduce a company’s computational budget by up to 8 times without significant loss in AI response quality. The team also built a service that lets users run an 8B model on a regular PC or smartphone via a browser-based interface, even without high computing power.

Beyond LLM quantization, Yandex has open-sourced several tools that optimize resources used in LLM training. For example, the YaFSDP library accelerates LLM training by as much as 25% and reduces GPU resources for training by up to 20%.

Earlier this year, Yandex developers open-sourced Perforator, a tool for continuous real-time monitoring and analysis of servers and apps. Perforator highlights code inefficiencies and provides actionable insights, which helps companies reduce infrastructure costs by up to 20%. This could translate to potential savings in millions or even billions of dollars per year, depending on company size.

 
I am playing around today using AI to code a quiz page - You Might Be a Hillbilly - Find Out! - Just a reminder that AI is not our boss yet anyway. :lol:

Geez, I don"t know how old you are, or if you know about it, back in the 90"s Jeff Foxworthy carved out a pretty good amount of fame and fortune with his book and stand up routine called "You might be a Redneck".
Just wanted to mention it, in case, you, or ol'AI might be treading on copyright territory there.:lol:
Here's a short, hilarious 6 minutes from one of his stage appearances;
 
Geez, I don"t know how old you are, or if you know about it, back in the 90"s Jeff Foxworthy carved out a pretty good amount of fame and fortune with his book and stand up routine called "You might be a Redneck".
Just wanted to mention it, in case, you, or ol'AI might be treading on copyright territory there.:lol:
Here's a short, hilarious 6 minutes from one of his stage appearances;
Not too worried about that. Not selling anything and I can take it down as fast as I put it up.
 