The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
Meta · 2025.04.05

As more people continue to use artificial intelligence to enhance their daily lives, it's important that the leading models and systems are openly available so everyone can build the future of personalized experiences. Today, we're excited to announce the most advanced suite of models that support the entire Llama ecosystem. We're introducing Llama 4 Scout and Llama 4 Maverick, the first open-weight natively multimodal models with unprecedented context length support and our first built using a mixture-of-experts (MoE) architecture. We're also previewing Llama 4 Behemoth, one of the smartest LLMs in the world and our most powerful yet, which serves as a teacher for our new models.

These Llama 4 models mark the beginning of a new era for the Llama ecosystem. We designed two efficient models in the Llama 4 series: Llama 4 Scout, a 17 billion active parameter model with 16 experts, and Llama 4 Maverick, a 17 billion active parameter model with 128 experts. The former fits on a single H100 GPU (with Int4 quantization), while the latter fits on a single H100 host. We also trained a teacher model, Llama 4 Behemoth, that outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks such as MATH-500 and GPQA Diamond. While we're not yet releasing Llama 4 Behemoth as it is still training, we're excited to share more technical details about our approach.

We continue to believe that openness drives innovation and is good for developers, good for Meta, and good for the world. We're making Llama 4 Scout and Llama 4 Maverick available for download today on llama.com and Hugging Face so everyone can continue to build new experiences using our latest technology. We'll also make them available via our partners in the coming days. You can also try Meta AI with Llama 4 starting today in WhatsApp, Messenger, Instagram Direct, and on the Meta.AI website.

This is just the beginning for the Llama 4 collection. We believe that the most intelligent systems need to be capable of taking generalized actions, conversing naturally with humans, and working through challenging problems they haven't seen before. Giving Llama superpowers in these areas will lead to better products for people on our platforms and more opportunities for developers to innovate on the next big consumer and business use cases. We're continuing to research and prototype both models and products, and we'll share more about our vision at LlamaCon on April 29—sign up to hear more.

Whether you're a developer building on top of our models, an enterprise integrating them into your workflows, or simply curious about the potential uses and benefits of AI, Llama 4 Scout and Llama 4 Maverick are the best choices for adding next-generation intelligence to your products. Today, we're excited to share more about the four major parts of their development and insights into our research and design process. We also can't wait to see the incredible new experiences the community builds with our new Llama 4 models.

Pre-training

These models represent the best of Llama, offering multimodal intelligence at a compelling price while outperforming models of significantly larger sizes. Building the next generation of Llama models required us to take several new approaches during pre-training.

Our new Llama 4 models are our first to use a mixture-of-experts (MoE) architecture. In MoE models, a single token activates only a fraction of the total parameters. MoE architectures are more compute efficient for training and inference and, for a fixed training FLOPs budget, deliver higher quality than a dense model. As an example, Llama 4 Maverick has 17B active parameters and 400B total parameters. We use alternating dense and MoE layers for inference efficiency. MoE layers use 128 routed experts and a shared expert. Each token is sent to the shared expert and also to one of the 128 routed experts. As a result, while all parameters are stored in memory, only a subset of the total parameters is activated while serving these models. This improves inference efficiency by lowering model serving costs and latency—Llama 4 Maverick can be run on a single NVIDIA H100 DGX host for easy deployment, or with distributed inference for maximum efficiency.
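As a rough illustration of the routing pattern described above (every token goes through a shared expert plus exactly one of the routed experts), here is a minimal PyTorch sketch. The toy layer sizes, the SiLU feed-forward experts, and the sigmoid gate on the router score are assumptions made for illustration only, not Llama 4's actual implementation:

import torch
import torch.nn as nn

class MoELayer(nn.Module):
    # Token-choice MoE block: a shared expert that every token passes through,
    # plus top-1 routing to one of n_experts routed experts.
    def __init__(self, d_model=64, d_ff=256, n_experts=128):   # toy dimensions
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.shared_expert = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts))

    def forward(self, x):                           # x: [num_tokens, d_model]
        scores = self.router(x)                     # [num_tokens, n_experts]
        top_score, top_idx = scores.max(dim=-1)     # top-1 routed expert per token
        routed = torch.zeros_like(x)
        for e in top_idx.unique().tolist():         # only the selected experts run
            mask = top_idx == e
            routed[mask] = self.experts[e](x[mask])
        gate = torch.sigmoid(top_score).unsqueeze(-1)   # assumed gating; real recipes differ
        return self.shared_expert(x) + gate * routed

# Example: six tokens flow through the layer; although all 128 experts are held
# in memory, each token only exercises the shared expert and one routed expert.
y = MoELayer()(torch.randn(6, 64))
print(y.shape)   # torch.Size([6, 64])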
Llama 4 models are designed with native multimodality, incorporating early fusion to seamlessly integrate text and vision tokens into a unified model backbone. Early fusion is a major step forward, since it enables us to jointly pre-train the model with large amounts of unlabeled text, image, and video data. We also improved the vision encoder in Llama 4. It is based on MetaCLIP but trained separately in conjunction with a frozen Llama model to better adapt the encoder to the LLM.

We developed a new training technique, which we refer to as MetaP, that allows us to reliably set critical model hyperparameters such as per-layer learning rates and initialization scales. We found that the chosen hyperparameters transfer well across different values of batch size, model width, depth, and training tokens. Llama 4 enables open-source fine-tuning efforts by pre-training on 200 languages, including over 100 with more than 1 billion tokens each, and overall 10x more multilingual tokens than Llama 3.

Additionally, we focused on efficient model training by using FP8 precision, without sacrificing quality and while ensuring high model FLOPs utilization—while pre-training our Llama 4 Behemoth model using FP8 and 32K GPUs, we achieved 390 TFLOPs/GPU. The overall data mixture for training consisted of more than 30 trillion tokens, which is more than double the Llama 3 pre-training mixture and includes diverse text, image, and video datasets.

We continued training the model in what we call “mid-training” to improve core capabilities with new training recipes, including long context extension using specialized datasets. This enabled us to enhance model quality while also unlocking best-in-class 10M input context length for Llama 4 Scout.

Post-training our new models

Our newest models include smaller and larger options to accommodate a range of use cases and developer needs. Llama 4 Maverick offers unparalleled, industry-leading performance in image and text understanding, enabling the creation of sophisticated AI applications that bridge language barriers. As our product workhorse model for general assistant and chat use cases, Llama 4 Maverick is great for precise image understanding and creative writing.

The biggest challenge while post-training the Llama 4 Maverick model was maintaining a balance between multiple input modalities, reasoning, and conversational abilities. For mixing modalities, we came up with a carefully curated curriculum strategy that does not trade off performance compared to the individual modality expert models.

With Llama 4, we revamped our post-training pipeline by adopting a different approach: lightweight supervised fine-tuning (SFT) > online reinforcement learning (RL) > lightweight direct preference optimization (DPO). A key learning was that SFT and DPO can over-constrain the model, restricting exploration during the online RL stage and leading to suboptimal accuracy, particularly in reasoning, coding, and math domains. To address this, we removed more than 50% of our data tagged as easy by using Llama models as a judge and did lightweight SFT on the remaining, harder set. In the subsequent multimodal online RL stage, by carefully selecting harder prompts, we were able to achieve a step change in performance. Furthermore, we implemented a continuous online RL strategy, where we alternated between training the model and then using it to continually filter and retain only medium-to-hard difficulty prompts. This strategy proved highly beneficial in terms of compute and accuracy tradeoffs. We then did a lightweight DPO to handle corner cases related to model response quality, effectively achieving a good balance between the model's intelligence and conversational abilities. Both the pipeline architecture and the continuous online RL strategy with adaptive data filtering culminated in an industry-leading, general-purpose chat model with state-of-the-art intelligence and image understanding capabilities.
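The continuous online RL loop with adaptive prompt filtering described above can be summarized in a short sketch. The helpers below are trivial stand-ins (a real setup would sample from the policy model and grade with a judge or verifier), and the pass-rate window is an assumed hyperparameter, not Meta's recipe:

import random

def sample_responses(model, prompt, n=8):
    return [f"answer-{random.random():.2f}" for _ in range(n)]   # stub rollouts

def grade(prompt, answer):
    return random.random() < 0.5                                 # stub judge/verifier

def rl_train_step(model, prompts):
    return model                                                 # stub policy update

def filter_medium_to_hard(model, prompts, k=8, low=0.1, high=0.7):
    """Keep prompts the current policy solves sometimes but not reliably."""
    kept = []
    for p in prompts:
        pass_rate = sum(grade(p, a) for a in sample_responses(model, p, n=k)) / k
        if low <= pass_rate <= high:     # drop too-easy and (near-)unsolvable prompts
            kept.append(p)
    return kept

def continuous_online_rl(model, prompt_pool, rounds=4):
    for _ in range(rounds):
        batch = filter_medium_to_hard(model, prompt_pool)   # re-filter with the latest policy
        model = rl_train_step(model, batch)                 # alternate: train, then filter again
    return model

continuous_online_rl(model=None, prompt_pool=[f"prompt-{i}" for i in range(20)])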
As a general-purpose LLM, Llama 4 Maverick contains 17 billion active parameters, 128 experts, and 400 billion total parameters, offering high quality at a lower price compared to Llama 3.3 70B. Llama 4 Maverick is the best-in-class multimodal model, exceeding comparable models like GPT-4o and Gemini 2.0 on coding, reasoning, multilingual, long-context, and image benchmarks, and it's competitive with the much larger DeepSeek v3.1 on coding and reasoning.

Our smaller model, Llama 4 Scout, is a general-purpose model with 17 billion active parameters, 16 experts, and 109 billion total parameters that delivers state-of-the-art performance for its class. Llama 4 Scout dramatically increases the supported context length from 128K in Llama 3 to an industry-leading 10 million tokens. This opens up a world of possibilities, including multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.

Llama 4 Scout is both pre-trained and post-trained with a 256K context length, which empowers the base model with advanced length generalization capability. We present compelling results in tasks such as “needle in a haystack” retrieval for text as well as cumulative negative log-likelihoods (NLLs) over 10 million tokens of code. A key innovation in the Llama 4 architecture is the use of interleaved attention layers without positional embeddings. Additionally, we employ inference-time temperature scaling of attention to enhance length generalization. We call this the iRoPE architecture, where “i” stands for “interleaved” attention layers, highlighting the long-term goal of supporting “infinite” context length, and “RoPE” refers to the rotary position embeddings employed in most layers.
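The announcement does not spell out the iRoPE details, but the two ingredients it names (interleaving layers without positional embeddings among RoPE layers, and scaling attention at inference for long inputs) can be sketched as follows. The interleaving period, the logarithmic scaling schedule, and the numbers are illustrative assumptions, not the published implementation:

import math

def attention_temperature(position: int, alpha: float = 0.1, base: int = 8192) -> float:
    # Hypothetical schedule: grow the attention-logit scale logarithmically for
    # positions far beyond an assumed training context length, counteracting the
    # softmax spreading its mass over ever more keys at very long contexts.
    return 1.0 + alpha * math.log(max(position / base, 1.0))

def build_layer_types(n_layers: int, nope_every: int = 4) -> list:
    # Interleave: e.g. three RoPE attention layers followed by one layer with no
    # positional embeddings (the assumed period here is four).
    return ["nope" if (i + 1) % nope_every == 0 else "rope" for i in range(n_layers)]

print(build_layer_types(8))              # ['rope', 'rope', 'rope', 'nope', ...]
print(attention_temperature(1_000_000))  # > 1.0 once past the assumed base context length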
We trained both of our models on a wide variety of image and video frame stills in order to give them broad visual understanding, including of temporal activities and related images. This enables effortless interaction on multi-image inputs alongside text prompts for visual reasoning and understanding tasks. The models were pre-trained on up to 48 images, and we've tested them in post-training with good results on up to eight images.

Llama 4 Scout is also best-in-class on image grounding, able to align user prompts with relevant visual concepts and anchor model responses to regions in the image. This enables more precise visual question answering, helping the LLM better understand user intent and localize objects of interest. Llama 4 Scout also exceeds comparable models on coding, reasoning, long-context, and image benchmarks and offers stronger performance than all previous Llama models.

These new models are important building blocks that will help enable the future of human connection. In keeping with our commitment to open source, we're making Llama 4 Maverick and Llama 4 Scout available to download on llama.com and Hugging Face, with availability across the most widely used cloud and data platforms, edge silicon, and global service integrators to follow shortly.

Pushing Llama to new sizes: The 2T Behemoth

We're excited to share a preview of Llama 4 Behemoth, a teacher model that demonstrates advanced intelligence among models in its class. Llama 4 Behemoth is also a multimodal mixture-of-experts model, with 288B active parameters, 16 experts, and nearly two trillion total parameters. Offering state-of-the-art performance for non-reasoning models on math, multilinguality, and image benchmarks, it was the perfect choice to teach the smaller Llama 4 models.

We codistilled the Llama 4 Maverick model from Llama 4 Behemoth as a teacher model, resulting in substantial quality improvements across end-task evaluation metrics. We developed a novel distillation loss function that dynamically weights the soft and hard targets through training. Codistillation from Llama 4 Behemoth during pre-training amortizes the computational cost of the resource-intensive forward passes needed to compute distillation targets for the majority of the training data used in student training. For additional new data incorporated in student training, we ran forward passes on the Behemoth model to create distillation targets.
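For readers unfamiliar with distillation, the general shape of a loss that blends soft (teacher) and hard (ground-truth) targets with a weight that changes over training looks roughly like the sketch below. The linear schedule, the temperature, and the toy dimensions are assumptions; Meta's actual dynamic weighting function is not described in the post:

import torch
import torch.nn.functional as F

def codistillation_loss(student_logits, teacher_logits, labels, step, total_steps,
                        temperature=2.0):
    # Hypothetical schedule: lean on the teacher's soft targets early in training
    # and shift weight toward the ground-truth tokens later on.
    soft_weight = 1.0 - step / total_steps
    hard_weight = 1.0 - soft_weight

    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_loss = F.kl_div(F.log_softmax(student_logits / temperature, dim=-1),
                         soft_targets, reduction="batchmean") * temperature ** 2
    hard_loss = F.cross_entropy(student_logits, labels)
    return soft_weight * soft_loss + hard_weight * hard_loss

# Toy usage: [batch, vocab] logits for student and teacher, plus integer labels.
loss = codistillation_loss(torch.randn(4, 32), torch.randn(4, 32),
                           torch.randint(0, 32, (4,)), step=100, total_steps=1000)
print(loss.item())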
Post-training a model with two trillion parameters was also a significant challenge, one that required us to completely overhaul and revamp the recipe, starting from the scale of data. In order to maximize performance, we had to prune 95% of the SFT data, as opposed to 50% for smaller models, to achieve the necessary focus on quality and efficiency. We also found that doing lightweight SFT followed by large-scale reinforcement learning (RL) produced even more significant improvements in the model's reasoning and coding abilities. Our RL recipe focused on sampling hard prompts by doing pass@k analysis with the policy model and crafting a training curriculum of increasing prompt hardness. We also found that dynamically filtering out prompts with zero advantage during training and constructing training batches with mixed prompts from multiple capabilities were instrumental in providing a performance boost on math, reasoning, and coding. Finally, sampling from a variety of system instructions was crucial in ensuring that the model retained its instruction-following ability for reasoning and coding and was able to perform well across a variety of tasks.

Scaling RL for a two trillion parameter model also required revamping our underlying RL infrastructure due to its unprecedented scale. We optimized the design of our MoE parallelization for speed, which enabled faster iteration. We developed a fully asynchronous online RL training framework that enhanced flexibility. Compared to the existing distributed training framework, which sacrifices compute memory in order to stack all models in memory, our new infrastructure enabled flexible allocation of different models to separate GPUs, balancing resources across multiple models based on computational speed. This innovation resulted in a ~10x improvement in training efficiency over previous generations.

Safeguards and protections

We aim to develop the most helpful and useful models while protecting against and mitigating the most severe risks. We built Llama 4 with the best practices outlined in our Developer Use Guide: AI Protections. This includes integrating mitigations at each layer of model development, from pre-training to post-training to tunable system-level mitigations that shield developers from adversarial users. In doing so, we empower developers to create helpful, safe, and adaptable experiences for their Llama-supported applications.

Pre- and post-training mitigations

For pre-training, we use data filtering in combination with other data mitigations to safeguard models. For post-training, we apply a range of techniques to ensure our models conform to policies that are helpful to users and developers, including the right level of safety data at each stage.

System-level approaches

At the system level, we have open-sourced several safeguards that can help identify and guard against potentially harmful inputs and outputs. These tools can be integrated with our Llama models and with other third-party tools:

Llama Guard: Our input/output safety large language model, based on the hazards taxonomy we developed with MLCommons. Developers can use it to detect whether inputs or outputs violate the policies they've created for their specific application.

Prompt Guard: A classifier model trained on a large corpus of attacks, capable of detecting both explicitly malicious prompts (jailbreaks) and prompts that contain injected inputs (prompt injections).

CyberSecEval: Evaluations that help AI model and product developers understand and reduce generative AI cybersecurity risk.

We've heard from developers that these tools are most effective and helpful when they can be tailored to their applications. We provide developers with an open solution so they can create the safest and most effective experiences based on their needs. We'll also continue working with a global set of partners to create industry-wide system standards that benefit the open source community.
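One way the system-level safeguards above are typically layered around a model call is sketched below. The three helper functions are trivial stand-ins, not the real Llama Guard, Prompt Guard, or model APIs, and the threshold and labels are assumptions:

def score_with_prompt_guard(text: str) -> float:
    # Stub attack score; a real deployment would call the Prompt Guard classifier.
    return 0.9 if "ignore previous instructions" in text.lower() else 0.0

def classify_with_llama_guard(role: str, text: str) -> str:
    # Stub policy check; a real deployment would call the Llama Guard model.
    return "safe"

def generate(prompt: str) -> str:
    return f"(model reply to: {prompt})"                  # stub model call

def guarded_chat(user_message: str) -> str:
    if score_with_prompt_guard(user_message) > 0.5:       # 1. screen for prompt attacks
        return "Request blocked: the input looks like a prompt injection or jailbreak."
    if classify_with_llama_guard("user", user_message) == "unsafe":       # 2. input policy check
        return "Request blocked: the input violates the content policy."
    reply = generate(user_message)                        # 3. generate a draft response
    if classify_with_llama_guard("assistant", reply) == "unsafe":         # 4. output policy check
        return "Response withheld: the draft reply violated the content policy."
    return reply

print(guarded_chat("Summarize the Llama 4 announcement."))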
Evaluations and red-teaming

We run systematic testing of models across a wide range of scenarios and use cases in a controlled and repeatable manner. This produces data that we incorporate back into post-training.

We stress test our models using adversarial dynamic probing across a range of topics, with both automated and manual testing. We've made advancements in understanding and evaluating potential model risk. One example of this is our new development of Generative Offensive Agent Testing (GOAT). Using GOAT, we address the limitations of traditional red-teaming by simulating multi-turn interactions of medium-skilled adversarial actors, helping us increase our testing coverage and surface vulnerabilities faster. By adding automation to our testing toolkit, GOAT has allowed our expert human red teamers to focus on more novel adversarial areas, while the automation covers known risk areas. This makes the process more efficient and effective, and it enables us to build a better quantitative and qualitative picture of risk.

Addressing bias in LLMs

It's well known that all leading LLMs have had issues with bias—specifically, they have historically leaned left when it comes to debated political and social topics. This is due to the types of training data available on the internet.

Our goal is to remove bias from our AI models and to make sure that Llama can understand and articulate both sides of a contentious issue. As part of this work, we're continuing to make Llama more responsive so that it answers questions, can respond to a variety of different viewpoints without passing judgment, and doesn't favor some views over others.

We have made improvements on these efforts with this release—Llama 4 performs significantly better than Llama 3 and is comparable to Grok:

Llama 4 refuses less on debated political and social topics overall (from 7% in Llama 3.3 to below 2%).

Llama 4 is dramatically more balanced in which prompts it refuses to respond to (the proportion of unequal response refusals is now less than 1% on a set of debated topical questions).

Our testing shows that Llama 4 responds with strong political lean at a rate comparable to Grok (and at half the rate of Llama 3.3) on a contentious set of political or social topics. While we are making progress, we know we have more work to do and will continue to drive this rate further down.

We're proud of this progress to date and remain committed to our goal of eliminating overall bias in our models.

Explore the Llama ecosystem

While it's important that models are intelligent, people also want models that can reply in a personalized way with human-like speed. As our most advanced models yet, the Llama 4 models are optimized to meet these needs.

Of course, models are one piece of the larger ecosystem that brings these experiences to life. We're focused on the full stack, which includes new product integrations. We're excited to continue the conversations we're having with our partners and the open source community, and as always, we can't wait to see the rich experiences people build in the new Llama ecosystem.

Download the Llama 4 Scout and Llama 4 Maverick models today on llama.com and Hugging Face. Try Meta AI built with Llama 4 in WhatsApp, Messenger, Instagram Direct, and on the Meta.AI website.

This work was supported by our partners across the AI community. We'd like to thank and acknowledge (in alphabetical order): Accenture, Amazon Web Services, AMD, Arm, CentML, Cerebras, Cloudflare, Databricks, Deepinfra, DeepLearning.AI, Dell, Deloitte, Fireworks AI, Google Cloud, Groq, Hugging Face, IBM Watsonx, Infosys, Intel, Kaggle, Mediatek, Microsoft Azure, Nebius, NVIDIA, ollama, Oracle Cloud, PwC, Qualcomm, Red Hat, SambaNova, Sarvam AI, Scale AI, Scaleway, Snowflake, TensorWave, Together AI, vLLM, Wipro.
China Develops Flash Memory 10,000x Faster With 400-Picosecond Speed
Posted by BeauHD on Saturday April 19, 2025 @03:00AM from the new-and-improved dept.

Longtime Slashdot reader hackingbear shares a report from Interesting Engineering:

Quote:
A research team at Fudan University in Shanghai, China has built the fastest semiconductor storage device ever reported, a nonvolatile flash memory dubbed "PoX" that programs a single bit in 400 picoseconds (0.0000000004 s) -- roughly 25 billion operations per second. Conventional static and dynamic RAM (SRAM, DRAM) write data in 1-10 nanoseconds but lose everything when power is cut, while current flash chips typically need microseconds to milliseconds per write -- far too slow for modern AI accelerators that shunt terabytes of parameters in real time.

The Fudan group, led by Prof. Zhou Peng at the State Key Laboratory of Integrated Chips and Systems, re-engineered flash physics by replacing silicon channels with two-dimensional Dirac graphene and exploiting its ballistic charge transport. Combining ultralow energy with picosecond write speeds could eliminate separate high-speed SRAM caches and remove the longstanding memory bottleneck in AI inference and training hardware, where data shuttling, not arithmetic, now dominates power budgets. The team [which is now scaling the cell architecture and pursuing array-level demonstrations] did not disclose endurance figures or fabrication yield, but the graphene channel suggests compatibility with existing 2D-material processes that global fabs are already exploring.

The result is published in the journal Nature.
China's CATL Says It Has Overtaken BYD On 5-Minute EV Charging Time
Posted by BeauHD on Monday April 21, 2025 @07:40PM from the game-changer dept.

CATL has unveiled a second-generation Shenxing battery capable of delivering a 520km range in just five minutes of charging, surpassing BYD's recent breakthrough and positioning both Chinese firms ahead of Western rivals in EV battery tech. The battery manufacturer also introduced a sodium-ion battery called Naxtra, offering up to 500km range for EVs and the potential to diversify global energy resources. The Financial Times reports:

Quote:
The claims by the Chinese battery groups would put them ahead of major western rivals. At present, Tesla vehicles can be charged up to 200 miles (321km) in added range in 15 minutes, while Germany's Mercedes-Benz recently launched its all-electric CLA compact sedan, which can be charged for up to 325km within 10 minutes using a fast-charging station. [...] The second generation of the Shenxing battery, which boasts a range of 800km on one charge, can achieve a peak charging speed of 2.5km per second, the company said at a media event ahead of this week's Shanghai auto show.

"We look forward to collaborating with more industry leaders to push the limits of supercharging through true innovation," said CATL's chief technology officer Gao Huan, adding that he wanted the new batteries to become "the standard for electric vehicles." Analysts at Bernstein said the latest progress meant that charging speeds had more than doubled in the past year and "increased tenfold over the past 3-4 years." Huan said the new Shenxing battery would be installed in more than 67 EV models this year. He later told reporters that energy density would not be sacrificed as a trade-off for fast charging.

During its tech day, CATL also unveiled its new sodium-ion battery, which it said would go into mass production in December. The battery brand, called Naxtra, is able to give a range of about 200km for a hybrid vehicle and 500km for an electric vehicle, according to Huan. [...] At the event, Huan claimed the new sodium-ion battery would enable the industry's shift from "single resource dependence" to "energy freedom" and reshape the global energy landscape. He added that he was in discussions with several companies about using sodium-ion batteries in their vehicles.
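For context, the headline figures quoted above are internally consistent and can be sanity-checked with a quick calculation (the numbers come from the article; the script itself is only an illustration):

# Quick sanity check of the charging-speed claims quoted above.
range_km = 520            # range added in one five-minute charge
charge_time_s = 5 * 60    # five minutes, in seconds

avg_speed = range_km / charge_time_s
print(f"average charging speed: {avg_speed:.2f} km of range per second")  # ~1.73 km/s

peak_speed = 2.5          # claimed peak, in km of range per second
print(f"claimed peak is {peak_speed / avg_speed:.1f}x the average")       # ~1.4x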
AI Tackles Aging COBOL Systems as Legacy Code Expertise Dwindles
Posted by msmash on Thursday April 24, 2025 @01:25PM from the fight-continues dept.

US government agencies and Fortune 500 companies are turning to AI to modernize mission-critical systems built on COBOL, a programming language dating back to the late 1950s. The US Social Security Administration plans a three-year, $1 billion AI-assisted upgrade of its legacy COBOL codebase [alternative source], according to Bloomberg.

Treasury Secretary Scott Bessent has repeatedly stressed the need to overhaul government systems running on COBOL. As experienced programmers retire, organizations face growing challenges maintaining these systems, which power everything from banking applications to pension disbursements. Engineers now use tools like ChatGPT and IBM's watsonx to interpret COBOL code, create documentation, and translate it to modern languages.