NTT Technical Review · April 2026

Resource Allocation with Heterogeneous Resources and Parallelism in Disaggregated Computing

Original abstract: Disaggregated computing improves resource utilization by pooling central processing units, memory, and accelerators and flexibly assigning heterogeneous resources to each service component. To maximize these benefits, resource allocation and routing must be decided efficiently before execution. This article introduces a practical-time resource allocation method that models heterogeneous resource characteristics and parallel processing effects. Simulations in heterogeneous disaggregated systems show that this method meets service-performance requirements while reducing required resources by 28–51% on average compared with conventional methods.

🔗 Read original article on NTT Technical Review →

Simplified Summary / Resumen Simplificado / 簡易解説

Introduction

Imagine a shared kitchen staffed by many specialist chefs: one is an expert in pastry, another in meats, another in sauces. If each chef had a private kitchen, even one that sat unused much of the time, the waste would be enormous. If instead they share one large kitchen and each takes the utensils and space needed at the moment they are needed, the result is far more efficient. That is, in essence, what disaggregated computing does: instead of every server having its own fixed processors and memory, all of these components are placed in one large shared pool and assigned dynamically according to what each task needs.

This matters especially today because artificial intelligence demands an enormous amount of computation, and traditional processors are no longer enough on their own. AI tasks rely on specialized accelerators such as GPUs (graphics cards adapted for massively parallel calculations) and FPGAs (reprogrammable chips that can be tailored to specific tasks). Disaggregated computing makes it possible to combine these resources flexibly, but it raises a key question: how do you decide, quickly and intelligently, which resource performs which task and how the resources communicate with one another?

This article answers exactly that question. The NTT researchers developed a method that automatically assigns the right resources to each part of a service, taking into account the differences between hardware types and the option of splitting work in parallel across several components. The result is impressive: the same performance can be achieved with 28% to 51% fewer resources, on average, than with earlier methods. That means lower cost, less energy, and a more sustainable infrastructure.

Key Concepts

Disaggregated computing

An approach in which a computer's components (processors, memory, accelerators) are physically separated and placed in a kind of shared pool, so that any task can take exactly the resources it needs without wasting any.

Accelerator (GPU / FPGA)

Specialized chips designed to perform certain kinds of calculations much faster than an ordinary processor. GPUs are like buses with many seats for parallel tasks; FPGAs are like digital modeling clay that can be shaped for each specific task.

Resource allocation

The process of deciding which hardware component will run which part of the work. It is like an orchestra conductor assigning each instrument to the musician best suited to each passage of the symphony.

Processing chain

A series of ordered steps that a service must follow, one after another (or in parallel). It is like a factory assembly line in which each station builds one part of the final product.

Parallel processing

Dividing a large task among several resources that work at the same time, like several cooks preparing different parts of a banquet simultaneously in order to finish sooner.

Routing

The decision about how data travels between the different hardware components. It is like planning truck routes in a distribution network so that packages arrive quickly and without traffic jams.

What to Expect in the Full Article

The technical article you are about to read describes in detail how the researchers mathematically modeled the characteristics of different hardware types and the effect of parallel processing, and how they built an algorithm that finds the best resource allocation in a reasonable time. You will find system diagrams, equations, and simulation results. Do not worry if some of the mathematical terms feel dense: the core logic is always the same as in the analogies you have just read, and the results section at the end shows concretely how much this method improves on earlier approaches.


Disclaimer (🇪🇸): This site is an independent educational outreach project. The summaries are generated by AI from NTT Technical Review articles. It is not affiliated with NTT. The aim is to make the original article easier to understand before reading it.

Introduction

Think of a modern data center as a giant shared kitchen. Instead of every chef having a private kitchen that sits empty half the time, everyone shares one big kitchen and picks up exactly the tools and counter space needed for each dish. That shared kitchen is the idea behind disaggregated computing: rather than each server owning a fixed set of processors and memory, all those components go into a common pool and are assigned dynamically to whichever task needs them. The result is far less waste and far more flexibility.

This matters more than ever because artificial intelligence has created an appetite for computation that ordinary processors can no longer satisfy alone. Modern AI workloads rely on specialized chips: GPUs (graphics processing units, which excel at running very large numbers of calculations simultaneously) and FPGAs (chips that can be reprogrammed, like digital clay, to fit a specific job). Disaggregated computing lets you mix and match these components on the fly. That flexibility comes with a thorny puzzle: how do you quickly and intelligently decide which chip handles which part of a job, and how those chips should talk to each other?

That is the problem this research team at NTT set out to solve. They built a method that automatically assigns hardware resources to every step of a service before it runs, factoring in the different strengths of each type of chip and the benefits of splitting heavy work across multiple chips running in parallel. In simulations, the method met service-performance requirements while using 28–51% fewer resources on average than conventional methods. That translates into lower costs, lower energy consumption, and a more sustainable infrastructure.

Key Concepts

Disaggregated computing

A design philosophy where the parts of a computer — processors, memory, and specialty chips — are physically separated and placed in a shared pool, so any task can borrow exactly what it needs, when it needs it, without anything sitting idle.

Accelerator (GPU / FPGA)

Specialized chips that can perform certain types of calculations far faster than a general-purpose processor. GPUs are like express buses: they carry huge numbers of simple calculations at once. FPGAs are like programmable Lego: they can be reshaped to match almost any specific job.

Resource allocation

The process of deciding which piece of hardware will handle which part of a workload. Think of it as a conductor assigning each instrument to the right musician so the symphony sounds its best.
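
As a rough illustration of the idea (not the method from the NTT article), the sketch below picks, for each step of a service, the cheapest device type that can finish the step within a time limit. The `Resource` class, the speed and cost figures, and the step names `decode` and `infer` are all hypothetical values chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cost: float               # hypothetical cost units per allocated device
    speed: dict[str, float]   # work units per second, keyed by step kind


def cheapest_fit(kind: str, work: float, deadline_s: float,
                 pool: list[Resource]) -> Resource | None:
    """Return the lowest-cost resource that finishes `work` within the deadline."""
    feasible = [
        r for r in pool
        if r.speed.get(kind, 0.0) > 0.0
        and work / r.speed[kind] <= deadline_s
    ]
    return min(feasible, key=lambda r: r.cost, default=None)


def allocate(chain, pool):
    """Assign each (kind, work, deadline_s) step independently; None if nothing fits."""
    plan = []
    for kind, work, deadline_s in chain:
        chosen = cheapest_fit(kind, work, deadline_s, pool)
        plan.append((kind, chosen.name if chosen else None))
    return plan


# Made-up pool: the FPGA cannot run "infer" at all in this toy setup.
POOL = [
    Resource("cpu", cost=1.0, speed={"decode": 50.0, "infer": 5.0}),
    Resource("gpu", cost=4.0, speed={"decode": 40.0, "infer": 200.0}),
    Resource("fpga", cost=2.5, speed={"decode": 300.0}),
]

if __name__ == "__main__":
    chain = [("decode", 100.0, 1.0), ("infer", 100.0, 1.0)]
    print(allocate(chain, POOL))
```

With these made-up numbers the decode step goes to the FPGA and the inference step to the GPU, because the cheaper CPU misses both one-second limits. A real allocator would also have to account for limited pool capacity, parallelism, and the network between devices, which this per-step greedy choice ignores.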

Processing chain (virtual function request)

A service broken into an ordered sequence of steps that must be completed one after another — like an assembly line where each station adds something to the product before passing it along.

Parallel processing

Splitting a big task across multiple chips that all work at the same time, like having several chefs each prepare a different course of a dinner simultaneously so the meal is ready faster.
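
To see why adding chips helps only up to a point, the sketch below uses Amdahl's law, a standard textbook model of parallel speedup. This is an assumption made for illustration; the summary does not say which parallelism model the NTT article uses. The function names and the 90% parallel fraction are hypothetical.

```python
from __future__ import annotations


def parallel_latency(base_latency_s: float, n_devices: int,
                     parallel_fraction: float) -> float:
    """Latency under Amdahl's law: the serial part stays, the parallel part is divided."""
    serial = 1.0 - parallel_fraction
    return base_latency_s * (serial + parallel_fraction / n_devices)


def min_devices(base_latency_s: float, target_latency_s: float,
                parallel_fraction: float, max_devices: int = 64) -> int | None:
    """Smallest device count that meets the target latency, or None if none does."""
    for n in range(1, max_devices + 1):
        if parallel_latency(base_latency_s, n, parallel_fraction) <= target_latency_s:
            return n
    return None


if __name__ == "__main__":
    # A 10 s step of which 90% can be parallelised (illustrative numbers).
    print(min_devices(10.0, 4.5, 0.9))   # a reachable target
    print(min_devices(10.0, 0.5, 0.9))   # below the serial floor of about 1 s
```

In this toy setting three devices are enough to bring the step under 4.5 seconds, while no number of devices reaches 0.5 seconds because the serial 10% alone takes about one second. An allocator that ignores this effect may either assign too many chips or miss a cheaper parallel option.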

Routing

Deciding the path that data takes as it travels between chips during processing. Just as a GPS finds the fastest route across a city, routing finds the fastest, least-congested path through the network of hardware components.
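
The GPS analogy corresponds to a shortest-path search. The sketch below uses Dijkstra's algorithm on a tiny made-up network; it is a generic illustration, not the routing procedure from the NTT article, and the node names and link latencies are hypothetical.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra's algorithm. `links` maps node -> list of (neighbour, latency)."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for nxt, latency in links.get(node, []):
            nd = d + latency
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return None, float("inf")
    # Walk back from the destination to rebuild the route.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]


# Made-up topology: two switches between a CPU and a GPU.
LINKS = {
    "cpu": [("switch_a", 1.0), ("switch_b", 4.0)],
    "switch_a": [("gpu", 5.0), ("switch_b", 1.0)],
    "switch_b": [("gpu", 1.0)],
    "gpu": [],
}

if __name__ == "__main__":
    print(shortest_path(LINKS, "cpu", "gpu"))
```

Here the lowest-latency route runs through both switches (total latency 3.0) rather than over either more direct option. This sketch treats link latency as fixed; accounting for congestion, as the analogy suggests, would require weights that change with the traffic already assigned to each link.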

What to Expect in the Full Article

The full technical article presents the mathematical framework the researchers built to model how different hardware types behave and how parallel processing affects performance. You will encounter system diagrams, formal equations, and simulation results. Do not be discouraged by the formulas: the underlying logic is the same as in the analogies above. Pay special attention to the simulation results section, where the 28–51% average resource savings are backed by concrete numbers, and to the discussion of how the method finds good solutions fast enough to be practical in real deployments.


Disclaimer (🇺🇸): This site is an independent educational project. Summaries are AI-generated from NTT Technical Review articles. Not affiliated with NTT. The goal is to aid understanding before reading the original article.

Introduction

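Picture a large shared kitchen. If every chef has a dedicated kitchen, the equipment goes to waste whenever it is not in use. If everyone instead shares one large kitchen and can take out the tools and space they need when they need them, things run far more efficiently. This is the basic idea behind disaggregated computing. Rather than giving each server a fixed set of processors and memory, all the parts are kept in a common pool and allocated dynamically, in just the amounts needed, to the tasks that need them. As a result, waste drops sharply and flexibility rises dramatically.

This approach is especially important now because artificial intelligence (AI) has grown explosively and demands far more computing power than ordinary processors alone can supply. Modern AI workloads depend on accelerators such as GPUs (graphics processing units: specialized chips that excel at massive parallel computation) and FPGAs (chips that can be rewritten for a particular purpose, like digital clay). Disaggregated computing makes it possible to combine these powerful chips flexibly, but that raises a difficult question: how can we decide, quickly and wisely, which chip handles which processing step and how the chips should communicate with each other?

This paper is an answer to exactly that question. The NTT research team developed a method that automatically allocates the most suitable hardware resources to each step of a service. By accounting for the differing characteristics of different chips and for the effect of running processing in parallel across multiple chips, they showed that the same service performance can be achieved with 28–51% fewer resources than conventional methods. This leads directly to lower costs and energy savings, and is a major step toward more sustainable infrastructure.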

Key Concepts

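Disaggregated computing

A design philosophy in which computer parts such as processors, memory, and accelerators are physically separated and gathered into a common pool. Because each task can borrow just the resources it needs when it needs them, resources no longer sit idle.

Accelerator (GPU / FPGA)

Dedicated chips that handle particular computations far faster than a general-purpose processor. A GPU is like a large bus that processes huge numbers of calculations concurrently. An FPGA is like digital clay whose circuitry can be rewritten to suit a particular purpose.

Resource allocation

The process of deciding which hardware handles which processing step. Just as an orchestra conductor assigns each instrument to the most suitable player, each task is dispatched to the chip best suited to it.

Processing chain (virtual function request)

A service divided into multiple steps that must be executed in order. As on a factory assembly line, each station applies its processing and hands the result on to the next.

Parallel processing

Splitting a large task across multiple chips and having them process it at the same time. Picture several chefs finishing the individual dishes of a multi-course meal simultaneously so that the meal can be served sooner.

Routing

Deciding which path data takes as it moves between chips during processing. Just as a car navigation system picks the fastest, least congested road through a city, data also needs to travel along the best path.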

What to Expect in the Full Article

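The technical paper you are about to read explains in detail a framework that mathematically models the characteristics of different kinds of hardware and the effect of parallel processing. System diagrams, formulas, and simulation results appear throughout; if they feel difficult, return to the analogies you read here. Pay particular attention to the simulation results section, where the 28–51% resource reduction is shown in concrete numbers, and to the part showing that solutions can be derived within a practical time even in realistic environments. Reading these should make it clearer why this research matters in practice.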


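Disclaimer (🇯🇵): This site is an independent project with an educational purpose. The summaries are generated by AI from articles in NTT Technical Journal (NTT技術ジャーナル). It is not affiliated with NTT. The aim is to aid understanding before reading the original article.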
