Dedicated to the 100th anniversary of Lev Korolev,
patriarch of system programming in the USSR
Springer technical sponsorship is approved!
The metric approach as an element of artificial intelligence for scheduling
problems
For the problem 1|r_j|L_{max}, one of the fundamental problems of scheduling theory and NP-hard in the strong sense, it is shown that all solvable cases are expressed using only two matrices (the identity matrix and the Jordan matrix).
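For context, the objective in 1|r_j|L_{max} is the maximum lateness on a single machine with release dates: each job j has a release date r_j, processing time p_j, and due date d_j, and L_max = max_j (C_j - d_j). A minimal sketch of evaluating this objective for a given job order (the function name and variable layout are illustrative assumptions, not from the abstract):

```python
def l_max(order, r, p, d):
    """Maximum lateness L_max = max_j (C_j - d_j) of a single-machine
    schedule given as a job order, respecting release dates r_j."""
    t = 0
    worst = float("-inf")
    for j in order:
        t = max(t, r[j]) + p[j]      # start no earlier than release date r_j
        worst = max(worst, t - d[j])  # lateness of job j
    return worst

r = [0, 1, 3]
p = [2, 2, 1]
d = [3, 5, 6]
print(l_max([0, 1, 2], r, p, d))  # -1: every job finishes before its due date
```

Finding the order that minimizes this value is what makes the problem NP-hard in the strong sense.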
In fact, our goal is to obtain a mathematical justification for artificial intelligence methods. Previously acquired knowledge about the problem and the algorithms for its solvable subcases must be used as effectively as possible. In addition, the error of the resulting approximate solution must be estimated. The metric approach allows us to bound the absolute error of the optimal value of the objective function.
Schematically, this can be represented as follows. We have a current instance of the problem (point A in a multidimensional space). For example, for single-machine scheduling problems, this is a point in a 3n-dimensional space, where n is the number of jobs. We know the polynomially solvable subcases of the problem; they are always bounded by some system of linear constraints. We find the projection of the initial point A, in our metric, onto the solvable subcase by solving a linear programming problem. As a result, we obtain a point B at which an approximate solution can be found in polynomial time, with a minimal upper bound on the absolute difference of the objective function values at points A and B.
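The projection step can be sketched on a concrete special case. Suppose the solvable subcase is the equal-processing-times instance set {p_1 = ... = p_n} and the metric is the Chebyshev (L_inf) distance; then the projection has a closed form, the common value c = (min p + max p) / 2. Both the subcase choice and the metric here are simplifying assumptions for illustration, not the specific constructions of the talk (in general the projection requires solving a linear program):

```python
def project_equal_p(r, p, d):
    """Project an instance (r, p, d), viewed as a point A in 3n-dimensional
    space, onto the subcase with equal processing times, minimizing the
    L_inf distance. Returns the projected instance B and that distance."""
    c = (min(p) + max(p)) / 2              # minimizes max_j |p_j - c|
    p_proj = [c] * len(p)
    dist = max(abs(pj - c) for pj in p)    # L_inf distance between A and B
    return (r, p_proj, d), dist

(rB, pB, dB), dist = project_equal_p([0, 1, 3], [2, 5, 1], [3, 5, 6])
print(pB, dist)  # common processing time 3.0, distance 2.0
```

The instance B is then solved by the polynomial algorithm for the subcase, and the distance between A and B enters the upper bound on the absolute error of the objective value.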
FLOPs Are Abundant, But Bandwidth Is Not: Rethinking Data Movement in 100K GPU Clusters
Maxim Shevtsov is a Performance Optimization Expert who originally specialized in graphics, computer vision, and heterogeneous computing. His current focus is high-throughput LLM inference and enterprise AI workloads, maximizing hardware utilization across NPUs for performance-critical AI deployments. He is a Senior Expert at Huawei, responsible for the inference-server direction of the entire Russia Research Center.
Large Language Models Inference at Scale: Scheduling Challenges
Today, LLMs grow rapidly in size, often requiring tens or hundreds of GPUs, which forces careful model sharding and brings associated distributed-systems challenges such as communication bottlenecks.
Dynamic input/output lengths and optimizations like MoE's conditional execution create severe device underutilization, while thousands of small operations (e.g., attention heads) may drown in kernel scheduling overhead. At the same time, hardware vendors offer scale-up compositions of up to tens of servers, delivering hundreds of PFLOPs of compute, several terabytes of on-chip memory in total, and terabytes per second of memory bandwidth.
Today these two trends overlap, and new challenges emerge, requiring fine-grained routing, synchronization, and load balancing across hundreds of devices.
This talk describes the new challenges that production systems face and the associated shifts in engineering paradigms.
Topological methods for traffic analysis in computer networks
One promising area of network traffic analysis is the use of artificial neural networks based on the Kolmogorov-Arnold theorem in combination with wavelet analysis and topological data analysis. This approach is driven by the fact that modern neural network architectures, such as Kolmogorov-Arnold Networks (KANs), demonstrate significant potential for modeling complex nonlinear dependencies while maintaining high interpretability compared to traditional MLP networks. This opens new horizons in the construction of interpretable and effective models for network traffic analysis.
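The core idea behind KANs can be shown in a few lines: instead of fixed activations on nodes, each edge carries a learnable univariate function, and node outputs are plain sums. A minimal sketch, using a piecewise-linear parameterization of the edge functions as a simplifying assumption (real KANs typically use B-splines, and all names here are illustrative):

```python
def pw_linear(grid, values, x):
    """Evaluate a univariate piecewise-linear function given by its values
    on a sorted grid, clamped to the end values outside the grid."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    for i in range(len(grid) - 1):
        if grid[i] <= x <= grid[i + 1]:
            t = (x - grid[i]) / (grid[i + 1] - grid[i])
            return (1 - t) * values[i] + t * values[i + 1]

def kan_layer(edge_funcs, x):
    """One KAN layer: output_k = sum_i phi_{k,i}(x_i), where each
    phi_{k,i} is a (grid, values) pair defining an edge function."""
    return [sum(pw_linear(g, v, xi) for (g, v), xi in zip(row, x))
            for row in edge_funcs]

# Two inputs -> one output; both edge functions interpolate |x| on the grid
grid = [-1.0, 0.0, 1.0]
edges = [[(grid, [1.0, 0.0, 1.0]),
          (grid, [1.0, 0.0, 1.0])]]
print(kan_layer(edges, [0.5, -1.0]))  # [1.5]
```

Because each learned edge function is a one-dimensional curve that can be plotted and inspected, this construction is what gives KANs the interpretability advantage over MLPs mentioned above.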