1

The Comparative Effectiveness of After-Action Review in Co-located and Distributed Team Training Environments

Jarrett, Steven (August 2012)
The team-training literature provides favorable support for the after-action review's (AAR's) ability to improve cognitive, skill, and attitudinal outcomes in co-located and distributed environments. However, the comparative effectiveness of co-located and distributed AARs is unknown. Thus, the objective of the present study was to investigate the comparative effectiveness of co-located and distributed AARs. The present study examined the AAR's effect on performance, declarative knowledge, team-efficacy, team voice, team cohesion, and team-level reactions. Data were obtained from 492 undergraduate students (47.66% female) assigned to 123 four-person teams who participated in a team-training protocol using a 3 (type of AAR review: non-AAR versus subjective AAR versus objective AAR) x 2 (geographic dispersion: co-located versus distributed training environments) x 3 (sessions) repeated-measures design. The results indicate that AAR teams had significantly higher performance scores than non-AAR teams. In addition, the AAR teams reported higher perceptions of team-efficacy and higher levels of team cohesion than the non-AAR teams. With the exception of team-level reactions, there were no other significant differences between the distributed AAR and co-located AAR conditions. Similarly, there were no significant differences across any of the outcome variables between the objective and subjective AAR conditions, indicating that the type of AAR did not affect the results of the training. The findings of the present study highlight several practical and scientific implications that should be considered regarding AAR training. Primarily, regardless of the training environment or type of AAR, AAR training remains an effective intervention for increasing performance and attitudinal outcomes. In addition, the results suggest that the use of distributed AARs does not engender the process losses that had been hypothesized. Thus, the use of this training to reduce administrative costs may be a viable option for geographically dispersed organizations. Finally, practitioners should evaluate the extent to which adding technology to allow for a more objective performance review provides the intended benefit to trainees, as the empirical research has consistently demonstrated that objective review systems provide little to no benefit. Future research is needed to determine the generalizability of these findings to other tasks, domains, team types, and levels of expertise.
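To make the 3 x 2 x 3 mixed design concrete, the sketch below lays out hypothetical team-level data (one row per team per session) and fits a mixed linear model with a random intercept per team to account for the repeated sessions. All names, cell sizes, and simulated effects are illustrative assumptions, not the study's actual data or analysis.

```python
# Illustrative sketch of a 3 (AAR type) x 2 (dispersion) x 3 (session) mixed design.
# Simulated data only; the real study's variables and analysis may differ.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows, team_id = [], 0
for aar in ["none", "subjective", "objective"]:
    for disp in ["colocated", "distributed"]:
        for _ in range(20):              # ~20 teams per between-subjects cell (assumed)
            team_id += 1
            for session in (1, 2, 3):    # repeated measurements per team
                perf = (50
                        + (5 if aar != "none" else 0)  # assumed AAR benefit
                        + 2 * session                  # assumed practice effect
                        + rng.normal(0, 5))
                rows.append((team_id, aar, disp, session, perf))

df = pd.DataFrame(rows, columns=["team", "aar", "dispersion", "session", "performance"])

# Random intercept per team handles the within-team (session) dependence.
model = smf.mixedlm("performance ~ C(aar) * C(dispersion) * C(session)",
                    df, groups=df["team"])
print(model.fit().summary())
```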
2

Tackling the Communication Bottlenecks of Distributed Deep Learning Training Workloads

Ho, Chen-Yu
Deep Neural Networks (DNNs) find widespread application across domains including computer vision, recommendation systems, and natural language processing. Despite their versatility, training DNNs can be time-consuming, and accommodating large models and datasets on a single machine is often impractical. To tackle these challenges, distributed deep learning (DDL) training workloads have gained increasing significance. However, DDL training introduces synchronization requirements among nodes, and the mini-batch stochastic gradient descent algorithm places a heavy burden on network connections. This dissertation proposes, analyzes, and evaluates three solutions addressing the communication bottleneck in DDL training workloads. The first solution, SwitchML, introduces an in-network aggregation (INA) primitive that accelerates DDL workloads. By aggregating model updates from multiple workers within the network, SwitchML reduces the volume of exchanged data. This approach, which co-designs switch processing with end-host protocols and deep learning frameworks, speeds up training by up to 5.5 times for real-world benchmark models. The second solution, OmniReduce, is an efficient streaming aggregation system designed for sparse collective communication. It optimizes performance for parallel computing applications such as distributed training of large-scale recommendation systems and natural language processing models. OmniReduce achieves maximum effective bandwidth utilization by transmitting only nonzero data blocks and leveraging fine-grained parallelization and pipelining. It outperforms state-of-the-art TCP/IP and RDMA network solutions by 3.5 to 16 times, delivering significantly better performance for network-bottlenecked DNNs even at 100 Gbps. The third solution, CoInNetFlow, addresses congestion in shared data centers, where multiple DNN training jobs compete for bandwidth on the same nodes. The study explores the feasibility of coflow scheduling methods in hierarchical and multi-tenant in-network aggregation communication patterns, and CoInNetFlow makes innovative use of the Sincronia priority assignment algorithm. Through packet-level simulation of DDL jobs, the research demonstrates that appropriate weighting functions, transport-layer priority scheduling, and gradient compression on low-priority tensors can improve the median Job Completion Time Inflation by over 70%. Collectively, this dissertation contributes to mitigating the network communication bottleneck in distributed deep learning. The proposed solutions can enhance the efficiency and speed of distributed deep learning systems, ultimately improving DNN training across various domains.
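The core idea behind sparse collective communication of the OmniReduce kind, transmitting only nonzero gradient blocks, can be illustrated with a minimal in-memory sketch. The block size, worker count, and the simulated "network" below are assumptions for illustration; the actual system operates over RDMA/TCP with fine-grained pipelining.

```python
# Conceptual sketch: workers split gradients into fixed-size blocks and only
# "send" blocks containing nonzero values; the aggregator sums blocks by index.
# This is not the OmniReduce implementation, just the sparse-block idea.
import numpy as np

BLOCK = 4  # elements per block (illustrative)

def nonzero_blocks(grad):
    """Yield (block_index, block) for every block with at least one nonzero value."""
    for i in range(0, len(grad), BLOCK):
        block = grad[i:i + BLOCK]
        if np.any(block != 0):
            yield i // BLOCK, block

def aggregate(worker_grads):
    """Sum the workers' gradients while exchanging only nonzero blocks."""
    length = len(worker_grads[0])
    total = np.zeros(length)
    sent = 0
    for grad in worker_grads:
        for idx, block in nonzero_blocks(grad):
            total[idx * BLOCK: idx * BLOCK + len(block)] += block
            sent += 1
    dense = len(worker_grads) * (length // BLOCK)
    print(f"sent {sent}/{dense} blocks")   # savings grow with gradient sparsity
    return total

# Two workers with mostly-zero gradients (e.g., sparse embedding updates).
g1 = np.zeros(16); g1[0:2] = 1.0
g2 = np.zeros(16); g2[8:10] = 2.0
print(aggregate([g1, g2]))
```

For dense gradients every block is nonzero and nothing is saved, which is why this style of aggregation targets sparse workloads such as recommendation and NLP embedding updates.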
