371
Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems / Kambadur, Melanie Rae. January 2016
Computer systems have become increasingly diverse and specialized in recent years. This complexity supports a wide range of new computing uses and users, but it is not without cost: it has become difficult to maintain the efficiency of contemporary general-purpose computing systems. Computing inefficiencies, which include non-optimal runtimes, excessive energy use, and limits to scalability, are a serious problem that can result in an inability to apply computing to solve the world's most important problems. Beyond the complexity and vast diversity of modern computing platforms and applications, a number of factors make improving general-purpose efficiency challenging: multiple levels of the computer system stack must be examined, legacy hardware and software may stand in the way of achieving efficiency, and efficiency must be balanced against reusability, programmability, security, and other goals.
This dissertation presents five case studies, each demonstrating different ways in which the measurement of emerging systems can provide actionable advice to help keep general-purpose computing efficient. The first of the five case studies is Parallel Block Vectors, a new profiling method that provides a fine-grained, code-centric view of parallel programs, aiding both future hardware design and the optimization of software to map better onto existing hardware. Second is a project that defines a new way of measuring application interference on a datacenter's worth of chip multiprocessors, leading to improved scheduling in which applications can more effectively utilize available hardware resources. Next is a project that uses the GT-Pin tool to define a method for accelerating the simulation of GPGPUs, ultimately allowing for the development of future hardware with fewer inefficiencies. The fourth project is an experimental energy survey that compares and combines the latest energy efficiency solutions at different levels of the stack, to properly evaluate the state of the art and to find paths forward for future energy efficiency research. The final project is NRG-Loops, a language extension that allows programs to measure and intelligently adapt their own power and energy use.
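NRG-Loops itself is defined in the dissertation; purely as a hypothetical illustration of the general idea (a loop that measures and adapts its own power use), a plain-Python sketch might look like the following. The RAPL sysfs counter path is an assumption (it exists on many Intel Linux systems), and the power budget and back-off policy are invented for the example.

```python
import time

# Hypothetical sketch of an NRG-Loops-style adaptive loop (not the dissertation's
# actual language extension): the loop reads a Linux RAPL energy counter before
# and after each iteration and backs off when a power budget is exceeded.
# The budget and the back-off policy are invented for illustration.

RAPL_COUNTER = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package energy, microjoules

def read_energy_uj():
    with open(RAPL_COUNTER) as f:
        return int(f.read())

def process(item):
    sum(i * i for i in range(10_000))            # stand-in for real per-item work

def adaptive_loop(items, power_budget_watts=20.0):
    for item in items:
        e0, t0 = read_energy_uj(), time.time()
        process(item)
        e1, t1 = read_energy_uj(), time.time()
        watts = (e1 - e0) / 1e6 / max(t1 - t0, 1e-9)
        if watts > power_budget_watts:           # over budget: yield the CPU briefly
            time.sleep(0.01)

if __name__ == "__main__":
    adaptive_loop(range(100))
```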
372
Smart Memory: An Inexact Content-Addressable Memory / Lee, Jack. 12 February 1993
The function of a Content-Addressable Memory (CAM) is to search the information stored in the memory efficiently, using hardware rather than software, with a corresponding improvement in search speed. This hardware performs a parallel search by matching the data stored in memory against a search key, rather than searching sequentially, address by address, as is done in a Random Access Memory (RAM). Although existing CAMs find relevant information more efficiently than RAM, further improvements are possible. For example, previous CAMs use a word-parallel searching scheme that can only identify exact matches; to find the best (closest) match, they had to fall back on bit-serial approaches. Although still more efficient than RAM searching, these CAMs were limited by the word size (bit width) of the memory. Responding to this inefficiency, the CAM described in this thesis improves best-fit searching by combining analog and digital design. The design uses a mismatch line to collect the result of comparing each bit of a word, and this result is decoded by a simple flash A/D converter. This means that after a single operation the best fit, plus all words with zero to three bits of mismatch, is determined. This word/bit-parallel searching makes the CAM more efficient than existing CAMs. The best-fit function is well suited to database retrieval, communications, and error-correction circuitry. By combining high-speed searching with the inexact-match feature, the CAM also provides efficient sorting and set operations, and its accumulated search time is shorter than that of a regular CAM or RAM. The inexact CAM in this thesis is designed using mixed analog/digital design in a 2-micron CMOS technology.
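As a behavioral illustration only, and not the circuit described in the thesis, the search semantics above (a single operation that reports the best match along with every stored word within three mismatched bits of the key) can be modeled in software roughly as follows; the word width and stored contents are made up for the example.

```python
# Rough software model of the inexact CAM's search semantics: for a search key,
# report every stored word within 3 mismatched bits, plus the best (closest) match.
# In the actual design this happens in parallel in hardware; this model is sequential.

def mismatch_bits(a: int, b: int) -> int:
    return bin(a ^ b).count("1")  # Hamming distance between two words

def inexact_search(memory, key, threshold=3):
    distances = {addr: mismatch_bits(word, key) for addr, word in enumerate(memory)}
    hits = {addr: d for addr, d in distances.items() if d <= threshold}
    best = min(hits, key=hits.get) if hits else None
    return best, hits

if __name__ == "__main__":
    memory = [0b10110010, 0b10110011, 0b01001100, 0b10100010]  # example contents
    best, hits = inexact_search(memory, key=0b10110010)
    print("best match at address", best, "near matches:", hits)
```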
373
MatRISC : a RISC multiprocessor for matrix applications / Beaumont-Smith, Andrew James. January 2001
"November, 2001" / Errata on back page. / Includes bibliographical references (p. 179-183) / xxii, 193 p. : ill. (some col.), plates (col.) ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / This thesis proposes a highly integrated SOC (system on a chip) matrix-based parallel processor which can be used as a co-processor when integrated into the on-chip cache memory of a microprocessor in a workstation environment. / Thesis (Ph.D.)--University of Adelaide, Dept. of Electrical and Electronic Engineering, 2002
374
A usability comparison of PDA-based quizzes and paper-and-pencil quizzes / Segall, Noa. 17 July 2003
In the last few years, many schools and universities have incorporated personal digital assistants (PDAs) into their teaching curricula in an attempt to enhance students' learning experience and reduce instructors' workload. One of the most common uses of PDAs in the classroom is as a test administrator. This study compared the usability (effectiveness, efficiency, and satisfaction) of a PDA-based quiz application to that of standard paper-and-pencil quizzes in a university course, in order to determine whether it is advisable to invest time and money in PDA-based testing. The effects of computer anxiety, age, gender, and ethnicity on usability were also evaluated, to ascertain that these factors do not discriminate against individuals taking PDA-based tests.
Five quizzes were administered to students participating in an introductory engineering course. Of these, students took two PDA-based quizzes and three paper-and-pencil quizzes. One PDA-based quiz and one paper-and-pencil quiz were compared in terms of their effectiveness, measured through students' quiz scores and a mental workload questionnaire; their efficiency, measured as the time it took students to complete each quiz; and their satisfaction, evaluated using a subjective user satisfaction questionnaire. Computer anxiety was also measured, using an additional questionnaire.
It was hypothesized that the PDA-based quiz would be more effective and efficient than the paper-and-pencil quiz and that students' satisfaction with the PDA-based quiz would be greater. The study showed the PDA-based quiz to be more efficient; that is, students completed it in less time than they needed to complete the paper-and-pencil quiz. No differences in effectiveness or satisfaction were found between the two quiz types.
It was also hypothesized that for PDA-based quizzes, effectiveness and satisfaction would decrease as computer anxiety increased, while for paper-and-pencil quizzes there would be no relationship between computer anxiety and either effectiveness or satisfaction. Findings showed an increase in quiz score (an increase in effectiveness) and an increase in mental workload (a decrease in effectiveness) as computer anxiety increased, for both quiz types. No relationship was found between computer anxiety and satisfaction for either paper-and-pencil or PDA-based quizzes.
The final hypothesis was that user satisfaction would be positively correlated with effectiveness (quiz score and mental workload) for both PDA-based and paper-and-pencil quizzes. No relationship was found between quiz score and satisfaction for either quiz type. User satisfaction was positively correlated with mental workload, regardless of quiz type.
The usability comparison of paper-and-pencil and PDA-based quizzes found the latter to be equal, if not superior, to the former. The effort students put into taking the quiz was the same regardless of administration method, and scores were not affected. In addition, different demographic groups performed almost equally well in both quiz types (white students' PDA-based quiz scores were slightly lower than those of the other ethnic groups). Computer anxiety was not affected by quiz type. For these reasons, as well as other advantages to both students (e.g., real-time scoring) and teachers (e.g., less time spent on grading), PDAs are an attractive test administration option for schools and universities. / Graduation date: 2004
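The abstract does not name the statistical tests used; purely as an illustration of the kind of analysis it describes (comparing completion times across quiz formats and correlating computer anxiety with quiz score), and with entirely made-up numbers, one might compute:

```python
# Illustrative only: fabricated numbers standing in for the study's data, and the
# thesis's actual statistical tests may differ from the ones shown here.
from scipy.stats import ttest_rel, pearsonr

pda_minutes   = [11.2, 9.8, 12.5, 10.1, 8.9, 11.0]    # quiz completion times (PDA)
paper_minutes = [14.0, 12.7, 15.1, 13.2, 12.0, 14.4]  # same students, paper quiz

anxiety = [20, 35, 28, 42, 31, 25]    # computer anxiety questionnaire scores
scores  = [88, 92, 85, 95, 90, 87]    # PDA-based quiz scores

t_stat, p_time = ttest_rel(pda_minutes, paper_minutes)  # efficiency comparison
r, p_corr = pearsonr(anxiety, scores)                   # anxiety vs. effectiveness

print(f"completion time: t = {t_stat:.2f}, p = {p_time:.3f}")
print(f"anxiety vs. score: r = {r:.2f}, p = {p_corr:.3f}")
```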
375
Memory optimization for a parallel sorting hardware architecture / Beyer, Dale A. 22 May 1997
Sorting is one of the more computationally intensive tasks a computer performs. One of the most effective ways to speed up sorting is to use parallel algorithms. When implementing a parallel algorithm, the designer has to make several decisions, among them the choice of algorithm and its physical implementation. A dedicated hardware solution is often faster than a software solution.
In this thesis, I investigate the optimization of a hardware implementation of max-min sort. I propose an optimization to the data structures used in the algorithm; the new data structure allows quicker sorting by changing the basic workings of the max-min sort. The results are presented by comparing the new data structure with the original one. The thesis also discusses the design and performance issues involved in implementing the algorithm in hardware. / Graduation date: 1998
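The thesis's hardware data structures are not detailed in this abstract; as a software-level sketch of the underlying max-min sort idea it refers to (assumed here to be a double-ended selection sort, where each pass finds both the minimum and maximum of the unsorted region and places them at the two ends), one might write:

```python
# Software sketch of max-min (double-ended selection) sort: each pass scans the
# unsorted region once, finding its minimum and maximum, and swaps them into
# place at the two ends. The thesis's hardware implementation and optimized
# data structures are not shown here.

def max_min_sort(a: list) -> list:
    a = list(a)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        i_min, i_max = lo, lo
        for i in range(lo, hi + 1):           # one scan finds both extremes
            if a[i] < a[i_min]:
                i_min = i
            if a[i] > a[i_max]:
                i_max = i
        a[lo], a[i_min] = a[i_min], a[lo]     # place the minimum at the left end
        if i_max == lo:                       # the maximum may have just been moved
            i_max = i_min
        a[hi], a[i_max] = a[i_max], a[hi]     # place the maximum at the right end
        lo += 1
        hi -= 1
    return a

if __name__ == "__main__":
    print(max_min_sort([5, 3, 8, 1, 9, 2, 7]))
```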
376
Data decomposition and load balancing for networked data-parallel processing / Crandall, Phyllis E. 19 April 1994
Graduation date: 1994
377
Allocation of SISAL program graphs to processors using BLAS / Raisinghani, Manoj H. 07 April 1994
There are a number of well-known techniques for extracting parallelism from a given program, ranging from hardware implementations to restructuring compilers to reorganizing programs so as to specify all the available parallelism. The success rate of any of the known techniques is rather poor over all types of programs. This has pushed the research community to explore new languages and design different architectures to exploit program parallelism.
Dataflow architectures address the problem of exploiting parallelism by executing dataflow graphs. These graphs, or programs, represent data dependencies among instructions, and execution of the graph proceeds in a data-driven manner. That is, an instruction is executed as soon as all its operands are available, without waiting for a program counter to sequence its execution, as is the case in conventional von Neumann architectures.
In this thesis, dataflow graphs are generated during the intermediate compilation of a functional language called SISAL (Streams and Iterations in a Single Assignment Language). The Intermediate Form (IF1) is a graphical language consisting of multiple acyclic function graphs that represent a given program. Each graph consists of a sequence of nodes and edges: the nodes specify operations and the edges indicate dependencies between the nodes. The graphs are further connected to each other by means of implicit dependencies.
The Automator package developed in this project preprocesses these multiple IF1 graphs and translates them into a single connected graph, converting all implicit dependencies into actual ones. Additionally, complex language constructs such as ForAll, loops, and if-then-else, together with their nested levels, are treated in special ways by the Automator. There is virtually no limit to the number of nested levels the package can translate.
The Automator's prime contribution is in translating real programs written in SISAL into the format required by an allocation algorithm called the Balanced Layered Allocation Scheme (BLAS).
BLAS partitions a connected graph into independent tasks and assigns them to processors in a multicomputer system. The problem of program allocation lies in maximizing parallelism while minimizing interprocessor communication costs; hence, allocation is based on the best choice of communication-to-execution ratio for each task. BLAS uses heuristic rules to find a balance between computation and communication costs in the target system. Here the target architecture is a simulated nCUBE 3E computer with a hypercube topology.
Simulations show that BLAS is effective in reducing the overall execution time of a program by taking communication costs into account alongside execution times. The results will help in understanding the effects of packing nodes (grain packing), routing issues in the network, and, in general, the problem of allocation to any processor in a network. In addition, tasks have also been assigned to adjacent processors only, instead of to any processor on the hypercube network; adjacent allocation helps determine the trade-offs between the achieved speed-ups and the time it takes to completely allocate large graphs at compilation. / Graduation date: 1994
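BLAS itself is specified in the thesis; the following is only a toy sketch of the general idea described above, namely ranking a dataflow graph's tasks by their communication-to-execution ratio and placing heavily communicating tasks on the same processor. The graph, the cost numbers, and the placement rule are all invented for illustration.

```python
# Toy sketch of communication-aware task allocation in the spirit of the scheme
# described above (not the actual BLAS algorithm). Tasks whose heaviest edge
# costs more than their own execution time are co-located with an already-placed
# partner; the rest are spread round-robin across processors.

exec_cost = {"A": 4, "B": 6, "C": 2, "D": 5}                 # invented execution costs
comm_cost = {("A", "B"): 8, ("A", "C"): 1, ("B", "D"): 7}    # invented communication costs

def allocate(exec_cost, comm_cost, num_procs):
    def comm_ratio(t):
        # Total communication touching task t, relative to its execution cost.
        return sum(c for e, c in comm_cost.items() if t in e) / exec_cost[t]

    placement, next_proc = {}, 0
    for t in sorted(exec_cost, key=comm_ratio, reverse=True):
        edges = [(c, e) for e, c in comm_cost.items() if t in e]
        best = max(edges, default=None)
        if best and best[0] > exec_cost[t]:
            u, v = best[1]
            other = v if u == t else u
            if other in placement:               # co-locate with the heavy partner
                placement[t] = placement[other]
                continue
        placement[t] = next_proc % num_procs     # otherwise spread tasks out
        next_proc += 1
    return placement

print(allocate(exec_cost, comm_cost, 2))
```

A real allocator would also balance per-processor load layer by layer of the graph, which this sketch ignores entirely.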
378
Non-blocking synchronization and system design / Greenwald, Michael Barry. 1999
Thesis (Ph.D.)--Stanford University, 1999. / Title from PDF t.p. (viewed May 9, 2002). "August 1999."
379
Gender and socioeconomic influences on attitudes of fifth-grade students in a mid-size school district toward computers / Lewis-Brown, Shirley Dean. January 2007
Thesis (Ed.D.)--University of West Florida, 2007. / Title from title page of source document. Document formatted into pages; contains 142 pages. Includes bibliographical references.
380
Low power adiabatic circuits and power clocks for driving adiabatic circuits / Suram, Ragini. January 2003
Thesis (M.S.)--University of Missouri-Columbia, 2003. / Typescript. Includes bibliographical references (leaves 132-133). Also available on the Internet.