1 |
Analysis of Remarks Using Clustering and Keyword Extraction: Clustering Remarks on Electrical Installations and Identifying the Clusters by Extracting Keywords / Analys av anmärkningar med hjälp av klustring och extrahering av nyckelord: Klustring av anmärkningar på elektriska installationer och identifiering av klustren med hjälp av extrahering av nyckelord
Stiff, Philip. January 2018
Nowadays it is common for companies to gather and sit on large amounts of data related to their business. This data is often too large to be analyzed by hand, and it is therefore becoming more and more common to automate the analysis, e.g. by running machine learning methods on the data. In this project we attempt to analyze an unstructured dataset consisting of remarks, found by inspectors, on electrical installations. This is done by first clustering the dataset, with the goal of having each cluster represent a specific type of error found in the field, and then extracting ten keywords from each cluster. We investigate whether these keywords can be used to represent the clusters' contents in a way that could be useful for a future end-user application. The solution developed in this project was evaluated by constructing a form in which the respondents were shown example remarks from a random subset of clusters and evaluated both how well the extracted keywords matched the examples and to what degree the example remarks from the same cluster represented the same kind of error. We received 22 responses in total, from 8 professional inspectors and 14 laymen. Our results show that the extracted keywords make sense in connection with the example remarks from the form and that the keywords show promise in describing the content of a cluster. Also, for a majority of the clusters a clear consensus can be seen among the respondents on which keywords they considered relevant. However, the average number of keywords that the respondents considered relevant for each remark (1.40) was deemed too low for us to be able to recommend the solution. Additionally, the clustering quality follows the same pattern, showing promise but not quite giving satisfactory results in this study.
For future work, a larger study should be conducted in which several combinations of clustering and keyword extraction methods are evaluated more thoroughly, so that more decisive conclusions can be drawn. / Nowadays it is common for companies to collect and sit on large amounts of data related to their business. This data is often too large to be analyzed by hand. It has therefore become increasingly common to automate the analysis by running machine learning methods on the data. In this project, a dataset consisting of free-text records containing remarks on electrical installations is analyzed. This is done by first clustering the data, with the goal that each cluster should represent a particular type of remark from the field, and then extracting 10 keywords from each cluster. We then investigate to what degree these keywords can be said to represent the clusters' contents in a way that would be useful for an end-user application. The solution developed in the project was evaluated through a survey in which the respondents were shown example remarks from a number of randomly selected clusters and were then asked to assess how well the keywords fit the examples and to what degree the examples from the same cluster represented the same type of remark. In total we received responses from 22 people, namely 8 inspection engineers and 14 laymen. The results show that the extracted keywords had a natural connection to the respective remarks from the survey and that they have the potential to explain the contents of the clusters. For a majority of the clusters we could also see clear agreement among the respondents about which specific keywords were considered relevant. However, the average number of keywords considered relevant for an example remark (1.40) was too low for us to be able to recommend the evaluated solution.
Similarly, our results show that the clustering of the data was promising but not fully satisfactory. In future work, a larger study should be carried out in which several combinations of clustering and keyword extraction methods are compared more thoroughly, so that firmer conclusions can be drawn.
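The keyword step described in this abstract (ten keywords per cluster) can be illustrated with a minimal sketch. The abstract does not say which clustering or keyword extraction methods the thesis used, so everything here is an assumption: the function name `top_keywords`, the TF-IDF-style score that treats each cluster as one pooled document, and the toy remarks are hypothetical, and cluster labels are assumed to be already available from some earlier clustering step.

```python
import math
from collections import Counter, defaultdict


def top_keywords(remarks, labels, k=10):
    """Return up to k highest-scoring terms per cluster label.

    Hypothetical scoring: term frequency within the cluster times
    log(number of clusters / number of clusters containing the term).
    """
    # Pool all remarks of a cluster into one bag of words.
    term_counts = defaultdict(Counter)
    for text, label in zip(remarks, labels):
        term_counts[label].update(text.lower().split())

    n_clusters = len(term_counts)

    # Count in how many clusters each term occurs at least once.
    cluster_freq = Counter()
    for counts in term_counts.values():
        cluster_freq.update(counts.keys())

    keywords = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            term: (n / total) * math.log(n_clusters / cluster_freq[term])
            for term, n in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[label] = ranked[:k]
    return keywords


# Toy data (invented for illustration, not taken from the thesis dataset).
remarks = [
    "missing cover on junction box",
    "junction box cover missing",
    "no ground fault breaker installed",
    "ground fault breaker missing in bathroom",
]
labels = [0, 0, 1, 1]
keywords = top_keywords(remarks, labels, k=3)
```

Terms that occur in every cluster (here "missing") score zero and fall to the bottom of the ranking, which is one simple way to favor words that distinguish a cluster from the others.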
|
2 |
The impact of task specification on code generated via ChatGPT
Lundblad, Jonathan; Thörn, Edwin; Thörn, Linus. January 2023
ChatGPT has made large language models more accessible and made it possible to code using natural language prompts. This study conducted an experiment comparing prompt engineering techniques called task specification and investigated their impact on code generation in terms of correctness and variety. The hypotheses of this study focused on whether the baseline method had a statistically significant difference in code correctness compared to the other methods. Code is evaluated using a software requirement specification that measures functional and syntactical correctness. Additionally, code variance is measured to identify patterns in code generation. The results show that there is a statistically significant difference in some code correctness criteria between the baseline and the other task specification methods, and the code variance measurements indicate a variety in the generated solutions. Future work could include using another large language model; different programming tasks and programming languages; and other prompt engineering techniques.
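To make the kind of comparison described above concrete, the following is a minimal sketch of testing whether a baseline prompt and another task specification method differ in pass rate on one correctness criterion. The abstract does not name the statistical test the study used; the two-sided permutation test, the function name `permutation_test`, and the pass/fail lists below are all assumptions chosen for illustration.

```python
import random


def permutation_test(baseline, variant, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in pass rates.

    baseline, variant: lists of 0/1 outcomes (1 = criterion satisfied).
    Returns an approximate p-value.
    """
    rng = random.Random(seed)
    n_base, n_var = len(baseline), len(variant)
    observed = abs(sum(variant) / n_var - sum(baseline) / n_base)

    pooled = list(baseline) + list(variant)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign outcomes to the two groups.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[n_base:]) / n_var - sum(pooled[:n_base]) / n_base)
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical outcomes, not data from the study.
p_same = permutation_test([1, 0, 1, 0], [1, 0, 1, 0])
p_diff = permutation_test([0] * 20, [1] * 20)
```

A small p-value would suggest that the difference in pass rate is unlikely under random assignment of outcomes to the two prompt styles; with several correctness criteria, a correction for multiple comparisons would also be worth considering.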
|
3 |
A study of human-robot interaction with an assistive robot to help people with severe motor impairments
Choi, Young Sang. 06 July 2009
The thesis research aims to further the study of human-robot interaction (HRI) issues, especially regarding the development of an assistive robot designed to help individuals with motor impairments. In particular, individuals with amyotrophic lateral sclerosis (ALS) represent a potential user population that possesses an array of motor impairments due to the progressive nature of the disease. Through review of the literature, an initial target for robotic assistance was determined to be object retrieval and delivery tasks to aid with dropped or otherwise unreachable objects, which represent a common and significant difficulty for individuals with limited motor capabilities. This thesis research has been conducted as part of a larger, collaborative project between the Georgia Institute of Technology and Emory University. To this end, we developed and evaluated a semi-autonomous mobile healthcare service robot named EL-E. I conducted four human studies involving patients with ALS with the following objectives: 1) to investigate and better understand the practical, everyday needs and limitations of people with severe motor impairments; 2) to translate these needs into pragmatic tasks or goals to be achieved through an assistive robot and reflect these needs and limitations in the robot's design; 3) to develop practical, usable, and effective interaction mechanisms by which the impaired users can control the robot; and 4) to evaluate the performance of the robot and improve its usability. I anticipate that the findings from this research will contribute to the ongoing research in the development and evaluation of effective and affordable assistive manipulation robots, which can help to mitigate the difficulties, frustration, and lost independence experienced by individuals with significant motor impairments and improve their quality of life.
|
4 |
Recurrent neural network language generation for dialogue systems
Wen, Tsung-Hsien. January 2018
Language is the principal medium for ideas, while dialogue is the most natural and effective way for humans to interact with and access information from machines. Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact on usability and perceived quality. Many commonly used NLG systems employ rules and heuristics, which tend to generate inflexible and stylised responses without the natural variation of human language. The frequent repetition of identical output forms can quickly make dialogue tedious for most real-world users. Additionally, these rules and heuristics are not scalable and hence not trivially extensible to other domains or languages. A statistical approach to language generation can learn language decisions directly from data without relying on hand-coded rules or heuristics, which brings scalability and flexibility to NLG. Statistical models also provide an opportunity to learn in-domain human colloquialisms and cross-domain model adaptations. A robust and quasi-supervised NLG model is proposed in this thesis. The model leverages a Recurrent Neural Network (RNN)-based surface realiser and a gating mechanism applied to input semantics. The model is motivated by the Long Short-Term Memory (LSTM) network. The RNN-based surface realiser and gating mechanism use a neural network to learn end-to-end language generation decisions from input dialogue act and sentence pairs; the model also integrates sentence planning and surface realisation into a single optimisation problem. The single optimisation not only bypasses the costly intermediate linguistic annotations but also generates more natural and human-like responses. Furthermore, a domain adaptation study shows that the proposed model can be readily adapted and extended to new dialogue domains via a proposed recipe.
Continuing the success of end-to-end learning, the second part of the thesis speculates on building an end-to-end dialogue system by framing it as a conditional generation problem. The proposed model encapsulates a belief tracker with a minimal state representation and a generator that takes the dialogue context to produce responses. These features suggest comprehension and fast learning. The proposed model is capable of understanding requests and accomplishing tasks after training on only a few hundred human-human dialogues. A complementary Wizard-of-Oz data collection method is also introduced to facilitate the collection of human-human conversations from online workers. The results demonstrate that the proposed model can talk to human judges naturally, without any difficulty, for a sample application domain. In addition, the results also suggest that the introduction of a stochastic latent variable can help the system model intrinsic variation in communicative intention much better.
|