21

Analyzing collaboration with large-scale scholarly data

Zuo, Zhiya 01 August 2019 (has links)
We have never stopped in our pursuit of science. Standing on the shoulders of giants, we gradually build a systematic and testable body of knowledge to explain and predict the universe. Emerging from researchers’ interactions and self-organizing behaviors, scientific communities feature intensive collaborative practice. Indeed, the era of the lone genius is long gone; teams now dominate the production and diffusion of scientific ideas. To understand how collaboration shapes organizations as well as individuals’ careers, this dissertation conducts analyses at both macroscopic and microscopic levels utilizing large-scale scholarly data. As self-organizing behaviors, collaborations boil down to interactions among researchers; understanding collaboration at the individual level is therefore a crucial first step toward understanding collective outcomes at the group and organization levels. To start, I investigate the role of research collaboration in researchers’ careers by leveraging person-organization fit theory. Specifically, I propose prospective social ties based on faculty candidates’ potential for future collaboration with prospective colleagues, and show that such ties exhibit diminishing returns on placement quality. I then address how individual success can be better understood and more accurately predicted from collaboration experience data. Findings reveal regularities in the career trajectories of early-stage, mid-career, and senior researchers, highlighting the importance of various aspects of social capital. With large-scale scholarly data, I propose a data-driven analytics approach that leads to a deeper understanding of collaboration for both organizations and individuals. Managerial and policy implications are discussed for organizations seeking to stimulate interdisciplinary research and for individuals seeking better placement as well as short- and long-term scientific impact. While the analyses are conducted in the context of academia, the proposed methods and implications generalize to knowledge-intensive industries, where collaboration is a key driver of performance outcomes such as innovation and creativity.
22

Tools and theory to improve data analysis

Grolemund, Garrett 24 July 2013 (has links)
This thesis proposes a scientific model to explain the data analysis process. I argue that data analysis is primarily a procedure to build understanding and as such, it dovetails with the cognitive processes of the human mind. Data analysis tasks closely resemble the cognitive process known as sensemaking. I demonstrate how data analysis is a sensemaking task adapted to use quantitative data. This identification highlights a universal structure within data analysis activities and provides a foundation for a theory of data analysis. The model identifies two competing challenges within data analysis: the need to make sense of information that we cannot know and the need to make sense of information that we cannot attend to. Classical statistics provides solutions to the first challenge, but has little to say about the second. However, managing attention is the primary obstacle when analyzing big data. I introduce three tools for managing attention during data analysis. Each tool is built upon a different method for managing attention. ggsubplot creates embedded plots, which transform data into a format that can be easily processed by the human mind. lubridate helps users automate sensemaking outside of the mind by improving the way computers handle date-time data. Visual Inference Tools develop expertise in young statisticians that can later be used to efficiently direct attention. The insights of this thesis are especially helpful for consultants, applied statisticians, and teachers of data analysis.
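As an aside for readers unfamiliar with the tools named above: lubridate is an R package, but the idea it implements — offloading date-time sensemaking onto the computer — is easy to illustrate. The Python sketch below is a hypothetical analogue using pandas, not code from the thesis; the sample strings are made up.

```python
# Hypothetical Python analogue of the idea behind lubridate (an R package):
# let the computer parse heterogeneous date-time formats so the analyst's
# attention stays on the analysis itself.
import pandas as pd

raw = ["2013-07-24", "24 July 2013", "07/24/2013 16:30"]

# Infer each timestamp's format rather than forcing a single one.
parsed = [pd.to_datetime(s) for s in raw]

for ts in parsed:
    # Component accessors hide calendar arithmetic from the analyst.
    print(ts.year, ts.month_name(), ts.day_name())
```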
23

Smart Urban Metabolism : Toward a New Understanding of Causalities in Cities

Shahrokni, Hossein January 2015 (has links)
For half a century, urban metabolism has been used to provide insights to support transitions to sustainable urban development (SUD). Information and Communication Technology (ICT) has recently been recognized as a potential technology enabler to advance this transition. This thesis explored the potential for an ICT-enabled urban metabolism framework aimed at improving resource efficiency in urban areas by supporting decision-making processes. Three research objectives were identified: i) investigation of how the urban metabolism framework, aided by ICT, could be utilized to support decision-making processes; ii) development of an ICT platform that manages real-time, high spatial and temporal resolution urban metabolism data, and evaluation of its implementation; and iii) identification of the potential for efficiency improvements through the use of the resulting high spatial and temporal resolution urban metabolism data. The work to achieve these objectives was based on literature reviews, single-case study research in Stockholm, software engineering research, and big data analytics of the resulting data. The evolved framework, Smart Urban Metabolism (SUM), enabled by the emerging context of smart cities, operates at higher temporal (up to real-time) and spatial (up to household/individual) data resolution. A key finding was that the new framework overcomes some of the barriers identified for the conventional urban metabolism framework. The results confirm that there are hidden urban patterns that may be uncovered by analyzing structured big urban data. Some of those patterns may lead to the identification of appropriate intervention measures for SUD.
24

k-Nearest Neighbour Classification of Datasets with a Family of Distances

Hatko, Stan January 2015 (has links)
The k-nearest neighbour (k-NN) classifier is one of the oldest and most important supervised learning algorithms for classifying datasets. Traditionally the Euclidean norm is used as the distance for the k-NN classifier. In this thesis we investigate the use of alternative distances for the k-NN classifier. We start by introducing some background notions in statistical machine learning. We define the k-NN classifier and discuss Stone's theorem and the proof that k-NN is universally consistent on the normed space R^d. We then prove that k-NN is universally consistent if we take a sequence of random norms (that are independent of the sample and the query) from a family of norms that satisfies a particular boundedness condition. We extend this result by replacing norms with distances based on uniformly locally Lipschitz functions that satisfy certain conditions. We discuss the limitations of Stone's lemma and Stone's theorem, particularly with respect to quasinorms and adaptively choosing a distance for k-NN based on the labelled sample. We show the universal consistency of a two-stage k-NN-type classifier in which the distance is selected adaptively based on a split labelled sample and the query. We conclude by giving examples of improvements in classification accuracy on various datasets using the above techniques.
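To make the setting concrete, here is a minimal sketch of a k-NN classifier with a pluggable p-norm distance. It illustrates the basic move of swapping the Euclidean norm for another member of a family of norms; it is not the adaptive two-stage classifier the thesis actually constructs, and the toy data is invented.

```python
# Minimal k-NN with a pluggable p-norm distance; p=2 recovers the
# classical Euclidean k-NN classifier.
import numpy as np

def knn_predict(X_train, y_train, x_query, k=5, p=2.0):
    # p-norm distance from the query to every training point.
    dists = np.linalg.norm(X_train - x_query, ord=p, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]  # majority vote among the k neighbours

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(knn_predict(X, y, np.array([0.5, 0.5]), k=7, p=1.0))  # try the 1-norm
```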
25

Enabling statistical analysis of the main ionospheric trough with computer vision

Starr, Gregory Walter Sidor 25 September 2021 (has links)
The main ionospheric trough (MIT) is a key density feature in the mid-latitude ionosphere, and characterizing its structure is important for understanding GPS radio signal scintillation and HF wave propagation. While a number of previous studies have statistically investigated the properties of the trough, they have only examined its latitudinal cross sections and have not considered its instantaneous two-dimensional structure. In this work, we developed an automatic optimization-based method for identifying the trough in Total Electron Content (TEC) maps and quantified its agreement with the algorithm developed by Aa et al. (2020). Using the newly developed method, we created a labeled dataset and statistically examined the two-dimensional structure of the trough. Specifically, we investigated how Kp affects the trough's occurrence probability at different local times. At low Kp, the trough tends to form in the postmidnight sector; with increasing Kp, the trough occurrence probability increases and shifts toward the premidnight sector. We explore the possibility that this is due to the increased occurrence of troughs formed by subauroral polarization streams (SAPS). Additionally, using SuperDARN convection maps and solar wind data, we characterized the MIT's dependence on the interplanetary magnetic field (IMF) clock angle.
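The statistical step described above — estimating occurrence probability as a function of Kp and local time — reduces to binning a 0/1 detection indicator. The sketch below shows that reduction on a fabricated table; the column names, bin edges, and data are all illustrative assumptions, not the thesis's dataset or code.

```python
# Hypothetical sketch: occurrence probability of a detected trough in bins
# of Kp and magnetic local time (MLT). All data here is synthetic.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
obs = pd.DataFrame({
    "kp":     rng.uniform(0, 9, 5000),    # geomagnetic activity index
    "mlt":    rng.uniform(0, 24, 5000),   # magnetic local time, hours
    "trough": rng.integers(0, 2, 5000),   # 1 = trough identified in the TEC map
})

kp_bins = pd.cut(obs["kp"], bins=[0, 2, 4, 9], labels=["low", "mid", "high"])
mlt_bins = pd.cut(obs["mlt"], bins=np.arange(0, 25, 3))

# Mean of the 0/1 indicator in each (Kp, MLT) cell = occurrence probability.
occurrence = obs.groupby([kp_bins, mlt_bins], observed=True)["trough"].mean()
print(occurrence.head())
```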
26

Reliability of Commercially Relevant Photovoltaic Cell and Packaging Combinations in Accelerated and Outdoor Environments

Curran, Alan J. 30 August 2021 (has links)
No description available.
27

Understanding the phenomenon of Neural Collapse

Mokkapati, Siva January 2022 (has links)
In this paper, we try to understand the concept of 'Neural Collapse' from a mathematical point of view. The survey is conducted based on [1], whose authors provide a first global optimization landscape analysis of Neural Collapse. There are three main aspects the authors investigate. First, they add weight decay to the classical cross-entropy loss and show, by analyzing the Hessian, that the global minimizers are simplex equiangular tight frames (ETFs). Second, they show that the 'layer-peeled' network still preserves the important features of the full network; in other words, even with the simplified loss function the network does not lose its explainability. Third, they show how the layer-peeled network can reduce memory costs while generalizing as well as the full network. Our study delves into these details: how the simplified network is defined, how this simplified network differs from the original network in terms of the loss function, and the theory behind these steps. We also conduct numerical analysis on specific inputs, observe and analyze this phenomenon, and finally report our results.
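For readers who have not met the term: the simplex ETF is the symmetric configuration that the class-mean features collapse to. The short check below builds K such vectors via a standard construction (it is not code from the surveyed paper) and verifies that every pair meets at cosine similarity -1/(K-1).

```python
# Worked check of simplex ETF geometry: K equal-norm vectors whose
# pairwise cosine similarity is exactly -1/(K-1).
import numpy as np

K = 4
# Center the K standard basis vectors, then normalize each row.
M = np.eye(K) - np.ones((K, K)) / K
M /= np.linalg.norm(M, axis=1, keepdims=True)

cos = M @ M.T  # Gram matrix of cosine similarities (diagonal = 1)
print(np.round(cos, 4))
# Every off-diagonal entry equals -1/(K-1):
assert np.allclose(cos[~np.eye(K, dtype=bool)], -1 / (K - 1))
```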
28

Increasing the Predictive Potential of Machine Learning Models for Enhancing Cybersecurity

Ahsan, Mostofa Kamrul January 2021 (has links)
Networks have an increasing influence on our modern life, making Cybersecurity an important field of research. Cybersecurity techniques mainly focus on antivirus software, firewalls, and intrusion detection systems (IDSs), which protect networks from both internal and external attacks. This research is composed of three essays that highlight and improve the applications of machine learning techniques in the Cybersecurity domain. As the feature size and number of observations in cyber incident data grow with internet usage, conventional defense strategies against cyberattacks are increasingly ineffective. Machine learning techniques, on the other hand, are steadily improving at preventing cyber risks in a timely manner, and over the last decade machine learning and Cybersecurity have converged to enhance risk elimination. Because cyber-domain knowledge and machine learning practice are often misaligned when data-driven intelligent systems are deployed, there are gaps that need to be bridged. We have studied the most recent research in this field and documented the most common issues regarding the implementation of machine learning algorithms in Cybersecurity. Based on these findings, we have conducted research and experiments to improve quality of service and security strength by discovering new approaches.
29

Degradation of Photovoltaic Packaging Materials and Power Output of Photovoltaic Systems: Scaling up Materials Science with Data Science

Wang, Menghong 07 September 2020 (has links)
No description available.
30

Unsupervised Dimension Reduction Techniques for Lung Diagnosis using Radiomics

Kireta, Janet 01 May 2023 (has links) (PDF)
Over the years, cancer has increasingly become a global health problem [12]. For successful treatment, early detection and diagnosis are critical. Radiomics is the use of CT, PET, MRI, or ultrasound imaging as input data, extracting features from the image-based data and then using machine learning for quantitative analysis and disease prediction [23, 14, 19, 1]. Feature reduction is critical, as many quantitative features carry redundant information. The objective of this research is to use machine learning techniques to reduce the number of dimensions, thereby rendering the data manageable. The radiomics pipeline consists of imaging, segmentation, feature extraction, and analysis. For this research, a large-scale CT dataset for lung cancer diagnosis collected by scholars from Medical University in China is used to illustrate the dimension reduction techniques via R, SAS, and Python. The proposed reduction and analysis techniques were PCA, clustering, and manifold-based algorithms. The results indicated the texture-based features
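As an illustration of the PCA step named above — run on a synthetic matrix standing in for extracted radiomic features, since the CT dataset itself is not reproduced here — a minimal sketch:

```python
# Illustrative PCA dimension reduction on a synthetic feature matrix
# standing in for extracted radiomic features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(120, 300))  # 120 scans x 300 radiomic features

# Standardize first: radiomic features mix scales (intensity, shape, texture).
X_std = StandardScaler().fit_transform(X)

# Keep enough principal components to explain 95% of the variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape, pca.explained_variance_ratio_[:5])
```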
