451

Experiments in Image Segmentation for Automatic US License Plate Recognition

Diaz Acosta, Beatriz 09 July 2004 (has links)
License plate recognition/identification (LPR/I) applies image processing and character recognition technology to identify vehicles by automatically reading their license plates. In the United States, however, each state has its own standard-issue plates, plus several optional styles, which are referred to as special license plates or varieties. There is a clear absence of standardization, and multi-colored, complex backgrounds are becoming more frequent in license plates. Commercially available optical character recognition (OCR) systems generally fail when confronted with textured or poorly contrasted backgrounds, therefore creating the need for proper image segmentation prior to classification. The image segmentation problem in LPR is examined in two stages: license plate region detection and license plate character extraction from background. Three different approaches for license plate detection in a scene are presented: region distance from eigenspace, border location by edge detection and the Hough transform, and text detection by spectral analysis. The experiments for character segmentation involve the RGB, HSV/HSI and 1976 CIE L*a*b* color spaces as well as their Karhunen-Loève transforms. The segmentation techniques applied include multivariate hierarchical agglomerative clustering and minimum-variance color quantization. The trade-off between accuracy and computational expense is used to select a final reliable algorithm for license plate detection and character segmentation. The spectral analysis approach, together with color quantization in the Karhunen-Loève-transformed L*a*b* space, are found experimentally to be the best alternatives for the two identified image segmentation stages for US license plate recognition. / Master of Science
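As a rough illustration of the character-segmentation stage described above, the sketch below clusters plate pixels in CIE L*a*b* with k-means (a minimum-variance quantizer) and keeps the darkest cluster as the character mask. It is only an approximation of the pipeline summarized in the abstract: the Karhunen-Loève transform step is omitted, and the three-cluster choice and file name are assumptions.

```python
# Sketch: character/background separation by color quantization in CIE L*a*b*,
# in the spirit of the segmentation stage described above (K-L step omitted).
import cv2
import numpy as np
from sklearn.cluster import KMeans

def quantize_plate(bgr_image, n_colors=3):
    """Cluster plate pixels in L*a*b* and return a binary character mask."""
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    pixels = lab.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
    labels = km.labels_.reshape(bgr_image.shape[:2])
    # Assumption: the cluster with the lowest mean L* (darkest) holds the characters.
    darkest = int(np.argmin(km.cluster_centers_[:, 0]))
    return (labels == darkest).astype(np.uint8) * 255

# Hypothetical usage on a cropped plate region:
# plate = cv2.imread("plate.png")
# char_mask = quantize_plate(plate)
```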
452

Industry 4.0 and Circular Economy for Emerging Markets: Evidence from Small and Medium-Sized Enterprises (SMEs) in the Indian Food Sector

Despoudi, S., Sivarajah, Uthayasankar, Spanaki, K., Vincent, Charles, Dura, V.K. 16 May 2023 (has links)
Yes / The linear economic business model was deemed unsustainable, necessitating the emergence of the circular economy (CE) business model. Due to resource scarcity, increasing population, and high food waste levels, the food sector has been facing significant sustainability challenges. Small and medium-sized enterprises (SMEs), particularly those in the food sector, are making efforts to become more sustainable and to adopt new business models such as the CE, but adoption rates remain low. Industry 4.0 and its associated technological applications have the potential to enable CE implementation and boost business competitiveness. In the context of emerging economies facing significant resource scarcity constraints and limited technology availability, CE principles need to be adapted. CE could create a new job economy in emerging economies, bringing scale and a competitive advantage. This study explores the enablers of and barriers to Industry 4.0 adoption for CE implementation in fruit and vegetable SMEs in India from a resource-based perspective. The purpose is to develop an evidence-based framework to help inform theory and practice about CE implementation by SMEs in emerging economies. Fifteen semi-structured interviews were conducted with experts in food SMEs. The interview transcripts were first subjected to thematic analysis. The analysis was then complemented with sentiment and emotion analyses. Subsequently, hierarchical cluster analysis, k-means analysis, and linear projection analysis were performed. Among others, the findings suggest that Industry 4.0 plays a key role in implementing CE in SMEs in emerging economies such as India. However, there are specific enablers and barriers that need to be considered by SMEs to develop the resources and capabilities needed for CE competitive advantage.
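For readers who want a concrete starting point, the sketch below mimics only the clustering portion of the analysis (hierarchical clustering followed by k-means) on a hypothetical presence/absence matrix of enabler and barrier themes coded from the fifteen interviews; the theme matrix, group count and variable names are illustrative assumptions, not the study's data.

```python
# Illustrative only: clustering interview respondents by coded themes,
# combining hierarchical and k-means steps as in the analysis described above.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Rows: 15 interviews; columns: presence/absence of hypothetical coded themes.
theme_matrix = rng.integers(0, 2, size=(15, 12)).astype(float)

# Hierarchical (Ward) clustering to suggest a grouping of respondents.
tree = linkage(theme_matrix, method="ward")
hier_groups = fcluster(tree, t=3, criterion="maxclust")

# k-means with the same number of groups, for comparison.
km_groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(theme_matrix)
print(hier_groups, km_groups)
```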
453

Quelques propositions pour la comparaison de partitions non strictes / Some proposals for comparison of soft partitions

Quéré, Romain 06 December 2012 (has links)
This thesis is dedicated to the problem of comparing two soft (fuzzy/probabilistic, possibilistic) partitions of the same set of individuals into several clusters. Its solution rests on the formal definition of concordance measures that follow the principles of the historical measures developed for comparing strict partitions, and it finds applications in various fields such as biology, image processing and clustering. Depending on whether they focus on the relations between the individuals described by each partition or on quantifying the similarities between the clusters composing those partitions, we distinguish two main families of measures for which the very notion of concordance between partitions differs, and we propose to characterize their representatives according to a common set of formal and informal properties. From that point of view, the measures are also qualified according to the nature of the compared partitions. A study of the multiple constructions on which the measures of the literature are built completes our taxonomy. We propose three new soft comparison measures that take advantage of the state of the art. The first is an extension of a strict approach, while the two others rest on native approaches, one individual-oriented and the other cluster-oriented, both specifically designed to compare soft partitions. Our proposals are compared with the measures of the literature on a set of experiments chosen to cover the various aspects of the problem. The results presented show the relevance of the proposals for the research topic of partition comparison. Finally, we open new perspectives by proposing the premises of a framework unifying the main individual-oriented soft measures.
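The sketch below illustrates the individual-wise (co-membership) family of soft-partition comparison discussed above. It is a generic concordance score based on pairwise co-membership, not one of the three measures proposed in the thesis.

```python
# Generic individual-wise concordance between two soft partitions,
# based on pairwise co-membership; an illustration of the family of
# measures discussed above, not the thesis's proposed measures.
import numpy as np

def comembership(U):
    """U: (n_individuals, n_clusters) membership matrix, rows summing to 1."""
    return U @ U.T  # entry (i, j): degree to which i and j share a cluster

def soft_concordance(U1, U2):
    C1, C2 = comembership(U1), comembership(U2)
    n = C1.shape[0]
    iu = np.triu_indices(n, k=1)      # compare only distinct pairs
    return 1.0 - np.abs(C1[iu] - C2[iu]).mean()

# Example: two fuzzy partitions of 4 individuals into 2 clusters.
U1 = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
U2 = np.array([[0.7, 0.3], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]])
print(round(soft_concordance(U1, U2), 3))
```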
454

An analysis of semantic data quality deficiencies in a national data warehouse: a data mining approach

Barth, Kirstin 07 1900 (has links)
This research determines whether data quality mining can be used to describe, monitor and evaluate the scope and impact of semantic data quality problems in the learner enrolment data on the National Learners’ Records Database. Previous data quality mining work has focused on anomaly detection and has assumed that the data quality aspect being measured exists as a data value in the data set being mined. The method for this research is quantitative in that the data mining techniques and model that are best suited for semantic data quality deficiencies are identified and then applied to the data. The research determines that unsupervised data mining techniques that allow for weighted analysis of the data would be most suitable for the data mining of semantic data deficiencies. Further, the academic Knowledge Discovery in Databases model needs to be amended when applied to data mining semantic data quality deficiencies. / School of Computing / M. Tech. (Information Technology)
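A minimal sketch of the kind of weighted, unsupervised analysis the study points to is given below: k-means with per-record weights standing in for a record-quality score. The feature matrix and weights are synthetic placeholders, not National Learners' Records Database fields.

```python
# Minimal sketch: weighted, unsupervised clustering of enrolment-style records.
# Feature values and weights are hypothetical placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
enrolment_features = rng.normal(size=(500, 4))   # stand-in for encoded learner records
completeness = rng.uniform(0.5, 1.0, size=500)   # assumed per-record quality weight

X = StandardScaler().fit_transform(enrolment_features)
model = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = model.fit_predict(X, sample_weight=completeness)

# Small clusters may flag groups of records with unusual (possibly
# semantically deficient) value combinations.
print(np.bincount(labels))
```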
455

Regional Frequency Analysis Of Hydrometeorological Events - An Approach Based On Climate Information

Satyanarayana, P 02 1900 (has links)
The thesis is concerned with the development of efficient regional frequency analysis (RFA) approaches to estimate quantiles of hydrometeorological events. The estimates are necessary for various applications in water resources engineering. The classical approach to estimate quantiles involves fitting a frequency distribution to at-site data. However, this approach cannot be used when data at the target site are inadequate or unavailable to compute parameters of the frequency distribution. This impediment can be overcome through RFA, in which sites having similar attributes are identified to form a region, and information is pooled from all the sites in the region to estimate the quantiles at the target site. The thesis proposes new approaches to RFA of precipitation, meteorological droughts and floods, and demonstrates their effectiveness. The approach proposed for RFA of precipitation overcomes shortcomings of conventional approaches with regard to delineation and validation of homogeneous precipitation regions, and estimation of precipitation quantiles in ungauged and data-sparse areas. For the first time in the literature, a distinction is made between attributes/variables useful to form homogeneous rainfall regions and those useful to validate the regions. Another important issue is that some of the attributes considered for regionalization vary dynamically with time. In conventional approaches, there is no provision to consider dynamic aspects of time-varying attributes. This may lead to delineation of ineffective regions. To address this issue, a dynamic fuzzy clustering model (DFCM) is developed. The results obtained from application to Indian summer monsoon and annual rainfall indicated that RFA based on the DFCM is more effective than that based on hard and fuzzy clustering models in arriving at rainfall quantile estimates. Errors in quantile estimates for the hard, fuzzy and dynamic fuzzy models based on the proposed approach are shown to be significantly less than those computed for Indian summer monsoon rainfall regions delineated in three previous studies. Overall, RFA based on the DFCM and large-scale atmospheric variables appeared promising. The performance of the DFCM is followed by that of the fuzzy and hard clustering models. Next, a new approach is proposed for RFA of meteorological droughts. It is suggested that homogeneous precipitation regions have to be delineated before proceeding to develop drought severity - areal extent - frequency (SAF) curves. Drought SAF curves are constructed at annual and summer monsoon time scales for each of the homogeneous rainfall regions that are newly delineated in India based on the proposed approach. They find use in assessing spatial characteristics and frequency of meteorological droughts. The new approach overcomes shortcomings associated with classical approaches that construct SAF curves for political regions (e.g., state, country) and physiographic regions (e.g., river basin), based on spatial patterns of at-site values of drought indices in the study area, without testing homogeneity in rainfall. The advantage of the new approach is especially noticeable in areas that have significant variations in the temporal and spatial distribution of precipitation (possibly due to variations in topography, landscape and climate). The DFCM is extended to RFA of floods, and its effectiveness in prediction of flood quantiles is demonstrated by application to the Godavari basin in India, considering precipitation as a time-varying attribute. Six new homogeneous regions are formed in the Godavari basin, and errors in quantile estimates based on those regions are shown to be significantly less than those computed based on sub-zones delineated in the Godavari basin by the Central Water Commission in a previous study.
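A compact fuzzy c-means routine of the kind that underlies the clustering models compared above is sketched below. The dynamic (time-varying attribute) extension of the DFCM is not reproduced, and the site-attribute matrix is a synthetic placeholder.

```python
# Compact fuzzy c-means (FCM) sketch; the DFCM's dynamic extension is omitted.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)          # random initial fuzzy memberships
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (d ** (2 / (m - 1)))     # standard FCM membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Sites described by illustrative attributes (e.g., location, seasonality).
X = np.random.default_rng(2).normal(size=(60, 3))
centers, U = fuzzy_c_means(X, c=4)
print(U.argmax(axis=1))   # hard assignment of each site to a fuzzy region
```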
456

Regionalization Of Hydrometeorological Variables In India Using Cluster Analysis

Bharath, R 09 1900 (has links) (PDF)
Regionalization of hydrometeorological variables such as rainfall and temperature is necessary for various applications related to water resources planning and management. Sampling variability and randomness associated with the variables, as well as non-availability and paucity of data, pose a challenge in modelling the variables. This challenge can be addressed by using stochastic models that utilize information from hydrometeorologically similar locations for modelling the variables. A set of locations that are hydrometeorologically similar is referred to as a homogeneous region or pooling group, and the process of identifying a homogeneous region is referred to as regionalization. The thesis concerns the development of new approaches to regionalization of (i) extreme rainfall, (ii) maximum and minimum temperatures, and (iii) rainfall together with maximum and minimum temperatures. Regionalization of extreme rainfall, and frequency analysis based on the resulting regions, yields quantile estimates that find use in the design of water control structures (e.g., barrages, dams, levees) and conveyance structures (e.g., culverts, storm sewers, spillways) to mitigate damages likely due to floods triggered by extreme rainfall, and in land-use planning and management. Regionalization based on both rainfall and temperature yields regions that can be used to address a wide spectrum of problems such as meteorological drought analysis, agricultural planning to cope with water shortages during droughts, and downscaling of precipitation and temperature. Conventional approaches to regionalization of extreme rainfall rely extensively on statistics derived from extreme rainfall. The delineated regions are therefore susceptible to sampling variability and randomness associated with extreme rainfall records, which is undesirable. To address this, the idea of forming regions by considering a seasonality measure and site location indicators (which can be determined even for ungauged locations) as the attributes for regionalization is explored. For regionalization, a Global Fuzzy c-Means (GFCM) cluster analysis based methodology is developed in the L-moment framework. The methodology is used to arrive at a set of 25 homogeneous extreme rainfall regions over India considering gridded rainfall records at daily scale, as there is a dearth of regionalization studies on extreme rainfall in India. Results are compared with those based on the commonly used region of influence (ROI) approach, which forms site-specific regions for quantile estimation but lacks the ability to delineate a geographical area into a reasonable number of homogeneous regions. Gridded data constitute spatially averaged rainfall that might originate from a different process (more synoptic) than point rainfall (more convective). Therefore, to investigate the utility of the developed GFCM methodology in arriving at meaningful regions when applied to point rainfall data, the methodology is applied to daily rainfall records available for 1032 gauges in Karnataka state of India. The application yielded 22 homogeneous extreme rainfall regions. Experiments carried out to examine the utility of GFCM and ROI based regions in arriving at quantile estimates for ungauged sites in the study area reveal that the performance of the GFCM methodology is fairly close to that of the ROI approach. Errors were marginally lower in the case of the GFCM approach in analysis with observed point rainfall data over Karnataka, while the converse was noted in the case of analysis with gridded rainfall data over India. Neither of the approaches (GFCM, ROI) was found to be consistent in yielding the least error in quantile estimates over all the sites.

The existing approaches to regionalization of temperature are based on temperature time series or their related statistics, rather than on attributes affecting temperature in the study area. Therefore, independent validation of the delineated regions for homogeneity in temperature is not possible. Another drawback of the existing approaches is that they require an adequate number of sites with contemporaneous temperature records for regionalization, because the delineated regions are susceptible to sampling variability and randomness associated with temperature records that are often (i) short in length, (ii) limited to a contemporaneous time period and (iii) spatially sparse. To address these issues, a two-stage clustering approach is developed to arrive at regions that are homogeneous in terms of both monthly maximum and minimum temperatures (Tmax and Tmin). The first stage of the approach involves (i) identifying a common set of possible predictors (large-scale atmospheric variables, LSAVs) influencing Tmax and Tmin over the entire study area, and (ii) using correlations of those predictors with Tmax and Tmin, along with location indicators (latitude, longitude and altitude), as the basis to delineate sites in the study area into hard clusters through the global k-means clustering algorithm. The second stage involves (i) identifying appropriate LSAVs corresponding to each of the first-stage clusters, which could be considered as potential predictors, and (ii) using the potential predictors along with location indicators (latitude, longitude and altitude) as the basis to partition each of the first-stage clusters into homogeneous temperature regions through the global fuzzy c-means clustering algorithm. A set of 28 homogeneous temperature regions was delineated over India using the proposed approach. Those regions are shown to be effective when compared to an existing set of 6 temperature regions over India, for which inter-site cross-correlations were found to be weak and negative for several months, which is undesirable. The effectiveness of the newly formed regions is demonstrated, and the utility of the proposed Tmax-Tmin homogeneous temperature regions in arriving at potential evapotranspiration (PET) estimates for ungauged locations within the study area is shown. The estimates were found to be better than those based on the existing regions. The existing approaches to regionalization of hydrometeorological variables are based on principal components (PCs) or statistics/indices determined from time series of those variables at monthly and seasonal scales. An issue with the use of PCs for regionalization is that they have to be extracted from contemporaneous records of hydrometeorological variables. Therefore, the delineated regions may not be effective when the available records are limited over a contemporaneous time period. A drawback associated with the use of statistics/indices is that they (i) may not be meaningful when data exhibit nonstationarity and (ii) do not encompass the complete information in the original time series. Consequently, the resulting regions may not be effective for the desired purpose. To address these issues, a new approach is proposed. It considers information extracted from wavelet transformations of the observed multivariate hydrometeorological time series as the basis for regionalization by a global fuzzy c-means clustering procedure. The approach can account for dynamic variability in the time series and its nonstationarity (if any).
Effectiveness of the proposed approach in forming homogeneous hydrometeorological regions is demonstrated by application to India, as there are no prior attempts to form such regions over the country. The investigations resulted in identification of 29 regions over India, which are found to be effective and meaningful. Drought Severity-Area-Frequency (SAF) curves are developed for each of the newly formed regions considering the drought index to be Standardized Precipitation Evapotranspiration Index (SPEI).
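The sketch below illustrates the wavelet-based feature step described at the end of the abstract: each site's series is summarized by normalized sub-band energies, which can then be fed to a fuzzy clustering routine such as the one sketched earlier. The wavelet family, decomposition level and synthetic data are assumptions.

```python
# Sketch: summarize each site's series by discrete-wavelet sub-band energies,
# producing one feature vector per site for subsequent (fuzzy) clustering.
import numpy as np
import pywt

def wavelet_energy_features(series, wavelet="db4", level=3):
    """series: 1-D time series; returns normalized energy per sub-band."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / energies.sum()

rng = np.random.default_rng(3)
rainfall = rng.normal(size=(50, 256))   # 50 sites, synthetic monthly series
features = np.vstack([wavelet_energy_features(s) for s in rainfall])
print(features.shape)                   # one row of sub-band energies per site
```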
457

Analýza vlastností shlukovacích algoritmů / Analysis of Clustering Methods

Lipták, Šimon January 2019 (has links)
The aim of this master's thesis was to get acquainted with cluster analysis, clustering methods and their theoretical properties. It was necessary to select clustering algorithms whose properties would be analyzed, and to find and select data sets on which these algorithms would be run. A further goal was to design and implement an application that would evaluate and display the clustering results in an appropriate manner. The last step was to analyze the results and compare them with the theoretical assumptions.
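A small evaluation harness in the spirit of the application described above might look like the following: several clustering algorithms are run on a data set and scored with an internal index (silhouette) and an external index (adjusted Rand). The algorithms, data set and parameters are illustrative choices, not necessarily those used in the thesis.

```python
# Sketch: run several clustering algorithms and compare them with
# silhouette (internal) and adjusted Rand (external) indices.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.metrics import silhouette_score, adjusted_rand_score

X, y_true = make_blobs(n_samples=300, centers=4, random_state=0)

algorithms = {
    "k-means": KMeans(n_clusters=4, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=4),
    "DBSCAN": DBSCAN(eps=1.0, min_samples=5),
}
for name, algo in algorithms.items():
    labels = algo.fit_predict(X)
    # Silhouette is undefined if the algorithm returns a single cluster.
    sil = silhouette_score(X, labels) if len(set(labels)) > 1 else float("nan")
    ari = adjusted_rand_score(y_true, labels)
    print(f"{name:14s} silhouette={sil:.3f} ARI={ari:.3f}")
```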
458

Combining Multivariate Statistical Methods and Spatial Analysis to Characterize Water Quality Conditions in the White River Basin, Indiana, U.S.A.

Gamble, Andrew Stephan 25 February 2011 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / This research performs a comparative study of techniques for combining spatial data and multivariate statistical methods for characterizing water quality conditions in a river basin. The study has been performed on the White River basin in central Indiana, and uses sixteen physical and chemical water quality parameters collected from 44 different monitoring sites, along with various spatial data related to land use – land cover, soil characteristics, terrain characteristics, eco-regions, etc. Various parameters related to the spatial data were analyzed using ArcHydro tools and were included in the multivariate analysis methods for the purpose of creating classification equations that relate spatial and spatio-temporal attributes of the watershed to water quality data at monitoring stations. The study compares the use of various statistical estimates (mean, geometric mean, trimmed mean, and median) of monitored water quality variables to represent annual and seasonal water quality conditions. The relationship between these estimates and the spatial data is then modeled via linear and non-linear multivariate methods. The linear multivariate method uses a combination of principal component analysis, cluster analysis, and discriminant analysis, whereas the non-linear multivariate method uses a combination of Kohonen self-organizing maps, cluster analysis, and support vector machines. The final models were tested with recent and independent data collected from stations in the Eagle Creek watershed, within the White River basin. In 6 out of 20 models the support vector machine classified the Eagle Creek stations more accurately, and in 2 out of 20 models the linear discriminant analysis model achieved better results. Neither the linear nor the non-linear models had an apparent advantage for the remaining 12 models. This research provides insight into the variability and uncertainty in the interpretation of the various statistical estimates and statistical models when water quality monitoring data are combined with spatial data for characterizing general spatial and spatio-temporal trends.
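The sketch below outlines the linear pipeline named above (principal component analysis, then cluster analysis, then discriminant analysis) on synthetic stand-ins for the station and watershed data; the dimensions, component counts and class count are assumptions.

```python
# Sketch of the linear pipeline: PCA -> k-means -> linear discriminant analysis.
# Station and watershed matrices are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
station_data = rng.normal(size=(44, 16))          # 44 stations x 16 parameters

X = StandardScaler().fit_transform(station_data)
scores = PCA(n_components=5).fit_transform(X)     # reduce collinearity first
classes = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

# Discriminant functions relating watershed attributes to the water quality
# classes; the attributes here are random placeholders.
watershed_attrs = rng.normal(size=(44, 8))
lda = LinearDiscriminantAnalysis().fit(watershed_attrs, classes)
print(lda.score(watershed_attrs, classes))        # resubstitution accuracy
```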
459

Clustering Louisiana commercial fishery participants for the allocation of government disaster payment: the case of hurricanes Katrina and Rita

Ogunyinka, Ebenezer Oluwayomi January 1900 (has links)
Master of Science / Department of Statistics / John E. Boyer Jr / The purpose of this study is to evaluate the effectiveness of the methods used for allocating disaster funds to assist commercial fishery participants affected by Hurricanes Katrina and Rita of 2005, and to examine alternative methods to aid in determining an efficient criterion for allocating public funds for fisheries assistance. The trip ticket data managed by the Louisiana Department of Wildlife and Fisheries were used and analyzed using a cluster analysis. Results from the clustering procedures show that commercial fishermen consist of seven clusters, while wholesale/retail seafood dealers consist of six clusters. The three tiers into which commercial fishermen were originally classified can be extended to at least eleven (11) clusters, made up of three (3) clusters in tier 1 and four (4) clusters each in tier 2 and tier 3. Similarly, the original three tiers of wholesale/retail seafood dealers can be reclassified into at least nine (9) clusters, with two (2) clusters in tier 1, four (4) clusters in tier 2 and three (3) clusters in tier 3. As a result of the clustering reclassifications, alternative compensation plans were developed for the commercial fishermen and wholesale/retail seafood dealers. These alternative compensation plans suggest a reallocation of disaster assistance funds among individual groups of fishermen and among individual groups of dealers. We finally recommend that alternative classification methods always be considered in order to select the most efficient criterion for allocating public funds in the future.
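A minimal version of the clustering step could look like the sketch below, which applies Ward hierarchical clustering to standardized participant-level variables and cuts the tree into seven clusters, the count reported above for commercial fishermen; the variables themselves are hypothetical stand-ins for the trip ticket fields.

```python
# Sketch: hierarchical clustering of fishery participants into seven groups.
# The participant variables are hypothetical stand-ins for trip ticket fields.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)
# Columns: annual revenue, number of trips, species diversity index (assumed).
participants = np.column_stack([
    rng.lognormal(10, 1, 200),
    rng.poisson(40, 200),
    rng.uniform(0, 1, 200),
])
# Standardize so no single variable dominates the Euclidean distance.
Z = (participants - participants.mean(axis=0)) / participants.std(axis=0)
clusters = fcluster(linkage(Z, method="ward"), t=7, criterion="maxclust")
print(np.bincount(clusters)[1:])   # sizes of the seven clusters
```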
460

臺北市都市容積空間分佈與容積移轉制度研究 / A Study on the Distribution of Urban Building Capacity and the Urban Building Capacity Transfer in Taipei City

鄭于玲 Unknown Date (has links)
As stipulated in the Regulations of Urban Building Capacity Transfer, the recipient and donor sites of transferred building capacity must be located in the same urban planning district. Taipei City, for example, falls entirely under a single master plan: developers acquire public facility land in administrative districts with low housing prices and then transfer the building capacity obtained there to districts with high housing prices. This behavior triggers a rapid increase of development volume in certain districts, degrades the service quality of public facilities, and raises issues of urban landscape impact, traffic congestion, and the lack of optimal control of total building capacity. Based on the theory of urban carrying capacity, this study proposes an assessment mechanism for the optimal total building capacity. ArcGIS is used to trace the current development and the various attributes of the administrative districts and street blocks in Taipei City, which serve as a reference for establishing classification indicators. Factor analysis and cluster analysis are then conducted to assess urban carrying capacity and classify street blocks. Finally, on the basis of the empirical results, this study offers the following suggestions for a more effective urban building capacity transfer system in Taipei City: reviewing the total building capacity of each administrative district, assessing urban carrying capacity, considering other building capacity control policies, adjusting the scope of building capacity transfer, classifying the street blocks of recipient sites in each district and capping the volume of transferred-in building capacity, and drafting regulations for recipient sites of transferred building capacity.
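The factor analysis / cluster analysis sequence mentioned above could be prototyped as in the sketch below; the street-block attribute matrix, number of factors and number of classes are assumptions rather than the study's ArcGIS-derived indicators.

```python
# Sketch: factor analysis followed by cluster analysis on street-block attributes.
# The attribute matrix is a synthetic placeholder for ArcGIS-derived indicators.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
block_attributes = rng.normal(size=(300, 10))     # 300 street blocks x 10 indicators

X = StandardScaler().fit_transform(block_attributes)
factors = FactorAnalysis(n_components=3, random_state=0).fit_transform(X)
block_classes = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(factors)
print(np.bincount(block_classes))                 # blocks per capacity class
```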
