91 |
A Nonlinear Mixture Autoregressive Model For Speaker Verification
Srinivasan, Sundararajan 30 April 2011
In this work, we apply a nonlinear mixture autoregressive (MixAR) model to supplant the Gaussian mixture model (GMM) for speaker verification. MixAR is a statistical model that is a probabilistically weighted combination of components, each of which is an autoregressive filter in addition to a mean. The probabilistic mixing and the data-dependent weights are responsible for the nonlinear nature of the model. Our experiments with synthetic as well as real speech data from standard speech corpora show that the MixAR model outperforms the GMM, especially under unseen noisy conditions. Moreover, MixAR did not require delta features and used 2.5x fewer parameters to achieve performance comparable to or better than that of a GMM using static as well as delta features. Also, MixAR suffered less from overfitting than GMM when training data was sparse. However, MixAR performance deteriorated more quickly than that of GMM when the duration of the evaluation data was reduced. This could constrain the minimum amount of evaluation data required when using the MixAR model for speaker verification.
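The model structure described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's implementation: it assumes a scalar signal, Gaussian component noise, and softmax gating on the past samples, and every function and parameter name is hypothetical.

```python
import numpy as np

def mixar_conditional_density(x_t, past, means, ar_coefs, sigmas, gate_w, gate_b):
    """Density of x_t given `past` for a scalar mixture of m AR(p) components.

    past:     shape (p,), previous samples, most recent first
    means:    shape (m,), per-component mean term
    ar_coefs: shape (m, p), per-component AR filter coefficients
    sigmas:   shape (m,), per-component noise standard deviations
    gate_w, gate_b: shapes (m, p) and (m,), gating parameters (assumed softmax)
    """
    # Data-dependent mixing weights: a softmax over linear functions of the past.
    logits = gate_w @ past + gate_b
    logits = logits - logits.max()          # numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()

    # Each component predicts x_t with its own AR filter plus a mean.
    predictions = means + ar_coefs @ past

    # Gaussian likelihood of x_t under each component.
    z = (x_t - predictions) / sigmas
    component_pdf = np.exp(-0.5 * z ** 2) / (sigmas * np.sqrt(2.0 * np.pi))

    # Probabilistically weighted combination of the components.
    return float(weights @ component_pdf)
```

Because the weights depend on the past samples, the conditional mean of the next sample is a nonlinear function of the past even though each component is a linear filter.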
|
92 |
Non-Gaussian Mixture Model Averaging for Clustering
Zhang, Xu Xuan January 2017
The Gaussian mixture model has been used for model-based clustering analysis for decades, and most model-based clustering analyses are based on it. Model averaging approaches for Gaussian mixture models were proposed by Wei and McNicholas, based on a family of 14 Gaussian parsimonious clustering models. In this thesis, we use non-Gaussian mixture models, namely the tEigen family, for our averaging approaches. This thesis studies fitting an averaged model from a set of multivariate t-mixture models instead of fitting a single best model. / Thesis / Master of Science (MSc)
|
93 |
AN AUTOMATIC CALIBRATION STRATEGY FOR 3D FE BRIDGE MODELS
LIU, LEI 05 October 2004
No description available.
|
94 |
Prosim VII: An enhanced production simulation model
Alexander, Louis Cadmon January 1992
No description available.
|
95 |
Model Robust Regression Based on Generalized Estimating Equations
Clark, Seth K. 04 April 2002
One form of model robust regression (MRR) predicts mean response as a convex combination of a parametric and a nonparametric prediction. MRR is a semiparametric method by which an incompletely or an incorrectly specified parametric model can be improved through adding an appropriate amount of a nonparametric fit. The combined predictor can have less bias than the parametric model estimate alone and less variance than the nonparametric estimate alone. Additionally, as shown in previous work for uncorrelated data with linear mean function, MRR can converge faster than the nonparametric predictor alone. We extend the MRR technique to the problem of predicting mean response for clustered non-normal data. We combine a nonparametric method based on local estimation with a global, parametric generalized estimating equations (GEE) estimate through a mixing parameter on both the mean scale and the linear predictor scale. As a special case, when data are uncorrelated, this amounts to mixing a local likelihood estimate with predictions from a global generalized linear model. Cross-validation bandwidth and optimal mixing parameter selectors are developed. The global fits and the optimal and data-driven local and mixed fits are studied under no/some/substantial model misspecification via simulation. The methods are then illustrated through application to data from a longitudinal study. / Ph. D.
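The convex combination at the heart of MRR can be illustrated for the simplest setting named in the abstract, uncorrelated data with a linear mean function. The sketch below is a simplification and not the dissertation's method: it swaps in ordinary least squares for the parametric (GEE) fit and a Nadaraya-Watson kernel smoother for the local estimate, and the function name, mixing parameter `lam`, and `bandwidth` are hypothetical.

```python
import numpy as np

def mrr_predict(x, y, x_new, lam, bandwidth):
    """Mix a parametric and a nonparametric prediction: (1 - lam) * P + lam * NP."""
    # Parametric part: straight-line fit by ordinary least squares.
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    parametric = beta[0] + beta[1] * x_new

    # Nonparametric part: Gaussian-kernel weighted average of the responses.
    scaled = (x_new[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * scaled ** 2)
    nonparametric = (kernel @ y) / kernel.sum(axis=1)

    # Convex combination controlled by the mixing parameter lam in [0, 1].
    return (1.0 - lam) * parametric + lam * nonparametric
```

Setting `lam = 0` returns the purely parametric fit and `lam = 1` the purely nonparametric one; intermediate values add only as much of the local fit as the misspecification warrants.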
|
96 |
A Comparison of Discrete and Continuous Survival Analysis
Kim, Sunha 08 May 2014
There has been confusion in choosing a proper survival model between the two popular approaches of discrete and continuous survival analysis. This study aimed to provide empirical outcomes of the two survival models in educational contexts and to suggest a guideline for researchers who need to adopt a suitable survival model. For the model specification, the study paid attention to three factors: time metrics, censoring proportions, and sample sizes. To arrive at a comprehensive understanding of the three factors, the study investigated their separate and combined effects. Furthermore, to understand the interaction mechanism of those factors, this study examined the role of the factors in determining hazard rates, which are known to cause the discrepancies between discrete and continuous survival models. To provide empirical evidence from different combinations of the factors in the use of survival analysis, this study built a series of discrete and continuous survival models using secondary data and simulated data. In the first study, using empirical data from the National Longitudinal Survey of Youth 1997 (NLSY97), this study compared analysis results from the two models with different sizes of time metrics. In the second study, using various specifications that combined the two other factors, censoring proportions and sample sizes, this study simulated datasets to build the two models and compared the analysis results. The major finding of the study is that discrete models are recommended under conditions of large units of time metrics, low censoring proportions, or small sample sizes. In particular, the discrete model produced better outcomes for conditions with a low censoring proportion (20%) and a small number (i.e., four) of large time metrics (i.e., year) regardless of sample size. Close examination of those conditions of time metrics, censoring proportion, and sample sizes showed that the conditions resulted in high hazards (i.e., 0.20).
In conclusion, to determine a proper model, it is recommended to examine hazards of each of the time units with the specific factors of time metrics, censoring proportion and sample sizes. / Ph. D.
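The recommendation to examine the hazard in each time unit can be made concrete with a life-table style calculation. The sketch below is a generic illustration, not the procedure used in the dissertation; the function names are hypothetical, and it assumes that a subject censored in period t still counts as at risk during that period.

```python
def discrete_hazards(durations, events, n_periods):
    """Per-period hazard: events in period t divided by subjects at risk in t.

    durations: last period (1-based) in which each subject was observed
    events:    1 if the subject experienced the event in that period, 0 if censored
    """
    hazards = []
    for t in range(1, n_periods + 1):
        at_risk = sum(1 for d in durations if d >= t)
        failed = sum(1 for d, e in zip(durations, events) if d == t and e)
        hazards.append(failed / at_risk if at_risk else 0.0)
    return hazards

def survival_curve(hazards):
    """Survival after each period: running product of (1 - hazard)."""
    curve, surviving = [], 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve
```

For example, with `durations = [1, 2, 2, 3, 4]` and `events = [1, 1, 0, 1, 0]`, the first-period hazard is 1/5 = 0.20, the level the abstract describes as high.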
|
97 |
Modeling Fecal Indicator Bacteria and Antibiotic Resistance in Diverse Aquatic Environments
House, Gregory Richard 13 January 2021
The detrimental influence of humans on the environment is of increasing concern. Humans, their livestock, and their pets have caused fecal contamination of waterways throughout the United States. Understanding the sources of fecal indicator bacteria (FIB) and the environmental processes that affect them can be crucial to reducing the number of impaired streams and limiting the negative impacts on the environment. Antibiotic resistance is an emerging issue facing human health in the United States and across the world. Antibiotic resistant bacteria (ARB) have antibiotic resistance genes (ARGs) that prevent antibiotics from killing them. Limited research has been done on the role of the environment in the propagation of antibiotic resistance. As the use of antibiotics increases, it is critical to examine how this impacts human health through the environment.
Models of watersheds in Patillas, Puerto Rico and Christiansburg, Virginia were created using the Soil and Water Assessment Tool (SWAT) to compare how the differences in spatial and temporal sampling of FIB, climate, and population affect FIB movement. The performances of the calibrated bacteria models were comparable to other published studies. A primary challenge faced in this study was the use of grab samples taken months apart as monthly averages of FIB. The high precipitation and constant warm climate made the model for Patillas more difficult to fit because of the high variability in the observed data. While the Patillas watershed had a lower population of people and livestock, the Christiansburg watershed had more available data on wildlife. The lack of spatial variance in the data and the use of data from 1993-2018 hindered the ability of the Patillas model to simulate FIB. Additionally, the model's performance was limited by the strong hurricanes that affect land use, soils, and populations of humans and animals in the watershed. Using open-source data needs to be explored further as a faster and more cost-effective way of developing SWAT FIB models.
The feasibility of using data collected in the Christiansburg and Patillas watersheds to calibrate a SWAT-ARB model was assessed based on available ARG data. The results indicate that the bacteria models need to be improved before an effective SWAT-ARB model can be calibrated. One limitation in the available ARG data for the two watersheds was that they were only sampled once. Out of the ARGs sampled, sul1 was the best modeled in both watersheds because it had the highest normalized values and correlated with the amount of developed land. / Master of Science / Humans negatively impact the environment. Humans and animals contribute to the bacterial contamination of waterways. Investigating where the contamination comes from and which environmental processes contribute can help researchers limit the impact on the environment. Bacteria can build resistance to antibiotics, which can be especially dangerous to exposed humans and livestock. Little research has been done on how the environment has contributed to the spread of antibiotic resistance in bacteria.
The Soil and Water Assessment Tool (SWAT) was used to investigate bacteria in the Patillas, Puerto Rico and Christiansburg, Virginia watersheds. These models used data published by the United States Geological Survey (USGS) and Environmental Protection Agency (EPA) to improve performance. When comparing simulated data to observed data, the performances of the models were comparable to other published studies. The Patillas watershed was particularly difficult to model because of the warm climate and high precipitation that caused high variability in bacteria concentrations. Strong weather events including hurricanes and a lack of available data on wildlife were other hindrances to the Patillas model. In comparison, more published data on wildlife was available in the Christiansburg watershed and it had a more temperate climate.
The SWAT-ARB model was reviewed and recommendations were made to improve the model. Using the previously collected antibiotic resistant bacteria data in the Christiansburg and Patillas watersheds, it would be impossible to create accurate models. More antibiotic resistance data need to be collected over a longer time period before the performance of the models can be assessed.
|
98 |
On Demand Mobility Commuter Aircraft Demand Estimation
Syed, Nida Umme-Saleem 12 September 2017
On-Demand Mobility (ODM) is a concept to address congestion problems. Using electric aircraft and vertical take-off with limited landing (VTOL) capabilities, the ODM concept offers on-demand transportation service between designated landing sites at a fraction of driving time. The purpose of this research is to estimate the potential ODM demand and understand the challenges of introducing ODM, using the Northern California region (including major cities like San Francisco, Sacramento, and San Jose) as an area of study, with a second, less rigorous analysis for the Washington-Baltimore region. A conditional logit model was developed to estimate mode choice behavior and ODM demand, with automobile and public transportation as the two modes competing with ODM.
There are significant challenges associated with the service including ability to operate in bad weather, vehicle operating cost, siting and cost of landing sites, and overall public acceptance of small, remotely operated aircraft.
Nine scenarios were run varying the inputs for base fare, landing fare, cost per passenger-mile, auto operational costs, and ingress (waiting) times. The results yielded the sensitivity of demand to all these parameters and especially showed a great difference in demand when auto costs were decreased from the standard American Automobile Association (AAA) cost per mile to a likely future auto operating cost. The challenge that aerospace engineers face is designing an aircraft capable of achieving lower operational costs. The results showed that in order for the ODM to be a competitive mode, the cost per passenger-mile should be kept at $1. / Master of Science / On-Demand Mobility (ODM) is a concept to address congestion problems. Using an electric propulsion aircraft, the ODM concept offers on-demand transportation service between designated landing sites at a fraction of driving time; an “air taxi” or “air Uber” as coined by media outlets. The purpose of this research is to estimate the potential ODM demand and understand the challenges of introducing ODM, using the Northern California region (including major cities like San Francisco, Sacramento, and San Jose) as an area of study, with a second, less rigorous analysis for the Washington-Baltimore region. A model was developed to estimate mode choice behavior and to estimate ODM demand based on existing travel behavior and patterns in the Northern California region.
There are significant challenges associated with the service including ability to operate in bad weather, vehicle operating cost, siting and cost of landing sites, and overall public acceptance of small, remotely operated aircraft.
The results from the model yielded sensitivity of demand to these challenges and especially showed a great difference in demand as the cost of operating the car decreases in the future, making it a great competitor to the ODM concept. The major challenge that aerospace engineers face is designing an aircraft capable of achieving lower operational costs. The results showed that in order for the ODM to be a competitive mode, the cost per passenger-mile should be kept at $1.
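The conditional logit model named in the technical abstract assigns each mode a share proportional to the exponential of its utility. The sketch below only shows that mechanism; the linear utility form, the coefficient values, and the function names are hypothetical placeholders, not the estimates or specification from the thesis.

```python
import math

def utility(cost_dollars, time_hours, beta_cost=-0.05, beta_time=-1.0):
    """Linear-in-parameters utility; the coefficients here are illustrative only."""
    return beta_cost * cost_dollars + beta_time * time_hours

def mode_shares(utilities):
    """Logit choice probabilities: exp(V_i) / sum_j exp(V_j)."""
    largest = max(utilities.values())       # subtract the max for stability
    exp_u = {mode: math.exp(v - largest) for mode, v in utilities.items()}
    total = sum(exp_u.values())
    return {mode: value / total for mode, value in exp_u.items()}
```

A call such as `mode_shares({"odm": utility(40, 0.5), "auto": utility(15, 1.5), "transit": utility(8, 2.0)})` returns three shares that sum to one; lowering the ODM cost or the auto cost shifts demand between modes, which is the kind of sensitivity the scenarios explore.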
|
99 |
A model generalization study in localizing indoor cows with cow localization (colo) dataset
Das, Mautushi 10 July 2024
Precision livestock farming increasingly relies on advanced object localization techniques to monitor livestock health and optimize resource management. In recent years, computer vision-based localization methods have been widely used for animal localization. However, certain challenges still make the task difficult, such as the scarcity of data for model fine-tuning and the inability to generalize models effectively. To address these challenges, we introduce COLO (COw LOcalization), a publicly available dataset comprising localization data for Jersey and Holstein cows under various lighting conditions and camera angles. We evaluate the performance and generalization capabilities of YOLOv8 and YOLOv9 model variants using this dataset.
Our analysis assesses model robustness across different lighting and viewpoint configurations and explores the trade-off between model complexity, defined by the number of learnable parameters, and performance. Our findings indicate that camera viewpoint angle is the most critical factor for model training, surpassing the influence of lighting conditions. Higher model complexity does not necessarily guarantee better results; rather, performance is contingent on specific data and task requirements. For our dataset, medium complexity models generally outperformed both simpler and more complex models.
Additionally, we evaluate the performance of fine-tuned models across various pre-trained weight initializations. The results demonstrate that as the number of training samples increases, the advantage of using weight initialization diminishes. This suggests that for large datasets, it may not be necessary to invest extra effort in fine-tuning models with custom weight initialization.
In summary, our study provides comprehensive insights for animal and dairy scientists to choose the optimal model for cow localization performance, considering factors such as lighting, camera angles, model parameters, dataset size, and different weight initialization criteria. These findings contribute to the field of precision livestock farming by enhancing the accuracy and efficiency of cow localization technology. The COLO dataset, introduced in this study, serves as a valuable resource for the research community, enabling further advancements in object detection models for precision livestock farming. / Master of Science / Cow localization is important for many reasons. Farmers want to monitor cows to understand their behavior, count cows in a scene, and track their activities such as eating and grazing. Popular technologies like GPS or other tracking devices need to be worn by cows in the form of collars, ear tags etc. This requires manually putting the device on each cow, which is labor-intensive and costly since each cow needs its own device.
In contrast, computer vision-based methods need only one camera to effectively track and monitor cows. We can use deep learning models and a camera to detect cows in a scene. This method is cost-effective and does not require strict maintenance.
However, this approach still has challenges. Deep learning models need a large amount of data to train, and there is a lack of annotated data in our community. Data collection and preparation for model training require human labor and technical skills. Additionally, to make the model robust, it needs to be adjusted effectively, a process called model generalization.
Our work addresses these challenges with two main contributions. First, we introduce a new dataset called COLO (COw LOcalization). This dataset consists of over 1,000 annotated images of Holstein and Jersey cows. Anyone can use this data to train their models. Second, we demonstrate how to generalize models. This model generalization method is not only applicable for cow localization but can also be adapted for other purposes whenever deep learning models are used.
In numbers, we found that the YOLOv8m model is the optimal model for cow localization using our dataset. Additionally, we discovered that camera angle is a crucial factor for model generalization. This means that where we place the camera on the farm is important for getting accurate predictions. We found that top angles (placing the camera above) provide better accuracy.
|
100 |
Forward and Inverse Modeling of Tsunami Sediment Transport
Tang, Hui 21 April 2017
Tsunamis are among the most dangerous natural hazards in coastal zones worldwide. Large tsunamis are relatively infrequent. Deposits are the only concrete evidence in the geological record with which we can determine both tsunami frequency and magnitude. Numerical modeling of sediment transport during a tsunami is important interdisciplinary research for estimating the frequency and magnitude of past events and for quantitatively predicting future events. The goal of this dissertation is to develop robust, accurate, and computationally efficient models for sediment transport during a tsunami. There are two different modeling approaches (forward and inverse) to investigate sediment transport. A forward model consists of a tsunami source, hydrodynamics, and a sediment transport model. In this dissertation, we present one state-of-the-art forward model for Sediment TRansport In Coastal Hazard Events (STRICHE), which couples with GeoClaw and is referred to as GeoClaw-STRICHE. In an inverse model, deposit characteristics, such as grain-size distribution and thickness, are inputs to the model, and flow characteristics are outputs. We also present one trial-and-error inverse model (TSUFLIND) and one data assimilation inverse model (TSUFLIND-EnKF) in this dissertation. All three models were validated and verified against several theoretical, experimental, and field cases. / Ph. D. / The population living close to coastlines is increasing, which creates higher risks from coastal hazards such as tsunamis. Tsunamis are a series of long waves triggered by earthquakes, volcanic eruptions, landslides, and meteorite impacts. Deposits are the only concrete evidence in geological records that can be used to determine both tsunami frequency and magnitude. The numerical modeling of sediment transport in coastal hazard events is an important interdisciplinary research area for estimating their magnitude.
The goal of this dissertation is to develop several robust, accurate, and computationally efficient forward and inverse models for tsunami sediment transport. In Chapter one, a general literature review is given. Chapter two discusses a new model for TSUnami FLow INversion from Deposits (TSUFLIND). TSUFLIND incorporates three models and adds new modules to simulate tsunami deposit formation and calculate flow conditions. In Chapter three, we present an inverse model based on ensemble Kalman filtering (TSUFLIND-EnKF) to infer tsunami characteristics from deposits. This model is the first model that forms a system state to include both observable variables and unknown parameters. In Chapter four, we present a new forward model for simulating Sediment TRansport in Coastal Hazard Events, which combines with GeoClaw (GeoClaw-STRICHE). In Chapter five, we discuss future work on TSUFLIND, TSUFLIND-EnKF, GeoClaw-STRICHE, and the forward-inverse framework.
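The idea of a system state that holds both observable variables and unknown parameters can be illustrated with a generic ensemble Kalman filter analysis step. The sketch below is a textbook-style update with perturbed observations, not the TSUFLIND-EnKF code; the function name, the argument layout, and the linear observation operator `H` are assumptions made for illustration.

```python
import numpy as np

def enkf_update(ensemble, H, y_obs, R, rng):
    """One EnKF analysis step on an augmented ensemble.

    ensemble: (n_state, n_members), each column stacks [observables; parameters]
    H:        (n_obs, n_state), linear operator picking out the observed rows
    y_obs:    (n_obs,), measured values
    R:        (n_obs, n_obs), observation-error covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    observed_anomalies = H @ anomalies

    # Sample covariances estimated from the ensemble.
    cross_cov = anomalies @ observed_anomalies.T / (n_members - 1)
    innovation_cov = observed_anomalies @ observed_anomalies.T / (n_members - 1) + R
    gain = cross_cov @ np.linalg.inv(innovation_cov)

    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_members).T
    perturbed_obs = y_obs[:, None] + noise
    return ensemble + gain @ (perturbed_obs - H @ ensemble)
```

Because the gain is built from the ensemble cross-covariance, the unobserved parameter rows are corrected through their correlation with the observed rows, which is how augmenting the state lets measurements inform unknown parameters.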
|