311 |
以語料庫為本之近似詞教學成效之研究:以台灣大學生為例 / The Effect of Teaching Near-synonyms to Taiwan EFL University Students: A Corpus-based Approach / 陳聖其, Chen, Sheng Chi
English education in Taiwan is largely exam-oriented, and many teachers teach vocabulary through rote memorization, leaving students unable to use words flexibly in new contexts.
This study investigates the effect of corpus-based instruction on Taiwanese university students' learning of English near-synonyms. The participants were 86 first-year students at a university in Taipei with similar English learning backgrounds and proficiency. They were divided evenly into two classes for the teaching experiment: an experimental group taught with data-driven learning (DDL) and a control group taught with traditional methods, meeting once a week for fifty minutes over ten weeks. Data collection included a near-synonym achievement pretest and posttest, together with a feedback questionnaire on the DDL approach administered to the experimental group after the instruction and analyzed quantitatively to capture participants' reactions and perceptions. Finally, interviews with high- and low-achieving students provided qualitative data on which factors influence learners of different proficiency levels in their willingness to use, and their needs from, DDL. The findings are as follows:
1. Near-synonym instruction helps improve Taiwanese university students' English vocabulary. Both groups improved on the posttest, but the experimental group significantly outperformed the control group: DDL-based near-synonym instruction was more effective than traditional instruction in improving students' vocabulary.
2. Regarding learners of different proficiency levels, both high and low achievers improved on the posttest. Among high achievers, the experimental group's posttest scores were significantly better than the control group's; among low achievers, the two groups did not differ significantly.
3. Most students gave positive feedback on learning vocabulary through DDL and affirmed the value of the DDL activities. Interviews with high and low achievers showed that proficiency level does affect students' preference for and needs from DDL. High achievers preferred to start with DDL and end with a traditional teacher-led summary, while low achievers liked taking part in small-group discussion and, given their limited vocabulary, wanted Chinese glosses alongside the corpus-based materials to lower learning anxiety.
Based on these findings, this study suggests that university English teachers incorporate DDL in the classroom and design materials according to students' proficiency levels to help them learn English vocabulary.
Corpus linguistics has progressively moved to the center of many domains of language research. With the development of large corpora, the potential applications of corpora in second language teaching and learning have expanded: corpus-based language learning provides and creates a discovery-based learning environment built on authentic data. Synonym and near-synonym learning is a difficult aspect of vocabulary learning, yet the phenomenon is ubiquitous. Hence, this research investigates the effectiveness of a data-driven learning (DDL) approach to near-synonym instruction and compares its effect on high and low achievers.
Participants received eight weeks of corpus-based instruction with materials compiled by the teacher. This quasi-experimental study compares two conditions in a pretest-posttest, control-experimental group design, followed by qualitative semi-structured interviews and a questionnaire administered to the experimental group of EFL university students in Taiwan. Two intact classes totaling 86 college students participated. The pre- and posttest scores and questionnaire responses were analyzed quantitatively through descriptive statistics and frequency analysis to explain the learning effects and learners' perceptions.
The results of the study revealed that: (1) participants in the experimental group made significant improvement on the posttest; (2) high-proficiency EFL learners taught with the DDL approach performed better than high achievers taught by the traditional method, though low achievers may not benefit from the DDL approach in the form of concordance teaching materials; (3) the majority of participants gave positive feedback on DDL activities. The types of DDL activities preferred were strongly influenced by proficiency level: low achievers preferred activities that included Chinese translations as supplementary notes, while high achievers wanted the teacher's explanation of word usage and functions at the end.
This study illuminates the dynamic relationship between the DDL approach and second language near-synonym learning, and offers EFL teachers a clearer framework for incorporating corpora and concordance lines into vocabulary instruction.
Keywords: data-driven learning, near-synonym, corpus-based approach
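The concordance lines mentioned in the abstract are typically produced with a key-word-in-context (KWIC) tool. A minimal sketch of the idea (the toy corpus and window size are invented for illustration, not taken from the study's materials):

```python
def kwic(corpus, keyword, width=3):
    """Return key-word-in-context lines: `width` tokens on either side."""
    tokens = corpus.lower().split()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

corpus = ("the error in the report was small but the mistake "
          "in the experiment was a serious error")
for line in kwic(corpus, "error"):
    print(line)
```

Sorting learners' attention to the left and right collocates of near-synonyms such as *error* and *mistake* is exactly the kind of observation task a DDL lesson builds on.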
|
312 |
Data-driven estimation for Aalen's additive risk model / Boruvka, Audrey, 02 August 2007
The proportional hazards model developed by Cox (1972) is by far the most widely used method for regression analysis of censored survival data. Application of the Cox model to more general event history data has become possible through extensions using counting process theory (e.g., Andersen and Borgan (1985), Therneau and Grambsch (2000)). With its development based entirely on counting processes, Aalen's additive risk model offers a flexible, nonparametric alternative. Ordinary least squares, weighted least squares and ridge regression have been proposed in the literature as estimation schemes for Aalen's model (Aalen (1989), Huffer and McKeague (1991), Aalen et al. (2004)). This thesis develops data-driven parameter selection criteria for the weighted least squares and ridge estimators. Using simulated survival data, these new methods are evaluated against existing approaches. A survey of the literature on the additive risk model and a demonstration of its application to real data sets are also provided. / Thesis (Master, Mathematics & Statistics), Queen's University, 2007.
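"Data-driven parameter selection" here means choosing tuning constants, such as a ridge penalty, from the data rather than fixing them in advance. A minimal sketch of the idea for ordinary one-covariate ridge regression with cross-validation, on simulated data with a made-up penalty grid (this is the generic principle, not the thesis's estimator for Aalen's model):

```python
import random

def ridge_fit(xs, ys, lam):
    """One-covariate ridge estimate: argmin sum((y - b*x)^2) + lam*b^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(xs, ys, lam, k=5):
    """k-fold cross-validated prediction error for a given penalty."""
    n = len(xs)
    idx = list(range(n))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    err = 0.0
    for fold in folds:
        train = [i for i in idx if i not in fold]
        b = ridge_fit([xs[i] for i in train], [ys[i] for i in train], lam)
        err += sum((ys[i] - b * xs[i]) ** 2 for i in fold)
    return err / n

rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best = min(grid, key=lambda lam: cv_error(xs, ys, lam))
print("data-driven penalty:", best)
```

The same select-by-predictive-performance logic carries over to survival settings, where the loss and the resampling scheme must respect censoring.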
|
313 |
Theory and Practice of Globally Optimal Deformation EstimationTian, Yuandong 01 September 2013 (has links)
Nonrigid deformation modeling and estimation from images is a technically challenging task due to its nonlinear, nonconvex and high-dimensional nature. Traditional optimization procedures often rely on good initializations and give locally optimal solutions. On the other hand, learning-based methods that directly model the relationship between deformed images and their parameters either cannot handle complicated forms of mapping, or suffer from the Nyquist limit and the curse of dimensionality due to the high degrees of freedom in the deformation space. In particular, to achieve a worst-case guarantee of ε error for a deformation with d degrees of freedom, the required sample complexity is O(1/ε^d).
In this thesis, a generative model for deformation is established and analyzed using a unified theoretical framework. Based on the framework, three algorithms, Data-Driven Descent, Top-down and Bottom-up Hierarchical Models, are designed and constructed to solve the generative model. Under Lipschitz conditions that rule out unsolvable cases (e.g., deformation of a blank image), all algorithms achieve globally optimal solutions to the specific generative model. The sample complexity of these methods is substantially lower than that of learning-based approaches, which are agnostic to deformation modeling.
To achieve global optimality guarantees with lower sample complexity, the structure embedded in the deformation model is exploited. In particular, Data-Driven Descent relates two deformed images that are far apart in the parameter space by compositional structures of deformation, reducing the sample complexity to O(C^d log(1/ε)). The Top-down Hierarchical Model factorizes the local deformation into patches once the global deformation has been estimated approximately, further reducing the sample complexity to O(C^(d/(1+C2)) log(1/ε)). Finally, the Bottom-up Hierarchical Model builds representations that are invariant to local deformation; with these representations, the global deformation can be estimated independently of the local deformation, reducing the sample complexity to O((C/ε)^(d0)) with d0 ≪ d. From this analysis, the thesis shows the connections between approaches traditionally considered to be of very different nature. New theoretical conjectures on approaches such as deep learning are also provided.
In practice, broad applications of the proposed approaches have also been demonstrated: estimating water distortion, air turbulence, cloth deformation and human pose with state-of-the-art results, some at near real-time performance. Finally, application-dependent physics-based models are built with good performance in document rectification and scene depth recovery in turbulent media.
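To see why the compositional bound matters, compare the two growth rates numerically. The constants below (C = 2, ε = 0.05, d = 10) are invented for illustration; only the asymptotic forms come from the abstract:

```python
import math

def naive_samples(d, eps):
    # Direct nearest-neighbor coverage of the deformation space: O(1/eps^d).
    return (1.0 / eps) ** d

def ddd_samples(d, eps, C=2.0):
    # Data-Driven Descent style bound: O(C^d * log(1/eps)).
    return (C ** d) * math.log(1.0 / eps)

d, eps = 10, 0.05
print(f"O(1/eps^d)       ~ {naive_samples(d, eps):.3e}")
print(f"O(C^d log 1/eps) ~ {ddd_samples(d, eps):.3e}")
```

With these toy constants the direct-coverage bound is around 10^13 samples while the compositional bound stays in the thousands: accuracy ε moves out of the exponent and into a logarithm.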
|
314 |
An automated testing system for telephony software - A case studyZhou, Yingxiang Ingrid 11 March 2009 (has links)
As the complexity of software systems increases, delivering quality software becomes an ever more challenging task. Applying automated testing techniques effectively to the software development process can reduce testing effort substantially and assure software quality cost-effectively; the future of software testing will therefore rely heavily on automation.
This thesis describes a practical approach to automated software testing, investigating and analyzing different test automation tools in real-world situations. Since the key to successful automated testing is planning, understanding the requirements for automation and planning effectively are critical.
This thesis presents the design and implementation of an automated testing framework. It consists of an automated testing tool based on the commercial product TestComplete, together with the associated testing processes. The application area is telephony communications software. To demonstrate the viability of the approach, we apply the framework to a Voice-over-IP telephony application called Desktop Assistant. This case study illustrates the benefits and limitations of our automated testing approach.
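Frameworks like the one described script a driver (here TestComplete) behind a repeatable test harness. A generic sketch of the pattern using Python's stdlib unittest — the `CallSession` class and its behavior are invented stand-ins for illustration, not taken from the thesis:

```python
import unittest

class CallSession:
    """Toy stand-in for the telephony application under test."""
    def __init__(self):
        self.state = "idle"

    def dial(self, number):
        if not number.isdigit():
            raise ValueError("invalid number")
        self.state = "ringing"

    def hangup(self):
        self.state = "idle"

class TestCallSession(unittest.TestCase):
    def setUp(self):
        self.session = CallSession()

    def test_dial_moves_to_ringing(self):
        self.session.dial("5551234")
        self.assertEqual(self.session.state, "ringing")

    def test_invalid_number_rejected(self):
        with self.assertRaises(ValueError):
            self.session.dial("555-ABCD")

    def test_hangup_returns_to_idle(self):
        self.session.dial("5551234")
        self.session.hangup()
        self.assertEqual(self.session.state, "idle")

if __name__ == "__main__":
    unittest.main(exit=False)
```

The value of such a harness is less in any single assertion than in running the whole suite unattended after every build, which is what makes automation pay off against manual regression testing.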
|
315 |
K-12 Professional Learning Communities (PLCs) in a Rural School District on the High Plains of Texas: Mechanism for Teacher Support of Innovative Formative Assessment and Instruction with Technology (iFAIT) / Talkmitt, Marcia J., 03 October 2013
The purpose of this study was to explore the evolution of collaborative practices in PLCs as they emerge when using technology-based formative assessment via iFAIT (innovative Formative Assessment and Instruction with Technology), developed by the researcher using audience response systems and the online data compiler Eduphoria!. The study used a sequential explanatory mixed-methods design to address the problems schools face when implementing technology-based formative assessment to improve instruction and student achievement.
A survey administered in September 2012 and again in December 2012 measured teachers' use of formative assessment, their use of technology in formative assessment, and their perceptions of the PLC as a mechanism of support for technology-based formative assessment. Training was facilitated by the researcher as PLCs worked together to develop, administer, and interpret formative assessments. Teacher interviews were conducted, and the study ended with the December 2012 survey and open-response questions for further qualitative analysis.
Quantitative analysis used ANOVAs to determine whether teacher groups (by subject taught, grade level taught, and years of teaching experience) differed significantly in their use of iFAIT, along with frequency measures and paired-sample t tests between the September and December 2012 responses. Qualitative data were analyzed using hand coding, word clouds, and WordSmith Tools. Triangulating the qualitative data with the quantitative data provided a narrative documenting which collaborative factors affected the use of iFAIT.
For school improvement and implementation of iFAIT, the study revealed that (1) given the right technology infrastructure, on-going professional development must be offered by administrators or sought out by teachers; (2) teachers must believe strongly in formative assessment and the technology that supports it; (3) open lines of communication must be supported through the PLC and administration; (4) teachers must see purpose in using student data to drive instruction; and (5) PLCs must share common beliefs and believe that student achievement is connected to school improvement. PLCs should discuss data, share successes, and plan instruction through extended involvement in face-to-face and online venues as communities of practice.
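The paired-sample t tests comparing the September and December surveys can be computed directly from the difference scores. A minimal sketch with invented survey responses for five teachers (pre and post), not data from the study:

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired-sample t statistic and degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

september = [2.1, 3.0, 2.4, 3.2, 2.8]
december  = [2.9, 3.4, 3.1, 3.3, 3.5]
t, df = paired_t(september, december)
print(f"t = {t:.2f} on {df} df")
```

The pairing matters: each teacher serves as their own control, so between-teacher variability is removed from the error term before the September-December change is tested.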
|
316 |
A Multi-Sensor Data Fusion Approach for Real-Time Lane-Based Traffic Estimation / January 2015
Modern intelligent transportation systems (ITS) make driving more efficient, easier, and safer. Knowledge of real-time traffic conditions is a critical input for operating ITS. Real-time freeway traffic state estimation approaches have been used to quantify traffic conditions given the limited data collected by traffic sensors. Currently, almost all real-time estimation methods estimate laterally aggregated traffic conditions in a roadway segment using link-based models, which assume homogeneous conditions across multiple lanes. With new advances and applications of ITS, however, knowledge of lane-based traffic conditions is becoming important, where the differences among lanes are recognized. In addition, most current real-time freeway traffic estimators consider only data from loop detectors. This dissertation develops a bi-level data fusion approach using heterogeneous multi-sensor measurements to estimate real-time lane-based freeway traffic conditions, integrating a link-level model-based estimator and a lane-level data-driven estimator.
Macroscopic traffic flow models describe the evolution of aggregated traffic characteristics over time and space, which are required by model-based traffic estimation approaches. Since current first-order Lagrangian macroscopic traffic flow model has some unrealistic implicit assumptions (e.g., infinite acceleration), a second-order Lagrangian macroscopic traffic flow model has been developed by incorporating drivers’ anticipation and reaction delay. A multi-sensor extended Kalman filter (MEKF) algorithm has been developed to combine heterogeneous measurements from multiple sources. A MEKF-based traffic estimator, explicitly using the developed second-order traffic flow model and measurements from loop detectors as well as GPS trajectories for given fractions of vehicles, has been proposed which gives real-time link-level traffic estimates in the bi-level estimation system.
The lane-level estimation in the bi-level data fusion system uses the link-level estimates as priors and adopts a data-driven approach to obtain lane-based estimates, where now heterogeneous multi-sensor measurements are combined using parallel spatial-temporal filters.
Experimental analysis shows that the second-order model can more realistically reproduce real-world traffic flow patterns (e.g., stop-and-go waves). The MEKF-based link-level estimator gives more accurate results than an estimator using only a single data source. Evaluation of the lane-level estimator demonstrates that the proposed bi-level multi-sensor data fusion system can provide very good estimates of real-time lane-based traffic conditions. / Doctoral dissertation, Industrial Engineering, 2015.
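At its core, the multi-sensor Kalman update combines measurements by weighting each sensor inversely to its noise variance. A deliberately simplified linear, scalar sketch of that fusion step (two hypothetical sensors reporting one lane's speed; the prior and noise values are invented, and the dissertation's MEKF is the nonlinear, vector-valued generalization):

```python
def fuse(estimate, variance, measurements):
    """Sequentially fold in (value, noise_variance) measurement pairs."""
    for z, r in measurements:
        k = variance / (variance + r)        # Kalman gain: trust vs. sensor noise
        estimate = estimate + k * (z - estimate)
        variance = (1 - k) * variance        # uncertainty shrinks with each sensor
    return estimate, variance

# Prior from the traffic-flow model, then a loop detector and a GPS probe reading.
est, var = fuse(estimate=60.0, variance=25.0,
                measurements=[(55.0, 4.0), (58.0, 9.0)])
print(f"fused speed ~ {est:.1f} km/h, variance {var:.2f}")
```

Note how the posterior variance falls below that of the best single sensor: this is the quantitative payoff of fusing heterogeneous sources rather than relying on loop detectors alone.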
|
317 |
Efficient Bayesian methods for mixture models with genetic applications / Zuanetti, Daiane Aparecida, 14 December 2016
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / We propose Bayesian methods for selecting and estimating different types of mixture models, which are widely used in genetics and molecular biology. We specifically propose data-driven selection and estimation methods for a generalized mixture model, which accommodates the usual (independent) and the first-order (dependent) models in one framework, and QTL (quantitative trait locus) mapping models for independent and pedigree data. For clustering genes through a mixture model, we propose three nonparametric Bayesian methods: a marginal nested Dirichlet process (NDP), which is able to cluster distributions, and a predictive recursion clustering scheme (PRC) and a subset nonparametric Bayesian (SNOB) clustering algorithm for clustering big data. We analyze and compare the performance of the proposed methods and traditional procedures of selection, estimation and clustering in simulated and real data sets. The proposed methods are more flexible, improve the convergence of the algorithms and provide more accurate estimates in many situations. In addition, we propose methods for predicting
nonobservable QTL genotypes and missing parents, and for improving the Mendelian probability of inheritance of nonfounder genotypes using conditional independence structures. We also suggest applying diagnostic measures to check the goodness of fit of QTL mapping models.
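For readers unfamiliar with mixture-model estimation, the classical baseline such Bayesian methods improve on is the EM algorithm. A self-contained sketch for a two-component Gaussian mixture with unit variances (the data and settings are invented; the thesis's nonparametric samplers are considerably richer, handling an unknown number of components and dependence structure):

```python
import math, random

def em_two_gaussians(xs, iters=50):
    """EM for the mixture w*N(mu1,1) + (1-w)*N(mu2,1) with unknown means and weight."""
    mu1, mu2, w = min(xs), max(xs), 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point.
        r = []
        for x in xs:
            p1 = w * math.exp(-0.5 * (x - mu1) ** 2)
            p2 = (1 - w) * math.exp(-0.5 * (x - mu2) ** 2)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate means and weight from the responsibilities.
        s = sum(r)
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / s
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / (len(xs) - s)
        w = s / len(xs)
    return mu1, mu2, w

rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(150)] + [rng.gauss(4, 1) for _ in range(150)]
mu1, mu2, w = em_two_gaussians(xs)
print(f"means ~ {mu1:.2f}, {mu2:.2f}; weight ~ {w:.2f}")
```

EM returns point estimates for a fixed number of components; the appeal of the nonparametric Bayesian alternatives in the thesis is that the number of clusters is itself inferred from the data.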
|
318 |
E-tjänstutveckling ur ett medborgarperspektiv : Att skapa beslutsunderlag baserat på medborgarärendens lämplighet för olika kommunikationskanaler / Citizen-centric e-service development / Abrahamsson, Johan; Sjöberg, Robin, January 2009
Citizens’ interaction with governments is an area with unique implications for channel management. Governments need to take the citizens’ perspective into further consideration in order to succeed in delivering high-quality e-services. This paper aims to determine whether a categorization of citizen-initiated contacts from a citizen-centric perspective can be a valuable basis for decisions regarding e-service development. The study consisted of three steps. The first was an examination of the existing literature, which uncovered the most important aspects of citizens’ channel choice. The second was the elaboration of a classification based on perceived task characteristics and a subsequent matching of the categories to desirable channel characteristics. The third and final step was an application of the proposed categorization to a content management system containing all citizen-initiated contacts in a Swedish municipality. The application indicated that the proposed categorization could be used to guide investments in e-services in a channel-appropriate direction.
|
319 |
Uso de um método preditivo para inferir a zona de aprendizagem de alunos de programação em um ambiente de correção automática de código [Use of a predictive method to infer the learning zone of programming students in an automatic code-assessment environment] / Pereira, Filipe Dwan, 29 March 2018
Previous issue date: 2018-03-29 / CS1 (first year programming) classes are known to have a high dropout and non-pass
rate. Thus, there have been many studies attempting to predict and alleviate CS1 student
performance. Knowing about student performance in advance can be useful for many reasons.
For example, teachers can apply specific actions to help learners who are struggling,
as well as provide more challenging activities to high-achievers. Initial studies used static
factors, such as: high school grades, age, gender. However, student behavior is dynamic
and, as such, a data-driven approach has been gaining more attention, since many
universities are using web-based environments to support CS1 classes. Thereby, many
researchers have started extracting student behavior by cleaning data collected from these
environments and using them as features in machine learning (ML) models. Recently, the
research community has proposed many predictive methods available, even though many
of these studies would need to be replicated, to check if they are context-sensitive. Thus,
we have collected a set of successful features correlated with the student grade used in
related studies, compiling the best ML attributes, as well as adding new features, and
applying them on a database representing 486 CS1 students. The set of features was used
in ML pipelines which were optimized with two approaches: hyperparameter-tuning
with random search and genetic programming. As a result, we achieved an accuracy of
74.44%, using data from the first two weeks to predict student final grade, which outperforms
a state-of-the-art research applied to the same dataset. It is also worth noting that
from the eighth week of class, the method achieved accuracies between 85% and 90.62%.
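Hyperparameter tuning with random search, one of the two optimization approaches above, simply samples candidate configurations at random and keeps the one with the best evaluation score. A minimal sketch with an invented one-parameter "classifier" (a pass/fail threshold on a single week-2 activity score) and synthetic data, not the pipeline or dataset from the thesis:

```python
import random

def accuracy(threshold, data):
    """Fraction of (score, passed) pairs classified correctly by the threshold."""
    return sum((score >= threshold) == passed for score, passed in data) / len(data)

# Invented training data: (activity score in week 2, passed the course?).
rng = random.Random(42)
data = [(s, s + rng.gauss(0, 10) > 50)
        for s in (rng.uniform(0, 100) for _ in range(300))]

# Random search: sample candidate thresholds, keep the best-scoring one.
best_t = max((rng.uniform(0, 100) for _ in range(200)),
             key=lambda t: accuracy(t, data))
print(f"best threshold ~ {best_t:.1f}, accuracy {accuracy(best_t, data):.2%}")
```

In a real pipeline the "parameter" would be a vector (learning rate, tree depth, regularization, ...) and the score a cross-validated metric, but the sample-evaluate-keep-best loop is the same; random search scales to such spaces far better than an exhaustive grid.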
|
320 |
Data-driven decision making in Marketing : A theoretical approach / Peyne, Benjamin; Chan, Ariane, January 2017
Customer insight is at the heart of the big data era. This revolution makes it possible to obtain high-potential data about customers directly and in large quantities; more than ever, companies are collecting large volumes of big data.
Big data has become a necessary tool within marketing. More and more companies orient their decisions according to the information provided by data, with the aim of quickly obtaining better results.
Nevertheless, in order to integrate big data effectively and gain a competitive advantage, companies must face new challenges. To measure and understand the impact of big data on marketing decisions, we propose, with the support of our scientific and theoretical sources, a line of reasoning covering the main issues: big data is increasingly ubiquitous and necessary for companies (I); its impact on decisions needs to be taken into account (II); its use is leading to a management revolution (III); and it modifies the close relation between decision and intuition (IV). This article presents a perspective that studies all these concepts, and closes by offering a model and a conclusion answering our research question.
|