  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Aperfeiçoamento do algoritmo algébrico sequencial para a identificação de variações abruptas de impedância acústica via otimização / Identification of rough impedance profile using an improved acoustic wave propagation algorithm

Filipe Otsuka Taminato 21 February 2014
Fundação Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro / In this work, a technique based on acoustic wave propagation and the Luus-Jaakola (LJ) stochastic optimization method are applied to solve the inverse problem of damage identification in bars. The sequential algebraic algorithm (SAA) and the improved sequential algebraic algorithm (ISAA), which model the direct problem of acoustic wave propagation in a bar, are presented. The ISAA consists of modifications to the SAA and handles damage identification better when the generalized acoustic impedance varies abruptly. Identification results for five damage scenarios are obtained using SAA-LJ and ISAA-LJ: three scenarios have smooth generalized acoustic impedance profiles and the other two have abrupt ones. To simulate real experimental signals, various noise levels were introduced. The results show that ISAA-LJ is quite promising for damage identification in bars and outperforms SAA-LJ, especially when the impedance profiles are abrupt.
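
As a rough illustration of the optimizer named in this abstract, the sketch below shows one common variant of Luus-Jaakola random search: sample uniformly in a box around the current point, keep any improvement, and shrink the box after a failed sample. The function name, the toy quadratic objective, the starting point and the shrink factor are all assumptions made for illustration; the SAA/ISAA forward models used in the thesis are not reproduced here.

```python
import random


def luus_jaakola(f, x0, radius, n_iter=2000, shrink=0.95, seed=0):
    """Minimise f by a simple Luus-Jaakola style random search (illustrative)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    r = list(radius)
    for _ in range(n_iter):
        # Sample a candidate uniformly in a box of half-width r around x.
        cand = [xi + rng.uniform(-ri, ri) for xi, ri in zip(x, r)]
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc                 # keep the improvement
        else:
            r = [ri * shrink for ri in r]    # contract the search box on failure
    return x, fx


# Hypothetical stand-in for the misfit between measured and simulated signals.
def toy_misfit(x):
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = luus_jaakola(toy_misfit, [0.0, 0.0], [2.0, 2.0])
```

In a damage-identification setting, `toy_misfit` would presumably be replaced by the discrepancy between the measured echo and the echo predicted by the forward model for a candidate impedance profile.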
32

Optimization Algorithms for Deterministic, Stochastic and Reinforcement Learning Settings

Joseph, Ajin George January 2017
Optimization is an important field with diverse applications in the physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, estimation, operations research and a wide variety of computing domains. The subject is studied deeply, both theoretically and experimentally, and several algorithms are available in the literature. These algorithms, which can be executed (sequentially or concurrently) on a computing machine, explore the space of input parameters to seek high-quality solutions to the optimization problem, with the search mostly guided by structural properties of the objective function. In certain situations, the setting might additionally demand the “absolute optimum” or solutions close to it, which makes the task even more challenging. In this thesis, we propose an optimization algorithm which is “gradient-free”, i.e., it does not employ any knowledge of the gradient or higher-order derivatives of the objective function, but rather uses the objective function values themselves to steer the search. The proposed algorithm is particularly effective in a black-box setting, where a closed-form expression of the objective function is unavailable and the gradient or higher-order derivatives are hard to compute or estimate. Our algorithm is inspired by the well-known cross entropy (CE) method. The CE method is a model-based search method for solving continuous/discrete multi-extremal optimization problems in which the objective function has minimal structure. The proposed method searches the statistical manifold of the parameters that identify the probability distribution/model defined over the input space, in order to find the degenerate distribution concentrated on the global optima (assumed to be finite in number).
In the early part of the thesis, we propose a novel stochastic approximation version of the CE method for the unconstrained optimization problem, where the objective function is real-valued and deterministic. The basis of the algorithm is a stochastic process of model parameters which depends probabilistically on the past history, and in which all samples obtained up to the current instant are reused through discounted averaging. This approach can reduce the overall computational and storage cost. Our algorithm is incremental in nature and possesses attractive features such as stability, computational and storage efficiency, and better accuracy. We further investigate, both theoretically and empirically, the asymptotic behaviour of the algorithm and find that it converges to the global optimum for a particular class of objective functions. Further, we extend the algorithm to solve the simulation/stochastic optimization problem. In stochastic optimization, the objective function is itself stochastic, and the underlying probability distribution is in most cases hard to comprehend and quantify. This yields a more challenging optimization problem, where the difficulty is primarily due to the hardness of computing the objective function values for various input parameters with absolute certainty. In this case, one can only hope to obtain noise-corrupted objective function values for various input parameters. Settings of this kind arise where the objective function is evaluated using a continuously evolving dynamical system or through a simulation. We propose a multi-timescale stochastic approximation algorithm, in which an additional timescale accommodates the noisy measurements and suppresses the effects of the noise asymptotically.
We found that if the objective function and the measurement noise are well behaved and the timescales are compatible, then our algorithm can generate high-quality solutions. In the later part of the thesis, we propose algorithms for reinforcement learning/Markov decision processes (MDPs) using the optimization techniques developed in the early part. An MDP can be considered a generalized framework for modelling planning under uncertainty. We provide a novel algorithm for the problem of prediction in reinforcement learning, i.e., estimating the value function of a given stationary policy of a model-free MDP (with large state and action spaces) using the linear function approximation architecture. Here, the value function is defined as the long-run average of the discounted transition costs. The computational and storage cost of the proposed method scales quadratically in the size of the feature set. The algorithm is an adaptation of the multi-timescale variant of the CE method proposed in the earlier part of the thesis for simulation optimization. We also provide both theoretical and empirical evidence for the soundness and effectiveness of the approach. In the final part of the thesis, we consider a modified version of the control problem in a model-free MDP with large state and action spaces. The control problem most commonly addressed in the literature is to find an optimal policy which maximizes the value function, i.e., the long-run average of the discounted transition payoffs. Contemporary methods also presume access to a generative model/simulator of the MDP, with the hidden premise that observations of the system behaviour in the form of sample trajectories can be obtained with ease from the model. In this thesis, we consider a modified version, where the cost function to be optimized is a real-valued performance function (possibly non-convex) of the value function.
Additionally, one has to seek the optimal policy without presuming access to the generative model. We propose a stochastic approximation algorithm for this particular control problem. The only information we presuppose to be available to the algorithm is a sample trajectory generated using an a priori chosen behaviour policy. The algorithm is data (sample trajectory) efficient, stable and robust, as well as computationally and storage efficient. We provide a proof of convergence of our algorithm to a high-performing policy relative to the behaviour policy.
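
To make the starting point of this abstract concrete, here is a minimal sketch of the basic cross entropy (CE) method for a one-dimensional deterministic objective: sample from a Gaussian model, keep the best-scoring "elite" samples, and refit the model to them. This is the plain batch CE method only, not the stochastic approximation or multi-timescale variants proposed in the thesis; the function name, the Gaussian model, the sample counts and the test objective are assumptions chosen for illustration.

```python
import random
import statistics


def cross_entropy_minimise(f, mu, sigma, n_samples=100, n_elite=10,
                           n_iter=50, seed=0):
    """Basic CE method with a 1-D Gaussian model (illustrative sketch)."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        # Draw candidates from the current model N(mu, sigma^2).
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Keep the candidates with the lowest objective values.
        elite = sorted(samples, key=f)[:n_elite]
        # Refit the model to the elite set; the small constant avoids sigma = 0.
        mu = statistics.fmean(elite)
        sigma = statistics.pstdev(elite) + 1e-12
    return mu


# Hypothetical objective with its minimum at x = 3.
estimate = cross_entropy_minimise(lambda x: (x - 3.0) ** 2, mu=0.0, sigma=5.0)
```

Only objective values are used to steer the search, which is what makes the approach gradient-free; as the model variance shrinks, the Gaussian approaches a degenerate distribution concentrated near the minimiser.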
33

Otimização de forma aplicando B-splines sob critério integral de tensões

Lins, Sidney de Oliveira 09 February 2009
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This work proposes a computational methodology for solving shape optimization problems in structural design. The application develops, implements and integrates methods for structural analysis, geometric modeling, design sensitivity analysis and optimization. The optimum design problem is particularized to two-dimensional plane stress problems, with the objective of minimizing the structural mass subject to a stress criterion. Such constraints must be evaluated at a series of discrete points, whose distribution should be dense enough to minimize the chance of any significant constraint violation between the specified points. Therefore, the local stress constraints are transformed into a global von Mises stress measure, which reduces the computational cost of deriving the optimal shape design. The problem is approximated by the Finite Element Method using six-node Lagrangian triangular elements, with automatic mesh generation based on a geometric element quality criterion. The contour is defined by parametric B-spline curves, which have characteristics well suited to the Shape Optimization Method; their key points serve as the design variables of the minimization problem. A reliable tool for design sensitivity analysis is a prerequisite for performing interactive structural design, synthesis and optimization. General expressions for design sensitivity analysis are derived with respect to the key points of the B-splines, using the adjoint approach and the analytical method. The optimization problem is formulated with the Augmented Lagrangian Method, which converts a constrained optimization problem into an unconstrained one. The Augmented Lagrangian function is minimized using the analytical sensitivities with respect to the B-spline key points. The optimization problem thus reduces to a sequence of problems with lateral (box-type) bound constraints, each solved by a second-order projection method that uses the projected Memoryless Quasi-Newton Method. Several examples demonstrate that this approach, which integrates analytical design sensitivity analysis into shape optimization under a global stress criterion, is computationally efficient.
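
The Augmented Lagrangian idea mentioned in this abstract can be sketched on a toy problem: minimise x^2 + y^2 subject to x + y = 1, whose solution is x = y = 0.5 with multiplier -1. The code below alternates an unconstrained inner minimisation with a multiplier update. The toy problem, the equality constraint, the plain gradient-descent inner solver and all parameter values are assumptions for illustration; the thesis treats stress inequality constraints and uses a Memoryless Quasi-Newton inner solver, neither of which is reproduced here.

```python
def augmented_lagrangian(n_outer=20, rho=10.0, lr=0.02, n_inner=500):
    """Augmented Lagrangian iteration on a toy equality-constrained problem."""
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(n_outer):
        # Inner loop: gradient descent on
        # L(x, y) = x^2 + y^2 + lam * h + (rho / 2) * h^2, with h = x + y - 1.
        for _ in range(n_inner):
            h = x + y - 1.0
            grad_x = 2.0 * x + lam + rho * h
            grad_y = 2.0 * y + lam + rho * h
            x -= lr * grad_x
            y -= lr * grad_y
        # Outer step: update the multiplier with the remaining violation.
        lam += rho * (x + y - 1.0)
    return x, y, lam


x_opt, y_opt, multiplier = augmented_lagrangian()
```

Each outer iteration solves an unconstrained problem, which is the sense in which the method converts a constrained problem into a sequence of simpler ones.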
35

Second-order derivatives for shape optimization with a level-set method / Dérivées secondes pour l'optimisation de formes par la méthode des lignes de niveaux

Vie, Jean-Léopold 16 December 2016
The main purpose of this thesis is to define a shape optimization method that combines second-order differentiation with the representation of a shape by a level-set function. A second-order method is first designed for two simpler shape optimization problems: a thickness parametrization and a discrete optimization problem. This work is divided into four parts. The first is bibliographical and contains the background needed for the rest of the work. Chapter 1 presents classical results in general optimization, notably the quadratic rate of convergence of second-order methods under suitable hypotheses. Chapter 2 reviews the different modelings for shape optimization, while Chapter 3 details two particular ones: the thickness parametrization and the geometric modeling. The level-set method is presented in Chapter 4, and Chapter 5 recalls the basics of the finite element method. The second part opens with Chapters 6 and 7, which detail the calculation of second-order derivatives for the thickness parametrization and the geometric shape modeling. These chapters also examine the structure and certain properties of the second-order shape derivative. Chapter 8 is concerned with the computation of discrete derivatives for shape optimization. Finally, Chapter 9 deals with different methods for approximating a second-order derivative and defines a second-order algorithm in a general setting. It also provides a few numerical experiments for the thickness (defined in Chapter 6) and discrete (defined in Chapter 8) modelings. The third part is devoted to the geometric modeling for shape optimization. It starts in Chapter 10 with the definition of a new framework for shape differentiation and a resulting second-order method. This framework accounts for the fact that shapes evolved by the level-set method, through the solution of an eikonal equation, always move along the normal. Chapter 11 is dedicated to the numerical computation of shape derivatives, and Chapter 12 contains numerical experiments. The last part concerns the numerical analysis of shape optimization algorithms based on the level-set method. Chapter 13 is concerned with a complete discretization of a shape optimization algorithm. Chapter 14 analyses the numerical schemes for the level-set method and the numerical error they may introduce. Finally, Chapter 15 treats a one-dimensional shape optimization example in full, with an error analysis of the rates of convergence.
36

TIME-VARYING FRACTIONAL-ORDER PID CONTROL FOR MITIGATION OF DERIVATIVE KICK

Attila Lendek (10734243) 05 May 2021
In this thesis, a novel approach to the design of a fractional-order proportional integral derivative (FOPID) controller is proposed. The design introduces a new time-varying FOPID controller to mitigate the voltage spike that appears at the controller output whenever the setpoint changes suddenly. This voltage spike exists at the output of both proportional integral derivative (PID) and FOPID controllers when a derivative control element is involved, and it may cause serious damage to the plant if left uncontrolled. The proposed FOPID controller applies a time function that forces the derivative gain to take effect gradually, leading to a time-varying derivative FOPID (TVD-FOPID) controller, which maintains a fast system response and significantly reduces the voltage spike at the controller output. The time-varying FOPID controller is optimally designed using particle swarm optimization (PSO) or a genetic algorithm (GA) to find the optimum constants and time-varying parameters. The improved control performance is validated by controlling closed-loop DC motor speed and comparing the TVD-FOPID controller, the traditional FOPID controller, and a time-varying FOPID (TV-FOPID) controller, created for comparison, in which all three PID gain constants are replaced by optimized time functions. The simulation results demonstrate that the proposed TVD-FOPID controller not only achieves an 80% reduction of the voltage spike at the controller output but also keeps approximately the same system response characteristics as the regular FOPID controller. With a saturation block between the controller output and the plant, the TVD-FOPID controller still performs best in terms of system overshoot, rise time, and settling time.
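
The derivative-kick mechanism described in this abstract can be shown with a small simulation. The sketch below uses an ordinary integer-order discrete PID controller (not a fractional-order one) on a hypothetical first-order plant, and compares a constant derivative gain with a gain that ramps up as `kd * (1 - exp(-t / tau))` after a unit setpoint step. The plant model, the gains, the ramp function and the function name are all assumptions for illustration and are not taken from the thesis.

```python
import math


def peak_control_effort(time_varying, kp=2.0, ki=1.0, kd=0.5,
                        tau=0.2, dt=0.01, t_end=2.0):
    """Return the largest |u| after a unit setpoint step applied at t = 0."""
    y, integral, prev_err, peak = 0.0, 0.0, 0.0, 0.0
    for k in range(int(t_end / dt)):
        t = k * dt
        err = 1.0 - y
        integral += err * dt
        # The error jumps from 0 to 1 on the first sample, so this is 1 / dt.
        deriv = (err - prev_err) / dt
        # Time-varying case: the derivative gain starts at 0 and rises to kd.
        gain = kd * (1.0 - math.exp(-t / tau)) if time_varying else kd
        u = kp * err + ki * integral + gain * deriv
        peak = max(peak, abs(u))
        y += dt * (-y + u)   # hypothetical plant dy/dt = -y + u (forward Euler)
        prev_err = err
    return peak


peak_constant = peak_control_effort(time_varying=False)
peak_ramped = peak_control_effort(time_varying=True)
```

With a constant gain, the first sample after the step contributes a derivative term of `kd / dt`, which dominates the controller output; with the ramped gain that term is zero at the step instant, so the spike is avoided while the derivative action is restored shortly afterwards.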
37

Enhancing Safety for Autonomous Systems via Reachability and Control Barrier Functions

Jason King Ching Lo (10716705) 06 May 2021
In this thesis, we explore different methods to enhance the safety and robustness of autonomous systems. We achieve this goal using concepts and tools from reachability analysis and control barrier functions. We first take on a multi-player reach-avoid game that involves two teams of players with competing objectives, namely the attackers and the defenders. We analyze the problem and solve the game from the attackers' perspective via a moving horizon approach. The resulting solution provides a safety guarantee that allows attackers to reach their goals while avoiding all defenders.

Next, we approach the problem of target re-association after long-term occlusion using concepts from reachability as well as Bayesian inference. Here, we set out to find the probability identity matrix that associates the identities of targets before and after an occlusion. The solution of this problem can be used in conjunction with existing state-of-the-art trackers to enhance their robustness.

Finally, we turn our attention to a different method for providing safety guarantees, namely control barrier functions (CBFs). Since the existence of a control barrier function implies the safety of a control system, we propose a framework to learn such a function from a given user-specified safety requirement. The learned CBF can be applied on top of an existing nominal controller to provide safety guarantees for systems.
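
The last idea in this abstract, applying a CBF on top of a nominal controller, can be sketched for a one-dimensional single integrator dx/dt = u with safe set h(x) = x_max - x >= 0. The CBF condition dh/dt >= -alpha * h reduces here to u <= alpha * (x_max - x), so the safety filter is a simple clamp. The barrier function is written by hand rather than learned, and the dynamics, gains and function names are assumptions chosen for illustration.

```python
def cbf_filter(x, u_nom, x_max=1.0, alpha=2.0):
    """Return the control closest to u_nom that satisfies the CBF condition."""
    # For dx/dt = u and h(x) = x_max - x, the condition is u <= alpha * h(x).
    return min(u_nom, alpha * (x_max - x))


def simulate(x0=0.0, target=2.0, dt=0.01, steps=1000):
    """Drive x toward an unsafe target while the filter keeps x <= x_max."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        u_nom = 1.5 * (target - x)       # nominal controller, unaware of x_max
        x += dt * cbf_filter(x, u_nom)   # forward-Euler step with filtered input
        trajectory.append(x)
    return trajectory


trajectory = simulate()
```

The nominal controller aims at a target beyond the boundary, yet the filtered trajectory only approaches x_max and does not cross it; in higher dimensions the same filter is typically posed as a small quadratic program.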
38

Sur l’ordonnancement d’ateliers job-shop flexibles et flow-shop en industries pharmaceutiques : optimisation par algorithmes génétiques et essaims particulaires / On flexible job-shop and pharmaceutical industries flow-shop schedulings by particle swarm and genetic algorithm optimization

Boukef, Hela 03 July 2009
Two optimization methods are developed for solving flow-shop scheduling problems in pharmaceutical industries and flexible job-shop scheduling problems: a genetic algorithm using a newly proposed encoding, and a particle swarm optimization method modified for use in the discrete case. For the packaging lines considered in the multi-objective pharmaceutical problems, the criteria are the minimization of production costs and of machine non-utilization (stopping) costs. For the single-objective flexible job-shop problems, the criterion is makespan minimization. Both methods were applied to various workshop examples of differing complexity to illustrate their implementation. A comparative study of the results showed that the particle swarm optimization method is more efficient than the genetic algorithm in terms of convergence speed and closeness to the optimal solution.
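
The makespan criterion mentioned in this abstract can be made concrete with a small evaluator for a permutation flow shop, where every job visits the machines in the same order. The sketch below computes the makespan of a given job order and, for a tiny instance, finds the best order by brute force. It is only the objective function that a genetic algorithm or particle swarm optimizer would call; the function names and the processing times are hypothetical, and the flexible job-shop case treated in the thesis is more general.

```python
from itertools import permutations


def flow_shop_makespan(proc_times, order):
    """Makespan of a permutation flow shop.

    proc_times[j][m] is the processing time of job j on machine m;
    order is the sequence in which jobs enter the line.
    """
    n_machines = len(proc_times[0])
    finish = [0] * n_machines   # completion time of the latest job per machine
    for j in order:
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the job
            # has left machine m - 1.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc_times[j][m]
    return finish[-1]


# Hypothetical instance: 2 jobs, 2 machines.
times = [[3, 2], [1, 4]]
best_order = min(permutations(range(len(times))),
                 key=lambda order: flow_shop_makespan(times, order))
```

Brute force is only feasible for a handful of jobs, which is why metaheuristics such as those compared in the abstract are used on realistic instances.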
39

Stanovení funkčních objemů nádrže s uvažováním nejistot vstupních dat / Determination of the functional volumes of the reservoir considering input data uncertainties

Paseka, Stanislav Unknown Date
Damaging changes and interventions in the water cycle of our landscape, made mainly in the last century, together with the uncertainties of climate change, are causing hydrological extremes to occur more frequently. In hydrology, the most urgent problem is that long-term mean flows in rivers and the yield of groundwater sources are decreasing; on the other hand, the problem of extreme floods cannot be forgotten. Developing methods and tools for uncertainty analysis of reservoir yield and reservoir flood protection is therefore important, useful and desirable. The main aim was to determine the functional volumes of a reservoir while accounting for the measurement uncertainties of the input data, to quantify those uncertainties, and to explain how they are reflected in the results. The active storage capacity was determined from historical series of monthly flows affected by uncertainties; uncertainties were then also applied to water evaporation, seepage losses through the dam and the morphological volume-area curves. A simulation-optimization reservoir model was developed, with temporal reliability applied as the measure of reservoir yield performance. This model will extend the existing UNCE_RESERVOIR software. The flood capacity was determined from random variations of the flood wave, obtained by repeatedly generating uncertainty on the flood hydrograph. Software able to transform flood waves was developed, based on the modified Klemes method. In both programs, the measurement uncertainties of the input data were generated using the Monte Carlo method. By connecting the two programs, the functional volumes of the reservoir were determined comprehensively under conditions of measurement uncertainty. The case study was applied to a real water reservoir in the Morava River Basin. The result indicates either whether the dam is resilient under current conditions or the optimal design of the reservoir's functional volumes under conditions of uncertainty.
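
A heavily simplified picture of how measurement uncertainty can be propagated into a storage estimate is sketched below. It uses the classical sequent-peak calculation for the storage needed to meet a constant demand, and repeats it on inflow series perturbed by random relative errors (Monte Carlo). This is not the simulation-optimization model, the temporal reliability measure or the UNCE_RESERVOIR software described in the abstract; the function names, the Gaussian error model, the 5% relative uncertainty and the flow values are all assumptions.

```python
import random


def required_storage(inflows, demand):
    """Sequent-peak storage needed to supply a constant demand every step."""
    deficit, worst = 0.0, 0.0
    for q in inflows:
        # Accumulate the shortfall; surplus inflow refills the reservoir.
        deficit = max(0.0, deficit + demand - q)
        worst = max(worst, deficit)
    return worst


def storage_under_uncertainty(inflows, demand, rel_sigma=0.05,
                              n_runs=1000, seed=0):
    """Return (5th percentile, median, 95th percentile) of required storage."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        # Perturb each flow by a random relative measurement error.
        noisy = [max(0.0, q * (1.0 + rng.gauss(0.0, rel_sigma)))
                 for q in inflows]
        results.append(required_storage(noisy, demand))
    results.sort()
    return (results[int(0.05 * n_runs)],
            results[n_runs // 2],
            results[int(0.95 * n_runs)])


# Hypothetical monthly inflows and demand, in arbitrary volume units.
flows = [5.0, 1.0, 1.0, 5.0]
low, median, high = storage_under_uncertainty(flows, demand=3.0)
```

The spread between the low and high percentiles gives a rough band for the storage estimate instead of a single deterministic value.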
40

PDF document search within a very large database

Wang, Lizhong January 2017 (has links)
A digital search engine, which takes a search request from a user and returns a result in response, is indispensable for modern people who are used to surfing the Internet. At the same time, the PDF digital document format is accepted by more and more people and has become widely used owing to its convenience and effectiveness. As a result, the traditional library has already started to be replaced by the digital one. Combining these two factors, a document-based search engine that can query a digital document database with an input file is urgently needed. This thesis is a software development project that aims to design and implement a prototype of such a search engine, and to propose relevant optimization methods for Loredge. The research is divided into two main categories: prototype development and optimization analysis. It involves an analytical study of sample documents provided by Loredge and a multi-perspective performance analysis. The prototype comprises reading, preprocessing and similarity measurement. The reading stage reads in a PDF file using an imported Java library, Apache PDFBox. The preprocessing stage processes the read-in document and generates a document fingerprint. The similarity measurement is the final stage, which measures the similarity between the input fingerprint and the fingerprints of all documents in the database. The optimization analysis aims to balance resource consumption, covering response time, accuracy and memory consumption. According to the performance analysis, the shorter the document fingerprint, the better the search program performs. Moreover, a permanent feature (fingerprint) database and a similarity-based filtering mechanism are proposed to optimize the program further. This project lays a solid foundation for further study of document-based search engines by providing a feasible prototype and sufficient relevant experimental data. The study concludes that future work should focus mainly on improving the efficiency of database access, which involves data entry labeling and search algorithm optimization.
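
The preprocessing and similarity-measurement stages described in this abstract can be illustrated with a generic fingerprinting scheme. The abstract does not say how the thesis builds its fingerprints, so the sketch below is an assumption: it hashes overlapping word shingles, keeps the smallest hashes as a fixed-size fingerprint, and compares fingerprints by set overlap. Text extraction from PDF (done with Apache PDFBox in the thesis) is not shown; the inputs here are already plain strings, and all names and parameters are hypothetical.

```python
import hashlib


def fingerprint(text, shingle_len=3, size=64):
    """Fixed-size fingerprint: the smallest hashes of the word shingles."""
    words = text.lower().split()
    shingles = {" ".join(words[i:i + shingle_len])
                for i in range(len(words) - shingle_len + 1)}
    hashes = sorted(int(hashlib.sha1(s.encode("utf-8")).hexdigest(), 16)
                    for s in shingles)
    return set(hashes[:size])   # a smaller size means a shorter fingerprint


def similarity(fp_a, fp_b):
    """Jaccard overlap of two fingerprints, between 0.0 and 1.0."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"
doc_c = "completely unrelated words appear inside this sentence"
score_same = similarity(fingerprint(doc_a), fingerprint(doc_a))
score_near = similarity(fingerprint(doc_a), fingerprint(doc_b))
score_far = similarity(fingerprint(doc_a), fingerprint(doc_c))
```

The `size` parameter mirrors the trade-off noted in the abstract: shorter fingerprints are cheaper to store and compare, at some cost in accuracy.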
