Global ETD Search

1	ORION : uma abordagem eficaz e robusta para aquisição de valores de atributos de entidades do mundo real / ORION: an effective and robust approach for acquiring attribute values of real-world entities Manica, Edimar January 2017
An entity-page is a Web page that publishes data describing an entity of a particular type. Acquiring the attribute values of real-world entities published in these pages is a strategic task for many companies. This acquisition involves discovering the entity-pages in websites and extracting the attribute values published in them. However, existing approaches that carry out entity-page discovery and data extraction in an integrated way have limited applicability, because they are restricted to a particular application domain or require a priori annotation. This thesis presents Orion, an approach for acquiring the attribute values of real-world entities from template-based entity-pages. Orion discovers the entity-pages in websites and extracts the attribute values published in them. Its main original contribution is to carry out entity-page discovery and data extraction in a way that is integrated, domain-independent, and independent of any a priori annotation. Orion includes an entity-page discovery stage that combines HTML and URL features without requiring the user to define similarity thresholds between pages. The discovery stage employs a new URL-based page similarity function that weights URL terms according to their capacity to distinguish entity-pages from other pages. Orion also includes a stage that extracts attribute values by means of Cypher queries over a graph database; this stage induces the queries automatically. Orion is robust because it includes an additional reinforcement stage that handles attributes with template variations. This reinforcement is performed by means of a linear combination of different similarity functions. To evaluate the effectiveness of each stage in isolation and of the approach as a whole, exhaustive experiments were carried out on real-world websites. In these experiments, Orion was numerically and statistically more effective than the baselines.
Databases; Object-oriented databases

3	Extensão do padrão ODMG para suportar tempo e versões [Extension of the ODMG standard to support time and versions] Gelatti, Paôla Cristina January 2002
This work presents an extension of the ODMG standard to support object versioning and temporal features. The extension, called TV_ODMG, is based on the Temporal Versions Model (TVM), an object-oriented data model developed to store the versions of an object and, for each version, the history of the values of its dynamic attributes and relationships. TVM differs from other temporal data models in that it has two different time orders: branched for the object and linear for each version. During modeling, the user can also specify normal classes (without time and versions), which allows this model to be integrated with other existing models. In this work, the following components of the ODMG standard architecture were extended: the Object Model, ODL (Object Definition Language), and OQL (Object Query Language). In addition, a set of rules was developed for mapping TV_ODMG to ODMG, so that any ODBMS can be used to support the proposed extension.
Object-oriented databases; Database versions

7	Database recovery in the design environment : requirements analysis and performance evaluation Iochpe, Cirano January 1989
In the past few years, considerable research effort has been spent on data models, processing models, and system architectures for supporting advanced applications like CAD/CAM, software engineering, image processing, and knowledge management. These so-called non-standard applications pose new requirements on database systems. Conventional database systems (i.e. database systems constructed to support business-related applications) either cope with the new requirements only in an unsatisfactory way or do not cope with them at all. Examples of such new requirements are the need for more powerful data models which enable the definition as well as manipulation of fairly structured data objects, and the requirement of new processing models which better support long-time data manipulation as well as allow database system users to exchange non-committed results. To better support new data and processing models, new database systems have been proposed and developed which realize object-oriented data models that in turn support the definition and operation of both complex object structures and object behavior. In design environments such as those represented by CAD applications, these so-called non-standard database systems are usually distributed over server-workstation computer configurations. While actual object versions are kept in the so-called public database on the server, designers create new objects as well as new object versions in their private databases, which are maintained by the system at the workstations. Besides that, many new design database system prototypes realize a hierarchy of system buffers to accelerate data processing at the system's application level. While the storage subsystem implements the traditional page/segment buffer to reduce the number of I/O operations between main memory and disk, data objects are processed by application programs at the workstation at higher levels of abstraction, and the objects are kept there by so-called object-oriented buffer managers in special main memory representations.
The present dissertation reports on the investigation of database recovery requirements and database recovery performance in design environments. The term design environment is used here to characterize those data processing environments which support so-called design applications (e.g. CAD/CAM, software engineering). The dissertation begins by analyzing the common architectural characteristics of a set of new design database system prototypes. After proposing a reference architecture for those systems, we investigate the properties of a set of well-known design processing models which can be found in the literature. Relying on both the reference architecture and the characteristics of design processing models, the dissertation presents a thorough study of recovery requirements in the design environment. Then, the possibility of adapting existing recovery techniques to maintain system reliability in design database systems is investigated. Finally, the dissertation reports on a recovery performance evaluation involving several existing as well as new recovery mechanisms. The simulation model used in the performance analysis is described and the simulation results are presented.
Databases; Object-oriented databases; Error recovery

10	Uma interface visual para modelos de bancos de dados orientados a objetos com suporte para versões [A visual interface for object-oriented database models with version support] Silva, Juliano Tonezer da January 1998
This work presents the design of a visual interface for object-oriented database models with support for versions. An important requirement not met by the specific and generic visual interfaces for object-oriented systems is the ability to define and manipulate versions of an object at the various levels of the class hierarchy (inheritance by extension, adopted by the version model [GOL 95]). The interfaces that manipulate versions support this feature at the most specialized level of the hierarchy (inheritance by refinement, adopted by the main object-oriented DBMSs). The aim of enabling object versioning at the various levels of the class hierarchy motivated the design and development of a visual interface that offers the functionality of existing interfaces (specific and generic) and conforms to the main characteristics of object-oriented data models and of the version model [GOL 95], following the characteristics recommended for visual interfaces for object-oriented data models proposed in [SIL 96]. A prototype was implemented with some of the features designed for the object browser and its support for versions.
Databases; Object-oriented databases; User interface; Database versions; Visual interface

Search results