  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Towards automatic grading of SQL queries

Venkatamuniyappa, Vijay Kumar January 1900
Master of Science / Department of Computer Science / Doina Caragea / An Introduction to Databases course involves learning the concepts of data storage, manipulation, and retrieval. Relational databases provide an ideal learning path for understanding database concepts. The Structured Query Language (SQL) is a standard language for interacting with relational databases. Each database vendor implements a variation of the SQL standard. Furthermore, a particular question that asks for some data can be answered in many ways, using somewhat similar or structurally different SQL queries. Evaluating SQL queries for correctness involves verifying the SQL syntax and semantics, as well as verifying the output of the queries and the use of the correct clauses. An evaluation tool should be independent of the specific database queried and of the nature of the queries, and should allow multiple ways of providing input and retrieving output. In this report, we have developed an evaluation tool for SQL queries, which checks the correctness of MySQL and PostgreSQL queries with the help of a parser that can identify SQL clauses. The tool acts as a portal where students can test and improve their queries and finally submit them for grading. It minimizes the manual effort required for grading by using the SQL parser to check queries for correctness, provide feedback, and allow submission.
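One part of the evaluation described in this abstract, verifying the output of a query, can be sketched as follows. This is a minimal illustration and not the tool from the report: it assumes Python's built-in sqlite3 module instead of MySQL or PostgreSQL, and the table, the queries, and the helper name same_result are all hypothetical.

```python
import sqlite3
from collections import Counter


def same_result(conn, student_sql, reference_sql):
    """Return True if both queries yield the same multiset of rows.

    Row order is ignored, which is only appropriate when the question
    does not ask for a particular ordering.
    """
    student_rows = conn.execute(student_sql).fetchall()
    reference_rows = conn.execute(reference_sql).fetchall()
    return Counter(student_rows) == Counter(reference_rows)


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ann", 3.9), (2, "Bob", 2.7), (3, "Cho", 3.4)],
)

reference = "SELECT name FROM students WHERE gpa >= 3.0"
# A structurally different query that asks for the same data.
equivalent = "SELECT name FROM students WHERE NOT (gpa < 3.0)"
# A plausible mistake: the wrong threshold.
wrong = "SELECT name FROM students WHERE gpa >= 3.5"

print(same_result(conn, equivalent, reference))
print(same_result(conn, wrong, reference))
```

Comparing outputs on a single data set cannot prove that two queries are equivalent in general, which is one reason the abstract also calls for checking syntax, semantics, and the clauses used.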
2

Data-Driven Database Education: A Quantitative Study of SQL Learning in an Introductory Database Course

Von Dollen, Andrew C 01 July 2019
The Structured Query Language (SQL) is widely used and challenging to master. Within the context of lab exercises in an introductory database course, this thesis analyzes the student learning process and seeks to answer the question: "Which SQL concepts, or concept combinations, trouble students the most?" We provide comprehensive taxonomies of SQL concepts and errors, identify common areas of student misunderstanding, and investigate the student problem-solving process. We present an interactive web application used by students to complete SQL lab exercises. In addition, we analyze data collected by this application and we offer suggestions for improvement to database lab activities.
3

Advanced Structured Query Language Instruction for Engineers of the Office of Information Technology at Brigham Young University

Rackliffe, Vincent Brian 01 December 2005
This report describes the purpose, design, development and analysis of SQLTips, an online instructional delivery framework and set of instructional modules relating to advanced features and performance tuning of Oracle's Structured Query Language (SQL). SQLTips was developed using Wiki, server-side software that allows users to edit web pages with almost any browser. The report includes a literature review of existing SQL instructional materials and a review of instructional theory. The report also includes a description of the formative evaluation process and results. These results show that SQLTips is easy and enjoyable to use. Based on a scale of 1 to 7 with 7 being the most positive, the 10 modules comprising SQLTips averaged a 6.1 for ease of use and a 6.2 for enjoyability. Posttest results also showed an average increase of 46% upon completion of the instruction. The report also contains a critique of the project.
4

Data Build Tool (DBT) Jobs in Hopsworks

Chen, Zidi January 2022
Feature engineering at scale is always critical and challenging in the machine learning pipeline. Modern data warehouses enable data analysts to do feature engineering by transforming, validating and aggregating data in Structured Query Language (SQL). To help data analysts do this work, Data Build Tool (DBT), an open-source tool, was proposed to build and orchestrate SQL pipelines. Hopsworks, an open-source scalable feature store, would like to add support for DBT so that data scientists can do feature engineering in Python, Spark, Flink, and SQL in a single platform. This project aims to develop a concept for building this support and then implement it. The project checks the feasibility of the solution using a sample DBT project. According to measurements, this working solution needs around 800 MB of space on the server and takes more time than executing DBT commands locally. However, it persistently stores the results of each execution in HopsFS, where they are available to users. By adding this novel support for SQL using DBT, Hopsworks might be one of the most complete platforms for feature engineering so far.
5

Application for data mining in manufacturing databases

Fang, Cheng-Hung January 1996
No description available.
6

Static MySQL Error Checking

Zarinkhail, Mohammad Shuaib January 2010
Masters of Science / Coders of databases repeatedly face the problem of checking their Structured Query Language (SQL) code. Instructors face the difficulty of checking student projects and lab assignments in database courses. We collect and categorize common MySQL programming errors into three groups: data definition errors, data manipulation errors, and transaction control errors. We build these into a comprehensive list of MySQL errors that novices are inclined to make during database programming. We collected our list of common MySQL errors both from the technical literature and directly by noting errors made in assignments handed in by students. In the results section of this research, we check and summarize occurrences of these errors based on three characteristics: semantics, syntax, and logic. These data form the basis of a future static MySQL checker that will eventually help database coders correct their code automatically. These errors also form a useful checklist to guide students away from the mistakes that they are prone to make.
7

A pattern-driven corpus to predictive analytics in mitigating SQL injection attack

Uwagbole, Solomon January 2018
The back-end database provides accessible and structured storage for the big data web traffic of each web application, from cloud-hosted web applications to Internet of Things (IoT) smart devices in emerging computing. Structured Query Language Injection Attack (SQLIA) remains an intruder's exploit of choice for stealing confidential information from the databases of vulnerable front-end web applications, with potentially damaging security ramifications. Existing solutions to SQLIA still follow the on-premise web application server hosting concept. They were primarily developed before the recent challenges of big data mining and as such lack the functionality to cope with new attack signatures concealed in a large volume of web requests. Also, most organisations' database and service infrastructure no longer resides on-premise, as cloud-hosted applications and services are increasingly used, which limits existing Structured Query Language Injection (SQLI) detection and prevention approaches that rely on source code scanning. A bio-inspired approach such as Machine Learning (ML) predictive analytics provides functional and scalable mining of big data for detecting and preventing SQLI when intercepting large volumes of web requests. Unfortunately, the lack of a robust, ready-made data set with patterns and historical data items to train a classifier is a well-known issue in SQLIA research that applies ML in the field of Artificial Intelligence (AI). The purpose-built, competition-driven test case data sets are antiquated and not pattern-driven, so they are poorly suited to training a classifier for real-world application. Also, web application types are too diverse for an all-purpose generic data set for ML SQLIA mitigation to exist.
This thesis addresses the lack of a pattern-driven data set by deriving one to predict SQLIA of any size, and by proposing a technique to obtain a data set on the fly. This breaks the cycle of relying on the few outdated competition-driven data sets, which are not meant to benchmark real-world SQLIA mitigation. As its contribution, the thesis derives a pattern-driven data set of related member strings that is used to train a supervised learning model, validated through the Receiver Operating Characteristic (ROC) curve and Confusion Matrix (CM), with results showing low false positives and negatives. We extend the evaluation with cross-validation and obtain a low variance in accuracy, which indicates a successfully trained model that, using the derived pattern-driven data set, can generalise to unknown real-world data with reduced bias. We also demonstrate a proof of concept by implementing ML predictive analytics for SQLIA detection and prevention, using this pattern-driven data set, in a test web application. In the experiments carried out in the course of this thesis, we observed that a data set of related member strings can be generated from a web application's expected input data and SQL tokens, including known SQLI signatures. The data set extraction ontology proposed in this thesis for applied ML in SQLIA mitigation, in the context of the emerging computing of big data internet and cloud-hosted services, sets our proposal apart from existing approaches, which mostly rely on on-premise source code scanning and comparisons of query structure.
8

Reduction Of Query Optimizer Plan Diagrams

Darera, Pooja N 12 1900
Modern database systems use a query optimizer to identify the most efficient strategy, called "plan", to execute declarative SQL queries. Optimization is a mandatory exercise since the difference between the cost of best plan and a random choice could be in orders of magnitude. The role of query optimization is especially critical for the decision support queries featured in data warehousing and data mining applications. For a query on a given database and system configuration, the optimizer's plan choice is primarily a function of the selectivities of the base relations participating in the query. A pictorial enumeration of the execution plan choices of a database query optimizer over this relational selectivity space is called a "plan diagram". It has been shown recently that these diagrams are often remarkably complex and dense, with a large number of plans covering the space. An interesting research problem that immediately arises is whether complex plan diagrams can be reduced to a significantly smaller number of plans, without materially compromising the query processing quality. The motivation is that reduced plan diagrams provide several benefits, including quantifying the redundancy in the plan search space, enhancing the applicability of parametric query optimization, identifying error-resistant and least-expected-cost plans, and minimizing the overhead of multi-plan approaches. In this thesis, we investigate the plan diagram reduction issue from theoretical, statistical and empirical perspectives. Our analysis shows that optimal plan diagram reduction, w.r.t. minimizing the number of plans in the reduced diagram, is an NP-hard problem, and remains so even for a storage-constrained variation. We then present CostGreedy, a greedy reduction algorithm that has tight and optimal performance guarantees, and whose complexity scales linearly with the number of plans in the diagram. 
Next, we construct an extremely fast estimator, AmmEst, for identifying the location of the best tradeoff between the reduction in plan cardinality and the impact on query processing quality. Both CostGreedy and AmmEst have been incorporated in the publicly-available Picasso optimizer visualization tool. Through extensive experimentation with benchmark query templates on industrial-strength database optimizers, we demonstrate that with only a marginal increase in query processing costs, CostGreedy reduces even complex plan diagrams running to hundreds of plans to "anorexic" levels (small absolute number of plans). While these results are produced using a highly conservative upper-bounding of plan costs based on a cost monotonicity constraint, when the costing is done on "actuals" using remote plan costing, the reduction obtained is even greater - in fact, often resulting in a single plan in the reduced diagram. We also highlight how anorexic reduction provides enhanced resistance to selectivity estimate errors, a long-standing bane of good plan selection. In summary, this thesis demonstrates that complex plan diagrams can be efficiently converted to anorexic reduced diagrams, a result with useful implications for the design and use of next-generation database query optimizers.
9

Informační systém pro podporu řízení skladu, obchodu a marketingu / Information System for Management of Store and Support of Business and Marketing Operations

Ferencz, Erik January 2007
This term project deals with the analysis and design of an information system for the administration and management of a business firm. The system is designed as a modular system with an unlimited number of modules that cooperate with or are connected to other modules. Each module has its own data tables in the database, its own classes, which form the middle layer of the application, and its own graphical interface, but the modules are not independent (a single module cannot work as a system on its own). Internal communication among modules is based on database servers. The application includes its own database. To accomplish this project I had to familiarize myself with programming in the C# programming language and with the PostgreSQL database system.
10

Derby/S: A DBMS for Sample-Based Query Answering

Klein, Anja, Gemulla, Rainer, Rösch, Philipp, Lehner, Wolfgang 10 November 2022
Although approximate query processing is a prominent way to cope with the requirements of data analysis applications, current database systems do not provide integrated and comprehensive support for these techniques. To improve this situation, we propose an SQL extension, called SQL/S, for approximate query answering using random samples, and present a prototypical implementation, called Derby/S, within the engine of the open-source database system Derby. Our approach significantly reduces the required expert knowledge by enabling the definition of samples in a declarative way; the choice of the specific sampling scheme and its parametrization is left to the system. SQL/S introduces new DDL commands to easily define and administrate random samples subject to a given set of optimization criteria. Derby/S automatically takes care of sample maintenance if the underlying dataset changes. Finally, samples are transparently used during query processing, and error bounds are provided. Our extensions do not affect traditional queries and provide the means to integrate sampling as a first-class citizen into a DBMS.
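The general idea of answering an aggregate query from a random sample and reporting an error bound can be sketched as follows. This is only an illustration of the principle, not SQL/S or Derby/S: it assumes a uniform sample drawn without replacement and a normal-approximation bound, and the function name estimate_sum and the synthetic data are hypothetical.

```python
import math
import random
import statistics


def estimate_sum(population_size, sample, z=1.96):
    """Estimate SUM over a column from a uniform random sample of its values.

    Returns (estimate, half_width). The half-width is an approximate 95%
    error bound from the normal approximation; the finite-population
    correction is omitted, which makes the bound slightly conservative.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return population_size * mean, population_size * z * std_err


random.seed(0)
# Hypothetical column with 100,000 values.
values = [random.uniform(0, 100) for _ in range(100_000)]
sample = random.sample(values, 1_000)

estimate, bound = estimate_sum(len(values), sample)
print(f"SUM ~ {estimate:.0f} +/- {bound:.0f} (exact: {sum(values):.0f})")
```

Here the query touches 1% of the rows, and the width of the bound shrinks roughly with the square root of the sample size.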
