1 |
Exploring the Process and Challenges of Programming with Regular Expressions. Michael, Louis Guy IV, 27 June 2019
Regular expressions (regexes) are a powerful mechanism for solving string-matching problems and are supported by all modern programming languages. While regular expressions are highly expressive, they are also often perceived to be highly complex and hard to read. While existing studies have focused on improving the readability of regular expressions, little is known about any other difficulties that developers face when programming with regular expressions. In this paper, we aim to provide a deeper understanding of the process of programming regular expressions by studying: (1) how developers make decisions through the process, (2) what difficulties they face, and (3) how aware they are of serious risks involved in programming regexes. We surveyed 158 professional developers from a diversity of backgrounds, and we conducted a series of interviews to learn more about the difficulties participants face in this process and how they work around them. This mixed-methods approach revealed that some of the difficulties of regexes take the form of an inability to effectively search for, fully validate, and document them. Developers also reported the cascading impacts of poor readability and a lack of universal portability, and they struggled with overall problem comprehension. The majority of the developers we studied were unaware of critical security risks that can occur when using regexes, and those who were aware of potential problems felt that they lacked the ability to identify problematic regexes. Our findings have multiple implications for future work, including the development of semantic regex search engines for regex reuse and improved input generators for regex validation. / Master of Science / Regular expressions (regexes) are a method for searching for a set of matching text. They are easily understood as a way to search more flexibly than exact matching and are frequently encountered as the find functionality behind Ctrl-F. Regexes are also very common in source code for a range of tasks, including form validation, where a program needs to confirm that user-provided information conforms to a specific structure, such as an email address. Despite being a widely supported programming feature, little is known about how developers go about creating regexes or what they struggle with when doing so. To gain a better understanding of how regexes are created and reused, we surveyed 158 professional developers from a diversity of backgrounds and experience levels about their processes and perceptions regarding regexes. As a follow-up to the survey, we conducted a series of interviews focusing on the challenges faced by developers when tackling problems for which they felt a regex was worth using. Through the combination of these studies, we developed a detailed understanding of how professional developers create regexes as well as many of the struggles they face when doing so. These challenges come in the form of an inability to effectively search for, fully validate, and document regexes, as well as the cascading impacts of poor readability, lack of universal portability, and overall problem comprehension.
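To make the form-validation and security-risk points concrete, here is a minimal Python sketch; the patterns and function names are illustrative assumptions and do not come from the study.

```python
import re

# Hypothetical, deliberately loose structural check for email-like input,
# similar to the form-validation use case mentioned above. Production email
# validation is considerably more involved.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def looks_like_email(value: str) -> bool:
    """Return True if the value has a basic user@host.tld shape."""
    return EMAIL_RE.fullmatch(value) is not None

# The kind of security risk the survey asked about: nested quantifiers that
# can trigger catastrophic backtracking (ReDoS) in backtracking engines such
# as Python's re module.
RISKY_RE = re.compile(r"^(a+)+$")

if __name__ == "__main__":
    print(looks_like_email("user@example.com"))  # True
    print(looks_like_email("not-an-email"))      # False
    # RISKY_RE.match("a" * 30 + "!") would run for a very long time, which is
    # exactly the class of problem most surveyed developers were unaware of.
```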
|
2 |
A Web-based System for Publishing Publications to Both Växjö University Library and RICS. ul haq, Israr, January 2009
The Department of Computer Science at Växjö University has a web-based system where theses finished by students, as well as available thesis topics, can be viewed. Research work by the faculty members of the department is also available there. The Växjö University Library also has a web-based system called DiVA, which stands for Digitala Vetenskapliga Arkivet (Academic Archive Online). It enables students and researchers to publish their research work to the university library and to search for research made by other researchers. Each year, researchers and students publish their results in conference papers, journal articles, thesis reports, books, etc. These publications should be registered both at the Växjö University Library and at the RICS (Research in Computer Science) web site in a systematic way, so as to avoid redundancy and errors. The objective of this project is to develop a system that follows the principle "publish once and view everywhere". The system makes it possible to extract a publication already published in DiVA and put that publication's information into the RICS web site. When a user requests to register a publication, the system verifies whether he or she is eligible to do so. If the user qualifies, that is, if he or she is a registered user, the system registers the publication, along with the required information such as title, author, date of publication and kind of publication, on the different web sites. The web-based publication information system was implemented in C#.NET. The project was successfully completed, but the delivered system requires more live testing.
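As a rough illustration of the "publish once and view everywhere" principle (the thesis system itself is written in C#.NET and talks to DiVA and RICS), here is a hypothetical Python sketch; the record fields and class names are assumptions, not the system's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Publication:
    """Toy publication record; the field names are illustrative only."""
    title: str
    authors: str
    year: int
    kind: str  # e.g. "journal article" or "thesis report"

class Registry:
    """A stand-in for one target site, e.g. the RICS publication listing."""

    def __init__(self) -> None:
        self._records: set[Publication] = set()

    def register(self, pub: Publication) -> bool:
        # "Publish once": skip records that are already present, avoiding the
        # redundancy the abstract warns about.
        if pub in self._records:
            return False
        self._records.add(pub)
        return True

if __name__ == "__main__":
    rics = Registry()
    pub = Publication("Some Thesis", "Doe, J.", 2009, "thesis report")
    print(rics.register(pub))  # True: newly registered
    print(rics.register(pub))  # False: duplicate detected and skipped
```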
|
3 |
Software Architecture Checker. Bahtiyar, Muhammed Yasin, January 2008
Driven by the increasing needs of the software industry, software systems have become more complex constructions than ever before. As a result of this increasing complexity, functional decomposition of these systems has become one of the most important aspects of the software development process: dividing problems into sub-problems and producing specific solutions for each part makes it easier to solve the main problem. Component-Based Software Engineering (CBSE) is a way of developing software systems that consist of logically or functionally decomposed components integrated with each other through well-defined interfaces. CBSE relies on the architectural design of a software system. The planning phase and the implementation of a software project may diverge over time, because solving specific problems in a complex system may affect the architecture of the whole system. In spite of sophisticated software engineering processes and CASE tools, there is still a large gap between the planned and the implemented architecture of software systems, and finding deviations from the architecture in source code is a non-trivial task requiring tool support. Matching the designed software architecture against the implemented one requires checking design documents against implementation code, and doing this manually is nearly impossible for major software systems. Software Architecture Checker provides an approach for checking the architecture of any software system. This bachelor thesis examines the approach behind the Software Architecture Checker.
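The core check can be sketched in a few lines: compare the dependencies the design allows against the dependencies actually found in the code. This is a hypothetical Python sketch of the general idea, not the Software Architecture Checker itself; the component names and hard-coded dependency sets are assumptions standing in for parsed design documents and source code.

```python
# Dependencies permitted by the designed architecture (source -> target).
ALLOWED = {
    ("ui", "service"),
    ("service", "repository"),
}

# Dependencies extracted from the implementation (hard-coded here).
ACTUAL = {
    ("ui", "service"),
    ("ui", "repository"),      # deviation: the UI bypasses the service layer
    ("service", "repository"),
}

def deviations(allowed: set[tuple[str, str]],
               actual: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Return dependencies present in the code but absent from the design."""
    return actual - allowed

if __name__ == "__main__":
    for src, dst in sorted(deviations(ALLOWED, ACTUAL)):
        print(f"architecture violation: {src} -> {dst}")
```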
|
4 |
Preventing SQL Injections by Hashing the Query Parameter Data. Lokby, Patrik; Jönsson, Manfred, January 2017
Context. Many applications today use databases to store user information or other data. This information can be accessed through various languages depending on what type of database it is. Databases that use SQL can be maliciously exploited with SQL injection attacks. This type of attack involves inserting SQL code in the query parameter; the injected code sent from the client will then be executed on the database. This can lead to unauthorized access to data or other modifications within the database. Objectives. In this study we investigate whether a system can be built which prevents SQL injection attacks from succeeding on web applications connected to a MySQL database. In the intended model, a proxy is placed between the web server and the database. The purpose of the proxy is to hash the SQL query parameter data and remove any characters that the database will interpret as comment syntax. By processing each query before it reaches its destination, we believe we can prevent vulnerable SQL injection points from being exploited. Methods. A literature study is conducted to gain the knowledge needed to accomplish the objectives of this thesis. A proxy is developed and tested within a system containing a web server and a database. The tests are analyzed to arrive at a conclusion that answers our research questions. Results. Six tests are conducted, which include detection of vulnerable SQL injection points and the difference in delay on the system with and without the proxy. The results are presented and analyzed in the thesis. Conclusions. We conclude that the proxy prevents the SQL injection points on the web application from being exploited, although vulnerable SQL injection points are still reported even with the proxy deployed in the system. The web server is able to process more HTTP requests that require a database query when the proxy is not used within the system. More studies are required since there are still vulnerable SQL injection points.
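A minimal sketch of the proxy's core transformation, assuming only what the abstract states (hash the parameter data and strip characters MySQL treats as comment syntax); the function name and token list below are assumptions, not the thesis implementation, which runs as a proxy between the web server and MySQL.

```python
import hashlib

# Sequences that MySQL can interpret as comment syntax.
COMMENT_TOKENS = ("--", "#", "/*", "*/")

def sanitize_parameter(value: str) -> str:
    """Strip comment syntax, then hash the query parameter data."""
    for token in COMMENT_TOKENS:
        value = value.replace(token, "")
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    # Both a normal value and an injection attempt become fixed-length hex
    # strings, so injected SQL never reaches the database as executable code.
    print(sanitize_parameter("alice@example.com"))
    print(sanitize_parameter("' OR '1'='1' -- "))
```

A consequence of this design is that lookups must compare hashed values rather than the original data, which is why the overall delay of the system with and without the proxy matters.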
|
5 |
Comparing the Performance of Compiled vs Interpreted RegEx / Jämnförelse av prestandan mellan kompilerat och tolkat RegEx. Hocker, Simon; Hammarstrand, Andreas, January 2023
The Regular Expression (RegEx) is one of the most important computer science technologies used for searching through text. Because it is commonly used in almost every corner of computer science that depends on searching, it is imperative that RegEx matching is made efficient. Usually, RegEx matching is implemented through a process called interpretation. This thesis explores the possibility and the execution-time benefits of compiling the RegEx as part of the program instead of interpreting it. For this purpose, a prototype implementation was developed in the Rust programming language. Using this prototype, execution-time benchmarks were performed that compare the optimised, and commonly used, interpreted variant against the thesis's unoptimised compiled version. While the results did not determine a clear preferred method in terms of execution time, they did highlight the potential that exists in compiling RegEx. With some of the tests showing faster execution times in the prototype, there are strong arguments for future research into this field, where the compilation of RegEx can come to benefit from the optimisations present in the interpreted norm. / Regular expressions (RegEx) are among the most widely used computer science techniques for searching through text. Since they are used in many parts of computer science, the efficiency of the technique is important. Normally, RegEx matching is carried out through a process called interpretation. This thesis explores the possibility of, and the time benefits of, compiling these RegEx as part of the surrounding program instead of interpreting them. For that purpose, a prototype was created in the Rust programming language. This prototype was then used to perform timing tests in which the optimised interpreted norm was compared with the thesis's unoptimised compiled variant. The results showed no clearly preferred method but highlighted the possibilities of compiling RegEx. Since some of the tests showed faster execution with the prototype, there are strong arguments for further research in this area; in that way, the compiled form can benefit from the development that the interpreted norm already has.
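The thesis prototype compiles RegEx to Rust as part of the program; purely as an illustration of the idea, the Python sketch below contrasts the general-purpose interpreted engine with a matcher hand-specialised for one fixed pattern, which is the kind of code a regex-to-code compiler might emit. The pattern, names and timing harness are assumptions, not the thesis benchmark suite.

```python
import re
import timeit

# "Interpreted": the general regex engine evaluates the pattern at runtime.
PATTERN = re.compile(r"[0-9]+ab")
DIGITS = set("0123456789")

def interpreted_match(text: str) -> bool:
    return PATTERN.fullmatch(text) is not None

def compiled_match(text: str) -> bool:
    # Hand-specialised matcher for the same language as r"[0-9]+ab".
    if not text or text[0] not in DIGITS:
        return False
    i = 1
    while i < len(text) and text[i] in DIGITS:
        i += 1
    return text[i:] == "ab"

if __name__ == "__main__":
    sample = "1234567890ab"
    assert interpreted_match(sample) and compiled_match(sample)
    for name, fn in (("interpreted", interpreted_match),
                     ("compiled", compiled_match)):
        seconds = timeit.timeit(lambda: fn(sample), number=100_000)
        print(f"{name}: {seconds:.3f}s for 100k matches")
```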
|
6 |
Automatic de-identification of case narratives from spontaneous reports in VigiBase. Sahlström, Jakob, January 2015
The use of patient data is essential in research, but such data is confidential and can only be used after acquiring approval from an ethical board and informed consent from the individual patient. Large amounts of patient data are therefore difficult to obtain unless sensitive information, such as names, ID numbers and contact details, is removed from the data through so-called de-identification. Uppsala Monitoring Centre maintains the world's largest database of individual case reports of any suspected adverse drug reaction. As of today, no method exists for efficiently de-identifying the narrative text included in these reports, which causes countries like the United States of America to exclude the narratives from their reports. The aim of this thesis is to develop and evaluate a method for automatic de-identification of case narratives in reports from the WHO Global Individual Case Safety Report Database System, VigiBase. This report compares three different models, namely Regular Expressions, used for text pattern matching, and the statistical models Support Vector Machine (SVM) and Conditional Random Fields (CRF). Performance, advantages and disadvantages are discussed, as well as how identified sensitive information is handled to maintain readability of the narrative text. The models developed in this thesis are also compared to existing solutions to the de-identification problem. The 400 reports extracted from VigiBase were already well de-identified in terms of names, ID numbers and contact details, making it difficult to train statistical models on these categories. The reports did, however, contain plenty of dates and ages. For these categories, Regular Expressions would be sufficient to achieve good performance. To identify entities in other categories, more advanced methods such as SVM and CRF are needed, and they will require more data. This was evident when applying the models to the more information-rich i2b2 de-identification challenge benchmark data set, where the statistical models developed in this thesis performed at a level competitive with existing models in the literature.
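The regex-based part of the approach can be illustrated with a short sketch that finds dates and ages and replaces them with category placeholders, keeping the narrative readable. The patterns below are hypothetical examples chosen for this illustration; they are not the expressions used on VigiBase data, and real narratives contain many more date and age formats.

```python
import re

# Illustrative patterns for two of the categories that were plentiful in the
# extracted reports: dates (ISO and slash formats) and ages.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{1,2}/\d{1,2}/\d{2,4}\b")
AGE_RE = re.compile(r"\b\d{1,3}[- ]year[- ]old\b", re.IGNORECASE)

def deidentify(narrative: str) -> str:
    """Replace dates and ages with placeholders that preserve readability."""
    narrative = DATE_RE.sub("[DATE]", narrative)
    narrative = AGE_RE.sub("[AGE]-year-old", narrative)
    return narrative

if __name__ == "__main__":
    text = "A 63-year-old patient started treatment on 2015-02-17."
    print(deidentify(text))
    # -> "A [AGE]-year-old patient started treatment on [DATE]."
```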
|
7 |
Rychlejší než grep pomocí čítačů / Beat Grep with Counters, Challenge. Horký, Michal, January 2021
Regular expression matching has an irreplaceable role in software development. The speed of matching can affect the usability of software, which is why great emphasis is placed on it. For certain kinds of regular expressions, standard matching approaches have high complexity and are therefore vulnerable to attacks based on the high cost of regular expression matching (so-called ReDoS attacks). Regular expressions with bounded repetition, which occur frequently in practice, are one of these kinds. Efficient representation and fast matching of these regular expressions are possible using counting automata. In this thesis, we present an implementation of regular expression matching based on counting automata in C++. The matching is implemented within RE2, a fast, modern regular expression matching library. We carried out experiments on regular expressions used in practice. The results of the experiments showed that the implementation within the RE2 tool is faster than the original implementation in C#.
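The counting idea can be sketched without the automata machinery: instead of unrolling a bounded repetition such as a{100,200} into hundreds of automaton states, a single integer counter tracks how many repetitions have been seen. The minimal Python sketch below only illustrates that principle and is unrelated to the thesis's actual C++ extension of RE2.

```python
def match_bounded_repeat(text: str, ch: str, low: int, high: int) -> bool:
    """Return True if text is `ch` repeated between low and high times,
    i.e. a whole-string match for the regex ch{low,high}."""
    count = 0
    for c in text:
        if c != ch:
            return False
        count += 1
        if count > high:      # the counter exceeded the upper bound
            return False
    return count >= low       # accept only if the lower bound was reached

if __name__ == "__main__":
    print(match_bounded_repeat("a" * 150, "a", 100, 200))  # True
    print(match_bounded_repeat("a" * 50, "a", 100, 200))   # False
```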
|
8 |
Evaluating User Experiences of Mockup Data created through Regex. Helgesson, Emil, January 2021
The purpose of this study is to evaluate the possibility of having a library function capable of creating SQL inserts, where the values for the inserts are generated from regexes. The study is conducted as a user study in which the test participants tried three methods of creating SQL inserts, including the library function. The results show that, in terms of time, the test participants on average performed worst while using the library function. When analysing the results, manual insertion was preferred for a few inserts, while the web client was the preferred method for many inserts. This study indicates that the library function does not simplify the creation of SQL inserts under the circumstances studied here.
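As an illustration of what generating mockup values from a regex-like specification and turning them into SQL inserts can look like, here is a hypothetical Python sketch; it only understands a toy "character class with a fixed length" subset, uses an in-memory SQLite table, and is not the library function evaluated in the study.

```python
import random
import sqlite3
import string

def generate_value(charset: str, length: int) -> str:
    """Generate a random string, roughly a match for [charset]{length}."""
    return "".join(random.choice(charset) for _ in range(length))

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, pin TEXT)")
    rows = [
        (generate_value(string.ascii_lowercase, 8),  # like [a-z]{8}
         generate_value(string.digits, 4))           # like [0-9]{4}
        for _ in range(5)
    ]
    # Parameterised inserts keep the generated values out of the SQL text.
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    for row in conn.execute("SELECT * FROM users"):
        print(row)
```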
|