Global ETD Search

1	Evaluating Energy Efficiency of JAVA HashMap Mechanisms Jonsson, Theodor January 2024 This thesis investigates the energy efficiency of the Java collection HashMap with regard to insertions and lookups. In addition to analyzing the default collision resolution technique, which is a smart implementation of separate chaining, it implements two other collision resolution techniques, double hashing and coalesced hashing, and compares the three in terms of their energy efficiency. The insertion and lookup algorithms are compared both through an empirical study and through their time complexities. One of the findings of this thesis is that the preferred initial table size for energy efficiency is between 2 and 5 times larger than the number of insertions. The results show that during insertion the default implementation is more energy efficient, especially at higher load factors. For lookups the coalesced hashing algorithm is more efficient, but the difference from the default implementation is so small that it is almost negligible. Overall the default implementation is the most energy efficient of the three, and it is not impacted much by load factor. These factors make the default implementation preferable for most applications; however, in cases where the load factor does not exceed 0.3, double hashing is the preferable option, as it consumes less energy than the other two. Hashmap hashtable energy energy efficiency Computer Sciences
2	Smart Clustering System for Filtering and Cleaning User Generated Content : Creating a profanity filter for Truecaller / System för filtrering och sanering av oönskad text i användarskapat innehåll Moradi, Arvin January 2013 This thesis focuses on investigating and creating an application for filtering user-generated content. The method was to examine how profanity and racist expressions are used and manipulated to evade filtering processes in similar systems. The work also studied different algorithms to make this process quick and efficient, i.e., to process as many names as possible in the shortest possible time. This is because the client receives millions of new uploads every day, which must be filtered before use. The result shows that the application detects profanity and manipulated profanity. Data from the customer's database was also used for testing purposes, and the result showed that the application also works in practice. The performance test shows that the application has a fast execution time. We could see this by approximating the execution time as a linear function of the number of names entered. The conclusion was that the filter works and discovers profanity not detected earlier in the customer's database. Future updates to strengthen the decision process could introduce a third-party service, or a web interface where decisions can be controlled manually. Execution time is good and shows that 10 million names can be processed in about 6 hours. In the future, queries to the database could be parallelized so that multiple names can be processed simultaneously. Java REST Jersey filter linear function MongoDB Maven String matching algorithm B-Tree Hashmap Aho-Corasick Engineering and Technology
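As a rough illustration of the performance claim in the second abstract (about 10 million names in about 6 hours, with execution time approximated as linear in the number of names), the sketch below converts those figures into a per-name rate and extrapolates to other input sizes. The class and method names are hypothetical, and the model assumes a straight line through the origin (no fixed start-up cost), which the abstract does not state. Under that assumption the reported figures work out to roughly 463 names per second, or about 2.16 ms per name.

```java
public class FilterThroughput {
    // Figures reported in the abstract: about 10 million names in about 6 hours.
    static final long REPORTED_NAMES = 10_000_000L;
    static final double REPORTED_HOURS = 6.0;

    /** Average names processed per second implied by the reported figures. */
    static double namesPerSecond() {
        return REPORTED_NAMES / (REPORTED_HOURS * 3600.0);
    }

    /**
     * Estimated hours needed for the given number of names.
     * Assumption: time grows linearly with input size and has no fixed overhead.
     */
    static double estimatedHours(long names) {
        return names / namesPerSecond() / 3600.0;
    }

    public static void main(String[] args) {
        System.out.printf("names per second: %.0f%n", namesPerSecond());
        System.out.printf("ms per name: %.2f%n", 1000.0 / namesPerSecond());
        System.out.printf("hours for 1,000,000 names: %.1f%n", estimatedHours(1_000_000L));
    }
}
```

Because the estimate is linear, halving the per-name cost (for example through the parallel database queries the abstract suggests as future work) would halve the projected total time, provided the database is not the bottleneck.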
