181 |
Textual Differences in Game Reviews Written by Men and Women
Eriksson, Marie January 2011
The aim of this essay is to examine whether language use in game reviews differs depending on the gender of the reviewer. Both sexist language and technical aspects are examined; the technical aspects of writing have been chosen from previous research about gendered differences in writing. The reviews are randomly chosen but the games are selected. There is an equal number of games with male and female main characters, and the number of reviews is set by the number of reviews written by females, as there are fewer of them and it is thus easier to find a matching number of reviews written by males than vice versa. The reviews are then examined to find sexist language and differences. This essay finds that there is sexist language in the writing of both genders, such as marked language, but only when the main character of the game is female. Both genders tend to focus on the appearance of female characters and the characteristics of male characters, but there is no known previous research about male and female game characters to compare these results to. However, the technical differences remain consistent with previous research on the same subject, such as female reviewers using more pronouns than male reviewers, and male reviewers using fewer verbs than female reviewers.
|
182 |
Fractal Network Traffic Analysis with Applications
Liu, Jian 19 May 2006
Today, the Internet is growing exponentially, with traffic statistics that mathematically exhibit fractal characteristics: self-similarity and long-range dependence. With these properties, data traffic shows high peak-to-average bandwidth ratios and makes networks inefficient. These problems make it difficult to predict, quantify, and control data traffic. In this thesis, two analytical methods are used to study fractal network traffic: second-order self-similarity analysis and multifractal analysis.
First, self-similarity reflects the adaptability of traffic in networks. Many factors are involved in creating this characteristic. A new view of this self-similar traffic structure, related to multi-layer network protocols, is provided. This view is an improvement over the theory used in most current literature.
Second, the scaling region for traffic self-similarity is divided into two timescale regimes: short-range dependence (SRD) and long-range dependence (LRD). Experimental results show that the network transmission delay separates the two scaling regions. This gives us a physical source of the periodicity in the observed traffic. Also, bandwidth, TCP window size, and packet size have impacts on SRD. The statistical heavy-tailedness (Pareto shape parameter) affects the structure of LRD. In addition, a formula to estimate traffic burstiness is derived from the self-similarity property.
Furthermore, studies with multifractal analysis have shown the following results. At large timescales, increasing bandwidth does not improve throughput. The two factors affecting traffic throughput are network delay and TCP window size. On the other hand, more simultaneous connections smooth traffic, which could result in an improvement of network efficiency. At small timescales, in order to improve network efficiency, we need to control bandwidth, TCP window size, and network delay to reduce traffic burstiness. In general, network traffic processes have a Hölder exponent α ranging between 0.7 and 1.3. Their statistics differ from those of Poisson processes.
From traffic analysis, a notion of the efficient bandwidth, EB, is derived. Above that bandwidth, traffic appears bursty, and the burstiness cannot be reduced by multiplexing. Below it, traffic is congested. An important finding is that the relationship between the bandwidth and the transfer delay is nonlinear.
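The second-order self-similarity mentioned in this abstract is commonly quantified by the Hurst parameter H. The following is a minimal sketch of one textbook estimator, the aggregated-variance method, and is not taken from the thesis: for a self-similar series, the variance of block means of size m scales roughly as m^(2H-2), so H can be read off the slope of a log-log fit. The function name, block sizes, and synthetic input are illustrative assumptions.

```python
import math
import random
from statistics import fmean, pvariance


def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst parameter H from Var(block mean) ~ m**(2H - 2)."""
    xs, ys = [], []
    for m in block_sizes:
        # Average the series over non-overlapping blocks of length m.
        blocks = [fmean(series[i:i + m])
                  for i in range(0, len(series) - m + 1, m)]
        xs.append(math.log(m))
        ys.append(math.log(pvariance(blocks)))
    # Ordinary least-squares slope of log-variance against log-block-size.
    x_bar, y_bar = fmean(xs), fmean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1 + slope / 2


# Synthetic, uncorrelated "traffic" (an assumption for illustration only):
# such a series has no long-range dependence, so H should come out near 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_aggregated_variance(noise, [1, 2, 4, 8, 16, 32, 64])
```

Values of H between 0.5 and 1 indicate long-range dependence; measured packet or byte counts per time bin would replace the synthetic series in practice.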
|
183 |
Code Classification Based on Structure Similarity
Yang, Chia-hui 14 September 2012
Automatically classifying the source code of malware variants is the most important research issue in the field of digital forensics. By means of malware classification, we can obtain the complete behavior of a malware sample, which simplifies the forensic task. Previous research has performed dynamic or static analysis on malware binaries after reverse engineering. On the other hand, malware developers use anti-VM and obfuscation techniques to try to deceive malware classifiers.
As honeypots are increasingly used, researchers can obtain more and more malware source code. Analyzing this source code could be the best way to classify malware. In this paper, a novel classification approach is proposed that is based on the similarity of the logic and directory structure of malware. All collected source code is classified by a hierarchical clustering algorithm. The proposed system not only helps classify known malware correctly but also finds new types of malware. Furthermore, it spares forensic staff from spending too much time reanalyzing known malware. The system could also help reveal an attacker's behavior and purpose. The experimental results demonstrate that the system classifies malware correctly and can be applied to other source code classification tasks.
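The abstract does not say how directory-structure similarity is measured or which linkage the hierarchical clustering uses. As a rough sketch under assumed choices (Jaccard similarity over sets of relative file paths, single-linkage merging, and a hypothetical threshold of 0.5), the idea can be illustrated as follows; the sample names and paths are invented.

```python
def jaccard(paths_a, paths_b):
    """Jaccard similarity between two collections of relative file paths."""
    a, b = set(paths_a), set(paths_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def cluster(samples, threshold=0.5):
    """Agglomerative single-linkage clustering of source trees.

    samples: dict mapping a sample name to its list of relative paths.
    Merging stops when the best pair of clusters is less similar than
    `threshold` (an assumed cut-off, not a value from the thesis).
    """
    clusters = [[name] for name in sorted(samples)]

    def link(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(samples[x], samples[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        best, i, j = max(
            (link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


# Invented example trees: two variants of one family and an unrelated kit.
samples = {
    "botA_v1": ["Makefile", "src/irc.c", "src/scan.c"],
    "botA_v2": ["Makefile", "src/irc.c", "src/scan.c", "src/ddos.c"],
    "kit_x": ["web/gate.php", "web/panel.php"],
}
families = cluster(samples)
```

A sample that ends up alone in its cluster is a candidate for a new malware type, which matches the abstract's claim that the system can surface unknown families as well as group known ones.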
|
184 |
Background Knowledge, Category Labels, and Similarity Judgment
Yu, Na-Yung August 2010
Labels are one source of our judgments. By assigning labels to objects, we not
only create references but we also group prior and current experiences together. The goal
of this research is to investigate how labels influence our judgments. Previous research
on inductive generalization shows that labels can be more important than physical
characteristics (the labeling effect), but the mechanism for this effect remains unclear.
There are two differing views regarding the role of labels. One view proposes that labels
are not essentially different from physical features: shared labels increase overall
similarity between two items in the same way as shared physical features. The other
view suggests that people have a naïve theory that shared labels are more special than
shared physical features. The goal of this dissertation is to provide evidence that
complements these conflicting views. I suggest that the role of labels varies depending
on the background knowledge: types of categories (living things vs. man-made objects),
amount of knowledge (number of exemplars people could list for the category), and
types of labels (categorical vs. indexical). The results from four experiments showed
that, for living things, the labeling effect is strong and depends less on the amount of knowledge; for man-made objects, the labeling effect is weak and depends on the
amount of knowledge.
|
185 |
An ID-Tree Index Strategy for Information Filtering in Web-Based Systems
Wang, Yi-Siang 10 July 2006
With the booming development of the WWW, many search engines have been developed to help users find useful information in a great quantity of data. However, users may have different needs in different situations. In contrast to Information Retrieval, where users retrieve data actively, Information Filtering (IF) sends information from servers to passive users through broadcast media rather than having users search for it. Therefore, each user has a profile stored in the database, where a profile records a set of interest items that represent his or her interests or habits. To efficiently store many user profiles on servers and filter out irrelevant users, many signature-based index techniques are applied in IF systems. By using signatures, IF does not need to compare each item of the profiles to filter out irrelevant ones. However, because signatures carry incomplete information about profiles, it is very hard to answer complex queries using only the signatures. Therefore, a critical issue for a signature-based IF service is how to index the signatures of user profiles for an efficient filtering process. There are two common types of queries in signature-based IF systems: inexact filtering and similarity search. In inexact filtering, a query is an incoming document, and the system needs to find the profiles whose interest items are all included in the query. In similarity search, a query is a user profile, and the system needs to find the users whose interest items are similar to those of the query user. In this thesis, we propose an ID-tree index strategy, which indexes signatures of user profiles by partitioning them into subgroups using a binary tree structure according to all of the different items among them. Basically, our ID-tree index strategy is a kind of signature tree. In an ID-tree, each path from the root to a leaf node is the signature of the profile pointed to by the leaf node.
Because each profile is pointed to by only one leaf node of the ID-tree, there is no collision in the structure. In other words, no two profiles are assigned the same signature. Moreover, only the items that differ among subgroups of profiles are checked at one time to filter out irrelevant profiles for queries. Therefore, our strategy can answer inexact filtering and similarity search queries while accessing fewer profiles than previous strategies. Moreover, it needs less time to build the signature index in batch over a large number of database profiles. From our simulation results, we show that our strategy accesses fewer profiles to answer queries than Chen's signature tree strategy for inexact filtering and Aggarwal et al.'s SG-table strategy for similarity search.
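The abstract describes the ID-tree only at a high level, so the following is a loose sketch of the general idea rather than the thesis's actual structure: profiles are split recursively on an item that some of them contain and others do not, and an inexact-filtering query prunes every "item present" branch whose item is missing from the incoming document. The class and function names, the choice of split item, and the handling of identical profiles (they share a leaf here) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    item: Optional[str] = None          # item tested at an internal node
    absent: Optional["Node"] = None     # profiles that lack the item
    present: Optional["Node"] = None    # profiles that contain the item
    profiles: list = field(default_factory=list)  # (user, items) at a leaf


def build(profiles):
    """Build a binary tree over (user_id, frozenset_of_items) pairs."""
    if len(profiles) <= 1:
        return Node(profiles=list(profiles))
    item_sets = [set(items) for _, items in profiles]
    differing = sorted(set.union(*item_sets) - set.intersection(*item_sets))
    if not differing:                   # identical profiles share one leaf
        return Node(profiles=list(profiles))
    item = differing[0]                 # assumed split rule: first by name
    without = [p for p in profiles if item not in p[1]]
    with_item = [p for p in profiles if item in p[1]]
    return Node(item=item, absent=build(without), present=build(with_item))


def inexact_filter(node, doc_items):
    """Return users whose interest items are all contained in doc_items."""
    if node.item is None:
        # Leaf: items shared by the whole subgroup were never tested on the
        # way down, so verify the full subset condition here.
        return [user for user, items in node.profiles if items <= doc_items]
    hits = inexact_filter(node.absent, doc_items)
    if node.item in doc_items:          # otherwise prune the whole branch
        hits += inexact_filter(node.present, doc_items)
    return hits


# Invented profiles and an incoming document, for illustration only.
profiles = [
    ("ann", frozenset({"ai", "db"})),
    ("bob", frozenset({"ai", "web"})),
    ("cy", frozenset({"db"})),
]
tree = build(profiles)
matched = sorted(inexact_filter(tree, {"ai", "db", "os"}))
```

Because a branch is skipped as soon as its split item is missing from the document, only a fraction of the stored profiles is touched per query, which is the effect the abstract reports as fewer accessed profiles.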
|
186 |
A HyBrid Approach-Based Signature Extraction Method for Similarity
Yeh, Wei-Horng 18 July 2001
A symbolic image database system is a system in which a large amount of image data and their related information are represented by both symbolic images and physical images. How to perceive spatial relationships among the components in a symbolic image is an important criterion for finding a match between the symbolic image of the scene object and the one stored as a model in the symbolic image database. Spatial reasoning techniques have been applied to pictorial databases; in particular, those using 2D strings as an index representation have been successful. In this thesis, we extend the existing three levels of type-i similarity to more levels to make similarity retrieval more precise. Lee and Hsu introduced 13 spatial operators to completely represent spatial relationships in 1D space. However, they simply combined the 13 spatial relationships on the x- and y-axes to represent the spatial relationships in 2D space by 13 × 13 = 169 spatial relationships. These 169 spatial relationships are still not sufficient to show all kinds of spatial relationships between any two objects in 2D space. For example, directional relationships, like North or South West, exist in 2D space and are difficult to deduce from those 13 spatial operators. Thus, we add the nine directional relationships to the 169 spatial relationships in 2D space. In this way, we can distinguish up to 289 spatial relationships in 2D space. Moreover, our proposed strategy also addresses the problem caused by minimum bounding rectangles (MBRs). To simplify matters, most previous approaches to iconic indexing use the MBRs of two objects to define the spatial relationship between them. The topological relationships between objects, however, can be quite different from the spatial relationship of their respective MBRs. Therefore, it is sometimes hard to describe the spatial relationship of the objects correctly in terms of the relationships between their corresponding MBRs. To overcome this drawback of MBRs, we adopt the concept of topological relationships in our proposed strategy. Good access methods for large image databases are important for efficient retrieval. Signature files can be viewed as a preselection search filter that prunes images that do not satisfy the query. In order to resolve the ambiguity of the MBRs and to represent the spatial relationships in two-dimensional space completely, we propose a hybrid approach-based signature extraction method for similarity retrieval. From our simulation study, we show that our approach provides a higher correct-match rate and requires a smaller storage cost than Lee et al.'s 2D B-based signature approach. In some cases, the correct-match rate of our proposed strategy can be up to 42.18%, while it is just 16.66% in Lee et al.'s strategy. Moreover, the worst-case storage cost of our proposed strategy is 1686 bits, whereas Lee et al.'s strategy always needs 2015 bits.
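To make the 13 × 13 = 169 count concrete, the sketch below classifies the relationship between two 1D intervals into 13 cases and pairs the x-axis and y-axis results for two MBRs. It uses the names of Allen's interval relations as a stand-in; the operator symbols actually defined by Lee and Hsu are not given in the abstract, so the labels and function names here are assumptions.

```python
from itertools import combinations, product


def relation_1d(a, b):
    """Classify interval a against interval b (each a (start, end) pair
    with start < end) into one of 13 mutually exclusive relations."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    return "overlaps" if a0 < b0 else "overlapped-by"


def relation_2d(mbr_a, mbr_b):
    """Pair the x-axis and y-axis relations of two MBRs.

    Each MBR is ((x_min, x_max), (y_min, y_max))."""
    return relation_1d(mbr_a[0], mbr_b[0]), relation_1d(mbr_a[1], mbr_b[1])


# Enumerate small integer intervals to confirm the counts quoted above.
intervals = list(combinations(range(4), 2))
relations_1d = {relation_1d(a, b) for a, b in product(intervals, repeat=2)}
relations_2d = set(product(relations_1d, repeat=2))
```

Note that the pair returned by `relation_2d` depends only on the bounding rectangles, which is exactly the ambiguity the abstract raises: two objects with different shapes can share the same MBR relationship while touching, overlapping, or being disjoint.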
|
187 |
Extracting relationships of research topics in information-related domain by analyzing thesis
Chen, Dao-hui 02 July 2003
With the coming of the knowledge management era, academic institutions have also begun to engage in knowledge management (KM) activities, hoping that researchers can understand the relationships between research topics. However, most KM activities focusing on academic papers require researchers' effort to code and classify each paper's content, and prior research offers no measurement of the relationship between research topics. Therefore, this thesis proposes a methodology to measure the relationship between research topics and retrieves data from the National Central Library over the Internet to construct a knowledge relationship system.
This system analyzes the content of dissertations and theses, such as keywords and abstracts, and calculates two measurements, relation strength and relation similarity, to assess the direct and indirect relationships between two research topics. Moreover, this thesis found that the usage of Chinese keywords and the Chinese translations of English keywords are highly diverse. To overcome this problem, a database of Chinese keywords is built. This database extracts the mapping between Chinese keyword usage and its translation from thesis abstracts. Finally, trends of research topics in the information-related domain are analyzed from different aspects, such as different years, schools, and departments.
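The abstract names the two measurements but does not define them. One plausible reading, offered purely as an assumption, is that relation strength captures how often two topics appear together in the same thesis (a direct link), while relation similarity compares the sets of other topics each one co-occurs with (an indirect link). The sketch below uses a Jaccard ratio for the former and a cosine for the latter; both formulas, the function names, and the keyword lists are illustrative and are not the thesis's definitions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence(theses):
    """Count per-topic and per-pair occurrences over keyword lists."""
    pair_counts, topic_counts = Counter(), Counter()
    for keywords in theses:
        unique = sorted(set(keywords))
        topic_counts.update(unique)
        pair_counts.update(combinations(unique, 2))  # pairs stay sorted
    return pair_counts, topic_counts


def pair(pair_counts, a, b):
    return pair_counts[tuple(sorted((a, b)))]


def strength(a, b, pair_counts, topic_counts):
    """Assumed direct measure: Jaccard ratio of theses naming both topics."""
    both = pair(pair_counts, a, b)
    either = topic_counts[a] + topic_counts[b] - both
    return both / either if either else 0.0


def similarity(a, b, pair_counts, topic_counts):
    """Assumed indirect measure: cosine of co-occurrence vectors taken
    over every third topic."""
    others = [t for t in topic_counts if t not in (a, b)]
    va = [pair(pair_counts, a, t) for t in others]
    vb = [pair(pair_counts, b, t) for t in others]
    norm = sqrt(sum(x * x for x in va)) * sqrt(sum(y * y for y in vb))
    return sum(x * y for x, y in zip(va, vb)) / norm if norm else 0.0


# Invented keyword lists standing in for thesis records.
theses = [
    ["data mining", "clustering"],
    ["text mining", "clustering"],
    ["data mining", "databases"],
]
pairs, topics = cooccurrence(theses)
direct = strength("data mining", "text mining", pairs, topics)
indirect = similarity("data mining", "text mining", pairs, topics)
```

In this toy data the two topics never share a thesis, so the direct measure is zero, yet both co-occur with "clustering", so the indirect measure is positive. That is the kind of distinction the abstract draws between direct and indirect relationships.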
The result of analysis includes:
|
188 |
A Unique-Bit-Pattern-Based Indexing Strategy for Image Rotation and Reflection in Image Databases
Yeh, Wei-horng 16 June 2008
A symbolic image database system is a system in which a large amount of image data and their related information are represented by both symbolic images and physical images. Spatial relationships are important for similarity-based retrieval in many image database applications. How to perceive spatial relationships among the components in a symbolic image is an important criterion for finding a match between the symbolic image of the scene object and the one stored as a model in the symbolic image database. With the popularity of digital cameras and the related image processing software, images in a sequence are often rotated or flipped. That is, those images are transformed in the rotation orientation or the reflection direction. A robust spatial similarity framework should be able to recognize image variants produced by translation, scaling, rotation, and arbitrary transformations. Current algorithms for retrieval by spatial similarity can be classified into symbolic projection methods, geometric methods, and graph-matching methods. Symbolic projection preserves useful spatial information about objects, such as width, height, and location. However, many iconic indexing strategies based on symbolic projection are sensitive to rotation or reflection. Therefore, these strategies may miss qualified images when the query is issued in an orientation different from that of the database images. To solve this problem, researchers derived rules for how spatial relationships change under image transformation and proposed a function that maps a spatial relationship to its transformed counterpart. However, this mapping consists of several conditional statements, which is time-consuming. Thus, in this dissertation, we first classify the mapping into three cases and carefully assign a unique 16-bit pattern to each spatial relationship.
Based on this assignment, we can easily do the mapping through our proposed bit operation, intra-exchange, which is a CPU operation and takes only O(1) time. Moreover, we propose an efficient iconic index strategy, called the Unique Bit Pattern matrix strategy (UBP matrix strategy), to record the spatial information. In this way, when doing similarity retrieval, we do not need to reconstruct the original image from the UBP matrix in order to obtain the indexes of the rotated and flipped images. Instead, we can directly derive the index of the rotated or flipped image from the index of the original one through bit operations and matrix manipulation. Thus, our proposed strategy can do similarity retrieval without missing qualified database images. In our performance study, we first analyze the time complexity of the similarity retrieval process of our proposed strategy. Then, the efficiency of our proposed strategy according to the simulation results is presented. We show that our strategy outperforms the mapping strategies for different numbers of objects in an image. Depending on the number of objects in an image, the improvement is between 13.64% and 53.23%.
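The abstract does not disclose the actual 16-bit assignment or the exact definition of intra-exchange, so the following toy is only meant to show why a well-chosen encoding can replace a chain of conditionals with one constant-time bit operation. The codes are hypothetical: each 1D relation (named after Allen's interval relations, another assumption) gets a 16-bit pattern whose two bytes are swapped in the pattern of its mirror image, so reflecting along the axis becomes a byte swap.

```python
# Hypothetical 16-bit codes, NOT the assignment used in the dissertation.
# Mirror-image relations have their high and low bytes exchanged, and
# relations that are their own mirror image have two equal bytes.
CODE = {
    "before": 0x0102, "after": 0x0201,
    "meets": 0x0304, "met-by": 0x0403,
    "overlaps": 0x0506, "overlapped-by": 0x0605,
    "starts": 0x0708, "finishes": 0x0807,
    "started-by": 0x090A, "finished-by": 0x0A09,
    "during": 0x0B0B, "contains": 0x0C0C, "equals": 0x0D0D,
}
NAME = {code: name for name, code in CODE.items()}


def reflect(code):
    """Map a relation code to the code of its mirror image by swapping
    the two bytes: a fixed number of shifts and masks, no branching."""
    return ((code << 8) | (code >> 8)) & 0xFFFF


def reflect_by_conditionals(name):
    """The branching alternative that the bit operation replaces."""
    if name == "before":
        return "after"
    if name == "after":
        return "before"
    # (the remaining eleven cases would follow in the same style)
    raise NotImplementedError(name)


flipped = NAME[reflect(CODE["starts"])]
```

Applying `reflect` twice returns the original code, mirroring the fact that flipping an image twice along the same axis restores it. A real scheme must also cover rotation and the 2D combinations, which this toy does not attempt.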
|
189 |
Reactions toward people with an illness: examining similarity as an extension to attribution theory
Clifford, Jeanie Marie. January 2004
Thesis (Ph. D.)--University of California, San Diego, 2004. / Vita. Includes bibliographical references (leaves 99-103).
|
190 |
The effect of perceived entitativity on implicit image transfer in multiple sponsorships
Carrillat, François Anthony 01 June 2005
This dissertation proposes that in the case of multiple sponsorships (i.e., brands concomitantly sponsoring the same event), the group constituted by the sponsoring brands and the sponsored event will be perceived as an entity, a phenomenon that Campbell (1958) called entitativity. The extent to which a group of brands and a sponsored event is seen as being entitative will result in stereotypic processing of the group members (Brewer and Harasty 1996). Information about an entitative group is abstracted and used to form judgments about every group member (McConnell, Sherman, and Hamilton 1997). Characteristics tied to one brand or to the event will become associated with the other brands due to category-based information processing (Fiske and Neuberg 1990). As a result, images associated with a brand or an event that belongs to an entitative group will be transferred to other brands of that group due to stereotyping.
Image transfer effects were investigated through an experiment. Image transfer in sponsorship occurs primarily at an implicit level because sponsorship messages are subtle (Pham and Vanhuele 1997). As a consequence, the savings in relearning paradigm (Ebbinghaus 1885/1964) was the methodology used. It allows implicit memory to be investigated by comparing the recall of paired associations between brands and image traits across a multiple sponsorship condition and a no sponsorship condition. The findings confirmed that the event and the concomitant sponsoring brands were perceived as an entitative group, which resulted in an implicit transfer of image among the brands (Brand Image Transfer, BIT) as well as from the event to the brands (Event Image Transfer, EIT).
|