101 |
Possible Difficulties in Evaluating University Performance Based on Publications Due to Power Law Distributions: Evidence from Sweden Sadric, Haroon, Zia, Sarah January 2023 (has links)
Measuring the research performance of a university is important to the universities themselves, governments, and students alike. Among other metrics, the number of publications is easy to obtain, and because each university produces a large number of publications every year, it promises to be an accurate metric. However, the number of publications depends largely on the size of the institution, suggesting, if not addressed, that larger universities are better. Thus, one might intuitively normalize by size and use publications per researcher instead: a better institution would allow individual researchers to have more publications each year. However, publications, like many other things, might follow a power-law distribution, where most researchers have few publications and only a few researchers have very many. Power-law distributions violate assumptions of the central limit theorem, for example that of a well-defined mean or variance. Specifically, one cannot normalize or average power-law-distributed data, making the comparison of university publications impossible if they indeed follow a power-law distribution. While it has been shown that some scientific domains or universities exhibit power-law distributions, it is not known whether Swedish universities also show this phenomenon. Thus, here we collect publication data for Swedish universities and determine whether or not they are power-law distributed. Interestingly, if they are, one might use the slope of the power-law distribution as a proxy for research output: a steep slope suggests that the ratio of highly published authors to those with few publications is small, whereas a flatter slope suggests that a university has more highly published authors than a university with a steeper slope. Thus, the second objective here is to assess whether, and to what extent, the slope of the distribution can be determined. This study will show that eight of the fifteen Swedish universities considered follow a power-law distribution (Kolmogorov-Smirnov statistic < 0.05), while the remaining seven do not. The key determinant is the total number of publications. The difficulty is that the total number of publications is often so small that one can neither reject a power-law distribution nor determine the slope of the distribution with any accuracy. While this study suggests that in principle the slopes of the power-law distributions can be used as a comparative metric, it also shows that for half of Sweden's universities the data are insufficient for this type of analysis.
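As background for the fitting procedure such a study implies, here is a minimal sketch (not the authors' code) in the style of Clauset, Shalizi & Newman (2009): the tail exponent of a continuous power law is estimated by maximum likelihood, and the fit is judged by the Kolmogorov-Smirnov distance. The data and xmin below are illustrative assumptions, and since publication counts are discrete, the continuous estimator is only an approximation.

    import numpy as np

    def fit_power_law(x, xmin):
        """Continuous power-law MLE and KS distance (Clauset et al. style)."""
        tail = np.asarray(x, dtype=float)
        tail = tail[tail >= xmin]
        n = tail.size
        alpha = 1.0 + n / np.sum(np.log(tail / xmin))    # tail exponent
        s = np.sort(tail)
        emp_cdf = np.arange(1, n + 1) / n                # empirical CDF
        model_cdf = 1.0 - (s / xmin) ** (1.0 - alpha)    # fitted CDF
        ks = np.max(np.abs(emp_cdf - model_cdf))         # KS distance
        return alpha, ks

    # Hypothetical publication counts per researcher:
    counts = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15, 40]
    alpha, ks = fit_power_law(counts, xmin=1.0)
    print(f"alpha = {alpha:.2f}, KS = {ks:.3f}")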
|
102 |
Inferences on the power-law process with applications to repairable systems Chumnaul, Jularat 13 December 2019 (has links)
System testing is very time-consuming and costly, especially for complex high-cost and high-reliability systems. For this reason, the number of failures needed for the developmental phase of system testing should generally be relatively small. To assess the reliability growth of a repairable system, the generalized confidence interval and the modified signed log-likelihood ratio test for the scale parameter of the power-law process are studied for incomplete failure data. Specifically, some recorded failure times in the early developmental phase of system testing cannot be observed; this circumstance is essential to establishing a warranty period or determining a maintenance phase for repairable systems. For the proposed generalized confidence interval, we found that the method is essentially unbiased, as the coverage probabilities it attains are close to the nominal level 0.95 for all levels of γ and β. When the performance of the proposed and existing methods is compared and validated with respect to average widths, the simulation results show that the proposed method is superior, yielding shorter average widths when the predetermined number of failures is small. For the proposed modified signed log-likelihood ratio test, we found that it controls type I errors well for complete failure data and has desirable power for all parameter configurations, even for a small number of failures. For incomplete failure data, the proposed modified signed log-likelihood ratio test is preferable to the signed log-likelihood ratio test in most situations in terms of controlling type I errors. Moreover, the proposed test also performs well when the missing ratio is up to 30% and n > 10. In terms of empirical power, the proposed modified test is superior to the unmodified test in most situations. In conclusion, the proposed methods, the generalized confidence interval and the modified signed log-likelihood ratio test, are practically useful for saving business costs and time during the developmental phase of system testing, since only a small number of failures is required to test systems, and they yield precise results.
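For context, the power-law process is the nonhomogeneous Poisson process in the standard Crow/AMSAA parameterization; writing it out makes clear what the thesis's intervals and tests estimate (these are the standard complete-data results, not the thesis's incomplete-data estimators). With intensity and mean function

\[
\lambda(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1},
\qquad
m(t) = \mathbb{E}\,[N(t)] = \left(\frac{t}{\theta}\right)^{\beta},
\]

a failure-truncated sample \(t_1 < \dots < t_n\) gives the maximum-likelihood estimates

\[
\hat{\beta} = \frac{n}{\sum_{i=1}^{n-1}\ln(t_n/t_i)},
\qquad
\hat{\theta} = \frac{t_n}{n^{1/\hat{\beta}}},
\]

where \(\beta < 1\) indicates reliability growth (failures arriving progressively more slowly).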
|
103 |
Analyzing and Modeling Large Biological Networks: Inferring Signal Transduction Pathways Bebek, Gurkan January 2007 (has links)
No description available.
|
104 |
SLEEP-WAKE TRANSITION DYNAMICS AND POWER-LAW FITTING WITH AN UPPER BOUND Olmez, Fatih 23 September 2014 (has links)
No description available.
|
105 |
Advances in simulation: validity and efficiency Lee, Judy S. 08 June 2015 (has links)
In this thesis, we present and analyze three algorithms that are designed to make computer simulation more efficient, valid, and/or applicable.
The first algorithm uses simulation cloning to enhance efficiency in transient simulation. Traditional simulation cloning is a technique that shares some parts of the simulation results when simulating different scenarios. We apply this idea to transient simulation, where multiple replications are required to achieve statistical validity. Computational savings are achieved by sharing some parts of the simulation results among several replications. We improve the algorithm by inducing negative correlation to compensate for the (undesirable) positive correlation introduced by sharing some parts of the simulation. Then we identify how many replications should share the same data, and provide numerical results to analyze the performance of our approach.
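The abstract does not spell out how the negative correlation is induced; antithetic random numbers are one standard mechanism, sketched here on a toy mean-estimation problem (the response function and all names are illustrative, not the thesis's algorithm):

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(u):
        """Toy replication output driven by a uniform random number u."""
        return np.exp(u)

    # Antithetic pairing: U and 1-U give negatively correlated outputs
    # for a monotone response, reducing the variance of the paired mean.
    n_pairs = 10_000
    u = rng.random(n_pairs)
    antithetic = 0.5 * (simulate(u) + simulate(1.0 - u))
    crude = simulate(rng.random(2 * n_pairs))
    print("antithetic estimate:", antithetic.mean())
    print("crude estimate:     ", crude.mean())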
The second algorithm chooses a set of best systems when there are multiple candidate systems and multiple objectives. We provide three different formulations of correct selection of the Pareto optimal set, where a system is Pareto optimal if it is not inferior in all objectives compared to other competing systems. Then we present our Pareto selection algorithm and prove its validity for all three formulations. Finally, we provide numerical results aimed at understanding how well our algorithm performs in various settings.
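The noise-free core of that selection problem is plain non-dominance; a minimal sketch follows (minimization in every objective; the thesis's statistical selection guarantees are not reproduced here):

    import numpy as np

    def pareto_set(points):
        """Return the non-dominated rows (minimization in every objective)."""
        pts = np.asarray(points, dtype=float)
        keep = [
            i for i, p in enumerate(pts)
            if not any(np.all(q <= p) and np.any(q < p)
                       for j, q in enumerate(pts) if j != i)
        ]
        return pts[keep]

    systems = [(3.0, 5.0), (4.0, 4.0), (5.0, 3.0), (5.0, 5.0)]
    print(pareto_set(systems))  # (5, 5) is dominated; the other three remain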
Finally, we discuss the estimation of input distributions when theoretical distributions do not provide a good fit to existing data. Our approach is to use a quasi-empirical distribution: a mixture of an empirical distribution and a parametric distribution for the right tail. We describe an existing approach that uses an exponential tail distribution, and adapt it to incorporate a Pareto tail distribution and a different cutoff point between the empirical and tail distributions. Then, to measure the impact, we simulate a stable M/G/1 queue with a known inter-arrival time distribution and an unknown service time distribution, and estimate the mean and tail probabilities of the waiting time in queue using the different approaches. The results suggest that if we know the system is stable, and suspect that the tail of the service time distribution is not exponential, then a quasi-empirical distribution with a Pareto tail works well, provided a lower bound is imposed on the tail index.
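A minimal sketch of the quasi-empirical idea as described (the cutoff quantile and tail index below are illustrative assumptions; the thesis's cutoff rule and tail-index lower bound are not reproduced):

    import numpy as np

    rng = np.random.default_rng(1)

    def quasi_empirical_sampler(data, cutoff_q=0.95, tail_index=2.5):
        """Sample the empirical distribution below a cutoff quantile
        and a Pareto tail above it."""
        data = np.asarray(data, dtype=float)
        cutoff = np.quantile(data, cutoff_q)
        body = data[data <= cutoff]

        def draw(n):
            out = rng.choice(body, size=n)        # empirical body
            in_tail = rng.random(n) > cutoff_q    # tail with prob 1 - cutoff_q
            k = int(in_tail.sum())
            if k:
                u = rng.random(k)
                out[in_tail] = cutoff * u ** (-1.0 / tail_index)  # Pareto tail
            return out

        return draw

    service_times = rng.exponential(1.0, size=10_000)  # stand-in for data
    draw = quasi_empirical_sampler(service_times)
    print(draw(5))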
|
106 |
Étude par spectroscopie résolue en temps des mécanismes de séparation de charges dans des mélanges photovoltaïques / Time-resolved spectroscopic study of charge-separation mechanisms in photovoltaic blends Gélinas, Simon January 2009 (has links)
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal.
|
107 |
Security Analysis on Network Systems Based on Some Stochastic Models Li, Xiaohu 01 December 2014 (has links)
Thanks to great effort from mathematicians, physicists, and computer scientists, network science has developed rapidly during the past decades. However, because of the complexity involved, most research in this area is based only on experiments and simulations; research grounded in theoretical results is needed to gain more insight into how the structure of a network affects its security. This dissertation introduces stochastic and statistical models for certain networks and uses a k-out-of-n tolerant structure to characterize, both logically and physically, the behavior of nodes. Based upon these models, we derive several illuminating results in the following two directions, consistent with what computer scientists have observed in practical situations and experimental studies.
Suppose that a node in a P2P network loses its designed function or service when some of its neighbors are disconnected. By studying the isolation probability and the durable time of a single user, we prove that a network whose users' lifetimes have more NWUE-ness (NWUE: new worse than used in expectation) is more resilient, in the sense of a smaller probability of being isolated by neighbors and a longer time online without interruption. Meanwhile, some preservation properties are also studied for the durable time of a network. Additionally, in order to apply the model in practice, both graphical and nonparametric statistical methods are developed and applied to a real data set.
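As a toy illustration of the k-out-of-n idea (not the thesis's lifetime-ordering results): if a node needs at least k of its n neighbors online, and each neighbor is independently online with probability p, the isolation probability is a binomial tail.

    from math import comb

    def isolation_prob(n, k, p):
        """P(fewer than k of n independent neighbors are online)."""
        return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k))

    # A node with 8 neighbors that needs at least 3 of them online:
    print(isolation_prob(n=8, k=3, p=0.6))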
On the other hand, a stochastic model is introduced to investigate the security of network systems based on their vulnerability-graph abstractions. A node loses its designed function when a certain number of its neighbors are compromised, in the sense of being taken over by malicious code or a hacker. An attack compromises some nodes, and the victimized nodes become accomplices. We derive an equation for the probability that a node in a network is compromised; since this equation has no explicit solution, we also establish new lower and upper bounds for the probability.
The two models proposed here generalize existing models in the literature; the corresponding theoretical results effectively improve upon known results and hence offer insight into designing a more secure system and enhancing the security of an existing one.
|
108 |
Complex systems and health systems, computational challenges / Systèmes complexes et systèmes de santé, défis calculatoires Liu, Zifan 11 February 2015 (has links)
The eigenvalue equation intervenes in models of infectious disease propagation and could be used as an ally of vaccination campaigns in the actions carried out by health care organizations. Epidemiological modeling techniques can be considered, by analogy, as computer viral propagation, which depends on the state of the underlying graph at a given time. We point to PageRank as a method for studying epidemic spread and consider its calculation in the context of the small-world phenomenon. A parallel implementation of the multiple implicitly restarted Arnoldi method (MIRAM) is proposed for calculating the dominant eigenpair of stochastic matrices derived from very large real networks. Their high damping factor makes many existing algorithms less efficient, while MIRAM is promising. We also propose in this thesis a parallel graph generator that can be used to generate distributed synthesized networks that display scale-free and small-world structures. This generator could also serve as a testbed for other graph algorithms. MIRAM is implemented within the framework of Trilinos, targeting big data and sparse matrices representing scale-free networks, also known as power-law networks. A hypergraph partitioning approach is employed to minimize the communication overhead. The algorithm is tested on Grid5000, a nationwide cluster of clusters. Experiments on very large networks such as Twitter and Yahoo, with over 1 billion nodes, are conducted. With our parallel implementation, a speedup of 27× is achieved compared to the sequential solver
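For orientation, a dense power-iteration sketch (not the MIRAM solver); the damping factor d = 0.99 illustrates the regime the abstract refers to, since simple power iteration converges at a rate governed by d and thus slows dramatically as d approaches 1:

    import numpy as np

    def pagerank(A, d=0.99, tol=1e-10, max_iter=100_000):
        """Power iteration for PageRank on a column-stochastic matrix A."""
        n = A.shape[0]
        v = np.full(n, 1.0 / n)
        teleport = np.full(n, 1.0 / n)
        for _ in range(max_iter):
            v_next = d * (A @ v) + (1.0 - d) * teleport
            if np.abs(v_next - v).sum() < tol:
                break
            v = v_next
        return v_next

    # Tiny 3-node example (each column sums to 1):
    A = np.array([[0.0, 0.5, 1.0],
                  [0.5, 0.0, 0.0],
                  [0.5, 0.5, 0.0]])
    print(pagerank(A))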
|
109 |
Aero acoustic on automotive exhaust systems / Aéroacoustiques des systèmes d'échappement automobile Wiemeler, Dirk 08 March 2013 (has links)
On automotive exhaust systems, aeroacoustic noise is a dominant and critical noise content, clearly detectable both objectively and subjectively. Robust test procedures are established, but simulation of this noise content has not gained ground in real-life development processes. This thesis shows that the dominant characteristic of the aeroacoustic noise of automotive systems is dipole noise. Simulating these specific noise sources with classical computational aeroacoustics is very cumbersome or simply impossible. The aim of the thesis is a review of the scaling-law approach for compact source models, enabling the determination of the sound power emission of discrete, commonly occurring configurations (e.g., a perforated pipe, or an orifice plate in a duct) based on empirical data. Application simulations show that these source models are simple to use and that their accuracy is acceptable within the limits of the geometries analysed.
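For context, the classical compact-source scaling laws that such models build on (standard aeroacoustics results going back to Lighthill and Curle, not the thesis's fitted exponents): the radiated sound power W scales with a characteristic flow velocity U as

\[
W_{\text{monopole}} \propto \frac{\rho\,U^{4}\ell^{2}}{c}, \qquad
W_{\text{dipole}} \propto \frac{\rho\,U^{6}\ell^{2}}{c^{3}}, \qquad
W_{\text{quadrupole}} \propto \frac{\rho\,U^{8}\ell^{2}}{c^{5}},
\]

where ρ is the fluid density, c the speed of sound, and ℓ a characteristic source dimension. The U^6 dipole law is why exhaust flow noise, dominated by flow-surface interaction, is so sensitive to flow velocity.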
|
110 |
Leis de Escala nos gastos com saneamento básico: dados do SIOP e DOU / Scaling Patterns in Basic Sanitation Expenditure: data from SIOP and DOU Ribeiro, Ludmila Deute 14 March 2019 (has links)
Starting in the late 20th century, the Brazilian federal government created several programs to increase access to water and sanitation. Although these programs made improvements in water access, sanitation was generally overlooked. While water supply and waste collection are available in the majority of Brazilian municipalities, the sewage system is still spatially concentrated in the Southeast region and in the most urbanized areas. To explain this spatially concentrated pattern, it is frequently assumed that the size of cities really matters for sanitation service provision, especially for sewage collection. As a matter of fact, as cities grow in size, one should expect economies of scale in sanitation infrastructure volume. Economies of scale in sanitation infrastructure mean a decrease in basic sanitation costs proportional to city size, leading also to an expected power-law relationship between expenditure on sanitation and city size. Using population, N(t), as the measure of city size at time t, power-law scaling for infrastructure takes the form Y(t) = Y0 N(t)^β, where β ≈ 0.8 < 1, Y denotes infrastructure volume, and Y0 is a constant. Many diverse properties of cities, from patent production and personal income to electrical cable length, are power-law functions of population size with scaling exponents β that fall into distinct universality classes. Quantities reflecting wealth creation and innovation have β ≈ 1.2 > 1 (increasing returns), whereas those accounting for infrastructure display β ≈ 0.8 < 1 (economies of scale). We verified this relationship using data from the federal government's Integrated Planning and Budgeting System (SIOP). SIOP data refer only to grants: funds given to municipalities by the federal government to run programs within defined guidelines. Taken together, the estimated values of β show that federal transfers to municipalities for basic sanitation decrease proportionally to city size. For the initial budget allocation, β was found to be roughly 0.63 for municipalities above two thousand inhabitants, roughly 0.92 for municipalities above twenty thousand inhabitants, and roughly 1.18 for municipalities above fifty thousand inhabitants. The second data source is the Diário Oficial da União (DOU), the federal government journal for publishing official acts. DOU data provide information not only about grants but also about loans for basic sanitation funded by the FGTS (Fundo de Garantia por Tempo de Serviço). To extract data from the DOU we applied Natural Language Processing (NLP) techniques. These techniques often work better when the algorithms are provided with annotations, metadata that give additional information about the text. In particular, we fed a database of annotated DOU text into a bidirectional LSTM model applied to POS tagging and named-entity recognition. Preliminary results are reported in the text.
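A minimal sketch of how such a scaling exponent is typically estimated, by ordinary least squares on the log-log form log Y = log Y0 + β log N (the data are illustrative; the thesis's actual estimation and population cutoffs are not reproduced):

    import numpy as np

    # Hypothetical (population, sanitation expenditure) pairs:
    N = np.array([2_500, 8_000, 30_000, 120_000, 600_000], dtype=float)
    Y = np.array([180.0, 420.0, 1_100.0, 3_300.0, 11_000.0])

    # log Y = log Y0 + beta * log N  =>  fit a line in log-log space
    beta, logY0 = np.polyfit(np.log(N), np.log(Y), deg=1)
    print(f"beta ~ {beta:.2f}, Y0 ~ {np.exp(logY0):.2f}")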
|